A collection of tutorials, code and tools to help you get better using Oracle
02 September 2011
You have a table that is continuously being inserted into, and you want to do something with all the new rows: maybe copy them to a different table or system, or maybe take some action on each new row.
There are various ways of achieving these goals, such as using Materialized Views and Triggers, but I have encountered scenarios where they are not suitable, and writing some custom code works better.
Generally when this problem comes up, each row in the table has a column containing the time it was inserted, so the usual pattern is to use a piece of PL/SQL like the following to get the rows you want:
SQL11G> declare
    v_sysdate date default sysdate;
    v_copied_up_to_date date;
begin
    select copied_up_to_date
    into v_copied_up_to_date
    from runlog;
    for r in (select *
              from MyTable
              where created_dtm >= v_copied_up_to_date
              and created_dtm < v_sysdate) loop
        null; -- do something with each row
    end loop;
    update runlog
    set copied_up_to_date = v_sysdate;
    commit;
end;
/
Basically, the runlog table records the system date that records were last copied up to; each run picks up anything newer than that, and then updates copied_up_to_date ready for the next run.
At first glance, this seems like a good solution, until you start to think about the consistent read mechanism used by Oracle.
Imagine lots of sessions are inserting into this table, and each insert is part of a bigger transaction, consisting of many inserts and queries. These transactions could last for any length of time, from a few hundred milliseconds to many seconds or even minutes.
When the insert is performed, the timestamp applied to the record will be the time the insert actually occurred, but no other sessions on the database will be able to see that new record until it is actually committed.
Imagine the following scenario:
T1       Session 1 inserts a new record, created_dtm = T1
T1 + 5   Session 2 queries for new records where created_dtm < (T1 + 5)
T1 + 6   Session 1 commits
In this case, session 2 will not see the record inserted at T1. The next time it runs, it will not find it either, as it will only look for records newer than T1 + 5. The process has just lost that record.
One bad solution to this problem is to attempt to calculate the time of the longest transaction inserting into the table, and then introduce a time lag to the copy process, so that it only copies up to 'sysdate - X', where X is the worst case transaction time, plus a little more for a buffer.
This will usually work, but it is fragile and still prone to errors if you have a database slowdown someday.
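For illustration, the lag-based workaround might look something like this (the 60-second lag is an assumption; it would have to be tuned to the worst-case transaction time on your system, which is exactly what makes this approach fragile):

```sql
declare
    -- only copy up to 60 seconds ago: a guess at the worst-case transaction time
    v_cutoff date default sysdate - (60 / 86400);
    v_copied_up_to_date date;
begin
    select copied_up_to_date
    into v_copied_up_to_date
    from runlog;
    for r in (select *
              from MyTable
              where created_dtm >= v_copied_up_to_date
              and created_dtm < v_cutoff) loop
        null; -- do something with each row
    end loop;
    update runlog
    set copied_up_to_date = v_cutoff;
    commit;
end;
/
```

Any transaction that stays open longer than the chosen lag will still lose records, so the guess has to be revisited whenever the workload changes.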
When a row is updated or inserted but not yet committed, it holds a lock to prevent other sessions modifying it at the same time. When you ask for an Exclusive Lock on a table, Oracle checks for the presence of these locks, and if any exist the Exclusive Lock will not be granted until they all clear. Using Exclusive Locks to solve this problem will not work, however, as the lock blocks new inserts into the table while the process runs, which is often not acceptable.
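To see why, here is roughly what the exclusive lock approach would look like (a sketch, not a recommendation):

```sql
lock table mytable in exclusive mode;
-- At this point every prior transaction on mytable has ended,
-- but no other session can insert into mytable until we commit.
-- ... copy the new rows here ...
commit;  -- releases the exclusive lock
```

The LOCK TABLE statement does wait for all pending transactions on the table to clear, but the session then holds the lock until its own transaction ends, blocking every new insert in the meantime.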
Starting in Oracle 11gR2, a new function has been added to the DBMS_UTILITY package, called WAIT_ON_PENDING_DML.
This function accepts a list of tables and a timeout. When it is called, it blocks until every transaction that had pending DML against those tables at the time of the call has completed. However, it ignores all transactions that start after it was called.
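The function returns a BOOLEAN: TRUE once all the pending transactions on the listed tables have completed, and FALSE if the timeout expires first. A sketch of a call that uses the timeout rather than waiting forever (the timeout value here is illustrative):

```sql
declare
    v_scn number default null;  -- on return, holds the SCN the wait was based on
begin
    -- 'MYTABLE' can be a comma-separated list of tables;
    -- a null timeout (as used later in this article) waits indefinitely
    if dbms_utility.wait_on_pending_dml('MYTABLE', 30, v_scn) then
        dbms_output.put_line('all pending transactions have completed');
    else
        dbms_output.put_line('timed out with transactions still pending');
    end if;
end;
/
```

Checking the return value matters when a timeout is supplied: copying rows after a FALSE return would reintroduce the lost-record problem.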
This is the perfect solution to the problem outlined above. By introducing a call to DBMS_UTILITY.WAIT_ON_PENDING_DML, the code above can be made reliable:
SQL11G> declare
    v_sysdate date default sysdate;
    v_copied_up_to_date date;
    v_scn number default null;
begin
    select copied_up_to_date
    into v_copied_up_to_date
    from runlog;
    if dbms_utility.wait_on_pending_dml('MYTABLE', null, v_scn) then
        for r in (select *
                  from MyTable
                  where created_dtm >= v_copied_up_to_date
                  and created_dtm < v_sysdate) loop
            null; -- do something with each row
        end loop;
        update runlog
        set copied_up_to_date = v_sysdate;
        commit;
    end if;
end;
/
Now, even if the copy process starts at T1 + 5 and the insert at T1 is not committed until T1 + 6, the WAIT_ON_PENDING_DML call will not return until that commit happens, and no records will be lost.
To demonstrate this, create a simple table called MyTable:
SQL11G> create table mytable (id integer, created_dtm date);
Now 3 sessions are required. In session 1, insert a row:
SQL11G> insert into mytable values (1, sysdate);
In session 2, attempt to select all rows with created_dtm less than sysdate:
SQL11G> set serveroutput on
SQL11G> declare
    v_scn number default null;
begin
    if dbms_utility.wait_on_pending_dml('MYTABLE', null, v_scn) then
        for r in (select *
                  from MyTable
                  where created_dtm < sysdate) loop
            dbms_output.put_line(r.id);
        end loop;
    end if;
end;
/
At this point session 2 should block as session 1 still has not committed.
In session 3, insert another record into MyTable:
SQL11G> insert into mytable values (3, sysdate);
Now go back to session 1 and issue a commit. Almost as soon as session 1 commits, session 2 prints the ID of the inserted record and the PL/SQL block completes. The fact that session 3 has an uncommitted record does not affect it: no records are lost, and no unreliable delays were needed to make the code work.
Oracle added this feature to help support Online Redefinition, but as you can see, it can solve a few other problems too, and is a useful tool to have in your toolbox.