Better At Oracle

A collection of tutorials, code and tools to help you get better using Oracle


11 June 2012

Nested Array tables in SQL and Statistics

It is pretty well known, that you can bind an array of values into a query using PLSQL, and have Oracle treat it like a table.

For instance, first create a type:

create or replace type t_varchar2_tab is table of varchar2(50);
/

Then in PLSQL, fill it with some data and run a query using it:

declare
  v_array t_varchar2_tab;
begin
  v_array :=  t_varchar2_tab();
  for i in 1..1 loop
    v_array.extend;
    v_array(i) := 'hello';
  end loop;

  open :ref_cur for
  select /*+ gather_plan_statistics */ * 
  from table(cast(v_array as t_varchar2_tab));
end;
/

print ref_cur;

COLUMN_VALUE
--------------------------------------------------
hello

The first interesting point, is that using the TABLE and CAST functions, we can make this array behave like a single column table inside the query, which means we can join to it.

In order to join this pseudo table to other table, you need to have a name for the column so you can write code like:

select c.*
from customers c, 
     table(cast(v_array as t_varchar2_tab)) cust_list
where c.customer_name = cust_list.???

As far as I know you cannot specify the name of the column when you cast the array into a table, but luckily Oracle creates a name for us, and calls the column 'COLUMN_VALUE', so the code above becomes:

select c.*
from customers c, 
     table(cast(v_array as t_varchar2_tab)) cust_list
where c.customer_name = cust_list.column_value

What about Statistics?

Normally, real tables in your database have fairly accurate statistics against them, allowing the optimizer to make good decisions when generating query plans. So what does it do with these array tables which have no statistics? Normally the optimizer has defaults for situations like this, for example:

explain plan for
select *
from table(cast(:customer_list as t_varchar2_tab));

select * from table(dbms_xplan.display());

------------------------------------------------------------------------------------------
| Id  | Operation                         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |      |  8168 | 16336 |    29   (0)| 00:00:01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH|      |  8168 | 16336 |    29   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

So it would appear that in the absence of any other information, Oracle assumes 8168 entries will be passed in the array, which is a lot if you only plan to pass 10. However, the optimizer is a bit smarter than this.

If I take the code I ran earlier, and this time print the explain plan, we can see something different:

declare
  v_array t_varchar2_tab;
begin
  v_array :=  t_varchar2_tab();
  for i in 1..1 loop
    v_array.extend;
    v_array(i) := 'hello';
  end loop;

  open :ref_cur for
  select /*+ gather_plan_statistics  */ * 
  from table(cast(v_array as t_varchar2_tab));
end;
/

print ref_cur;

COLUMN_VALUE
--------------------------------------------------
hello

select * from table(dbms_xplan.display_cursor(format => 'RUNSTATS_LAST'));

------------------------------------------------------------------------------------------
| Id  | Operation                         | Name | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |      |      1 |        |      1 |00:00:00.01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH|      |      1 |      1 |      1 |00:00:00.01 |
------------------------------------------------------------------------------------------

This time, Oracle seems to have peeked at the bind variable, and noticed there is only a single entry in the array, and hence made a much more sensible estimate about how many rows it will return.

There is still one problem however, if I change the code above to put 100 entries into the array, since the plan is cached in the shared pool, Oracle will still assume only a single row - the classic bind variable peeking problem:

------------------------------------------------------------------------------------------
| Id  | Operation                         | Name | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |      |      1 |        |    100 |00:00:00.01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH|      |      1 |      1 |    100 |00:00:00.01 |
------------------------------------------------------------------------------------------

The Cardinality Hint

Perhaps Adaptive Cursor Sharing will kick in and notice if this difference in row counts causes problems, but for many applications you have a general idea about how many entries your array will tend to receive at run time, and I think this is a good place to use the Cardinality hint to give the optimizer a hand when casting arrays into tables.

The Cardinality hint is one of the so called safe hints, in that is does not dictate a query plan, but gives more information to the optimizer to help it make a better decision. Adding it into our example gives the results below:

declare
  v_array t_varchar2_tab;
begin
  v_array :=  t_varchar2_tab();
  for i in 1..100 loop

    v_array.extend;
    v_array(i) := 'hello';
  end loop;

  open :ref_cur for
  select /*+ gather_plan_statistics cardinality(t, 10) */ * 
  from table(cast(v_array as t_varchar2_tab)) t;
end;
/

print ref_cur;

select * from table(dbms_xplan.display_cursor(format => 'RUNSTATS_LAST'));

------------------------------------------------------------------------------------------
| Id  | Operation                         | Name | Starts | E-Rows | A-Rows |   A-Time   |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |      |      1 |        |    100 |00:00:00.01 |
|   1 |  COLLECTION ITERATOR PICKLER FETCH|      |      1 |     10 |    100 |00:00:00.01 |
------------------------------------------------------------------------------------------

This time, the estimated rows is 10 despite the actual rows being 100. Whether or not this is a sensible thing to do, depends a lot on your application. If one run will have 1 or 2 array entries and another will have several thousand, you should be able to test this against your application to see what happens. Maybe setting the cardinality is still a good idea to help lock in a good plan for the 'normal' case. It is also worth digging into Adaptive Cursor Sharing to see how Oracle handles binding arrays of various sizes.

As usual there is no hard and fast answer - it depends on your application!

blog comments powered by Disqus