Search

Top 60 Oracle Blogs

Recent comments

Statistics on Load

One of the talks I gave recently on the Groundbreaker Yatra tour was on the changes that arrive with the optimizer when you move to 12c and above. We talked about changes to histograms, all of the adaptive features etc, but also, one of the simplest but coolest optimizer improvements when it comes to loading data. When you perform a direct-path insert into an empty table, or a create-table-as-select (CTAS) operation, there is no need to perform an additional DBMS_STATS call to gather statistics at the conclusion of the load. The load process itself handles that for you. Here’s a simple example of that in action


SQL> create table t (id primary key, tag)
  2  as select rownum id, to_char(rownum) tag
  3  from dual connect by level <= 50000;

Table created.

SQL>
SQL> select num_rows from user_tables
  2  where table_name = 'T';

  NUM_ROWS
----------
     50000

1 row selected.

SQL>
SQL> select leaf_blocks
  2  from user_indexes
  3  where table_name = 'T';

LEAF_BLOCKS
-----------
        103

1 row selected.

Of course, many people are now aware of this functionality, so why does it warrant a blog post? Mainly to make readers aware of a boundary case. If the table you are creating or loading is an index-organized table (IOT), then you do not get the statistics gathered automatically.


SQL> drop table t purge;

Table dropped.

SQL>
SQL> create table t (id primary key, tag)
  2  organization index
  3  as select rownum id, to_char(rownum) tag
  4  from dual connect by level <= 50000;

Table created.

SQL>
SQL> select num_rows from user_tables
  2  where table_name = 'T';

  NUM_ROWS
----------


1 row selected.

SQL>
SQL> select leaf_blocks
  2  from user_indexes
  3  where table_name = 'T';

LEAF_BLOCKS
-----------


1 row selected.

Initially I suspected that the reason for this is that since the statistics that are being picked up on the fly are being collected on the source data, it only becomes possible for those statistics to be transposed if the target table structure is the same as the source. But if that were (strictly) the case, then other examples would exhibit a similar restriction, such as going from compressed data to uncompressed or vice versa. But you can see that the CTAS on load statistics are fine in this circumstance:


SQL> create table t1 (id primary key, tag)
  2  compress
  3  as select * from t;

Table created.

SQL>
SQL> select num_rows from user_tables
  2  where table_name = 'T1';

  NUM_ROWS
----------
     99999

SQL>
SQL> select leaf_blocks
  2  from user_indexes
  3  where table_name = 'T1';

LEAF_BLOCKS
-----------
        208

Similarly, even if I am copying from an identical index-organized table, the statistics will not be collected.


SQL> create table t2 (id primary key, tag)
  2  organization index
  3  as select * from t;

Table created.

SQL> select num_rows from user_tables
  2  where table_name = 'T2';

  NUM_ROWS
----------


SQL>
SQL> select leaf_blocks
  2  from user_indexes
  3  where table_name = 'T2';

LEAF_BLOCKS
-----------

Whatever the root cause is, just be aware, that if you are direct-path loading IOTs, then make sure you take the time to to also collect statistics on that table before you start querying it in earnest.