Search

Top 60 Oracle Blogs

Recent comments

Index Sanity

By popular demand (well, one person emailed me to ask for it) I’m going to publish the source code for a little demo I’ve been giving since the beginning of the millennium – it concerns indexes and the potential side effects that you can get when you drop an index that you’re “not using”. I think I’ve mentioned the effect several times in the history of this blog, but I can’t find an explicit piece of demo code, so here it is – starting at the conclusion – as a cut and paste from an SQL*Plus session running against an 11g instance:


SQL> set autotrace traceonly explain
select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 1184213596

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  1 |  HASH JOIN          |      |   484 | 64856 |   227   (2)| 00:00:02 |
|*  2 |   HASH JOIN         |      |   484 | 57596 |    14   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T2   |    20 |  1160 |     4   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T1   |   484 | 29524 |    10   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL | T3   |  5000 | 75000 |   213   (2)| 00:00:02 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T3"."ID"="T1"."ID1")
   2 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD"
              AND "T2"."ID2"="T1"."ID2")
   4 - filter("T1"."N1"<=50 AND "T1"."N1">=40)

SQL> drop index t2_i1;

Index dropped.

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
 11  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 2290830436

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   1 |  NESTED LOOPS                |       |    52 |  6968 |    67   (2)| 00:00:01 |
|   2 |   NESTED LOOPS               |       |    52 |  6968 |    67   (2)| 00:00:01 |
|*  3 |    HASH JOIN                 |       |    52 |  6188 |    14   (0)| 00:00:01 |
|   4 |     TABLE ACCESS FULL        | T2    |    20 |  1160 |     4   (0)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL        | T1    |   484 | 29524 |    10   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN         | T3_PK |     1 |       |     0   (0)| 00:00:01 |
|   7 |   TABLE ACCESS BY INDEX ROWID| T3    |     1 |    15 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("T2"."ID1"="T1"."ID1" AND "T2"."IND_PAD"="T1"."IND_PAD" AND
              "T2"."ID2"="T1"."ID2")
   5 - filter("T1"."N1"<=50 AND "T1"."N1">=40)
   6 - access("T3"."ID"="T1"."ID1")

Starting from the top – I’ve enabled autotrace which, technically, could mean that the plans are not the ones I’d see at run-time, but you can take my word for it that in 11g they are the run-time plans; then I’ve supplied a query that produces a plan with 3 full tablescans, two hash joins, and no index usage at all.

You’ll notice at operation 3 of the plan that table t2 is very small – only 20 rows selected, with no predicates that could have filtered that result down from a large table (take my word for it the stats have just been collected) so, as the ancient mythology would have it, we don’t really need an index on that table (a quick check tells me that the index wasn’t there to enforce uniqueness). Immediately after the first execution plan you can see that I’ve dropped an index called t2_i1 – trust me that IS the index on table t2.

We “run” the original query again, it gets re-optimised (and there’s no question of cardinality feedback or any other feature coming into play) and we get a different plan.

Dropping, or adding, a multi-column index to a table could change execution plans – even if the index is not present in the plan.

The reason for this is the “index sanity check”. When the optimizer is doing its cardinality estimates, if it see equality conditions on the set of columns that make up an index it can use the distinct_keys statistic from the index in the calculation rather than using the standard calculation of multiplying together the num_distinct of the separate columns. In earlier versions of Oracle there were some restrictions about uniqueness, but the limitations were removed in 11.1.0.7.

In my case there were 10 distinct values for id1, just one value for ind_pad, and 20 distinct values for id2 – but a total of only 20 distinct values for the combination. With an index in place on the combination the optimizer used the value 20 in its calculation, in the absence of the index it used the value 200 – that factor of 10 led to a drop in the join cardinality estimate from 484 rows to 52 rows – at which point the optimizer calculations made the next step in the plan change from a hash join to a nested loop join.

If you want to reproduce the demo, here’s the full script – the data isn’t a realistic data set, and I’ve had to use various non-standard settings to make the script as repeatable as possible – I’ve built the data set in a tablespace using an 8KB block size, 1MB uniform extents and manual (freelist) segment space management.


rem
rem     Script:         index_sanity.sql
rem     Author:         Jonathan Lewis
rem

drop table t3;
drop table t2;
drop table t1;

execute dbms_random.seed(0);

begin   
        begin           execute immediate 'purge recyclebin';
        exception       when others then null;
        end; 

        begin
                dbms_stats.set_system_stats('MBRC',16);
                dbms_stats.set_system_stats('MREADTIM',10);
                dbms_stats.set_system_stats('SREADTIM',5);
                dbms_stats.set_system_stats('CPUSPEED',1000);
        exception
                when others then null;
        end;

end;
/

create table t1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1,
        lpad(rownum,10,'0')     small_vc,
        rpad('x',50)            padding
from
        all_objects
where
        rownum  <= 4000
;

create table t2 
pctfree 99
pctused 1
as
select
        mod(rownum,10)          id1,
        mod(rownum,20)          id2,
        rpad('x',40,'x')        ind_pad,
        mod(rownum,100)         n1, 
        lpad(rownum,10,'0')     small_vc,
        rpad('x',200)           padding
from
        all_objects
where
        rownum <= 20
;

create table t3
pctfree 95
pctused 1
as
select
        rownum          id,
        rpad(rownum,10) small_vc,
        rpad('x',100)   padding
from
        all_objects
where
        rownum <= 5000
;
begin
        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T1',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T2',
                method_opt => 'for all columns size 1'
        );

        dbms_stats.gather_table_stats(
                ownname => user,
                tabname => 'T3',
                method_opt => 'for all columns size 1'
        );

end;
/

create        index t1_i1 on t1(id1, ind_pad, id2) pctfree 91;
create        index t2_i1 on t2(id1, ind_pad, id2) pctfree 91;
alter table t3 add constraint t3_pk primary key (id);

set autotrace traceonly explain

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

-- alter index t1_i1 invisible;
-- alter index t2_i1 invisible;

drop index t1_i1;
-- drop index t2_i1;

accept X prompt "Press return to coninue"

select
        t1.small_vc, t2.small_vc, t3.small_vc
from
        t1, t2, t3
where
        t1.n1 between 40 and 50
and     t2.id1 = t1.id1
and     t2.ind_pad = t1.ind_pad
and     t2.id2 = t1.id2
and     t3.id = t1.id1
;

set autotrace off

You’ll notice from the commented lines in the above that the effect appears whether you drop the index or make it invisible, also that there’s a similar index on the t1 table that matches the index on the t2 table – I could get the effect from dropping or making invisible either index.

There is a saving grace in 11g – if I do drop, or make invisible, one of these indexes I can protect myself against the statistical effect by create a column group on the same set of columns, and the num_distinct from the column group would serve the same purpose as the distinct_keys from the index.