Search

Top 60 Oracle Blogs

Recent comments

Quiz Night

I have a table with several indexes on it, and I have two versions of a query that I might run against that table. Examine them carefully, then come up with some plausible reason why it’s possible (with no intervening DDL, DML, stats collection, parameter fiddling etc., etc., etc.) for the second form of the query to be inherently more efficient than the first.


select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 between 1 and 3
;

prompt  ===========
prompt  Split query
prompt  ===========

select
        bit_1, id, small_vc, rowid
from
        bit_tab
where
        bit_1 = 1
or      bit_1 > 1 and bit_1 <= 3
;

Update / Answers

I avoided giving any details about the data and indexes in this example as I wanted to allow free rein to readers’ imagination  – and I haven’t been disappointed with the resulting suggestions. The general principles of allowing more options to the optimizer, effects of partitioning, and effects of skew are all worth considering when the optimizer CAN’T use an execution path that you think makes sense.  (Note: I didn’t say make it clear in my original question, but I wasn’t looking for cases where you could get a better path by hinting (or profiling) I was after cases where Oracle literally could not do what you wanted.)

The specific strategy I was thinking of when I posed the question was based on a follow-up to some experiments I had done with the cluster_by_rowid() hint. and (there was a little hint in the “several indexes” and more particularly the column name “bit_1″) I was looking at a data warehouse table with a number of bitmap indexes. So here’s the execution plan for the first version of the query  when there’s a simple bitmap index on bit_1.


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   600 | 18000 |    96 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   600 | 18000 |    96 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|*  3 |    BITMAP INDEX RANGE SCAN   | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("BIT_1">=1 AND "BIT_1"<=3)

And here’s the plan for the second query:


------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |   560 | 16800 |    91 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIT_TAB |   560 | 16800 |    91 |
|   2 |   BITMAP CONVERSION TO ROWIDS|         |       |       |       |
|   3 |    BITMAP OR                 |         |       |       |       |
|*  4 |     BITMAP INDEX SINGLE VALUE| BT1     |       |       |       |
|   5 |     BITMAP MERGE             |         |       |       |       |
|*  6 |      BITMAP INDEX RANGE SCAN | BT1     |       |       |       |
------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("BIT_1"=1)
   6 - access("BIT_1">1 AND "BIT_1"<=3)

Clearly the second plan is more complex than the first – moreover the added complexity had resulted in the optimizer getting a different cardinality estimate – but, with my data set, there’s a potential efficiency gain. Notice how lines 5 and 6 show a bitmap range scan followed by a bitmap merge: to do the merge Oracle has to “superimpose” the bitmaps for the different key values in the range scan to produce a single bitmap that it can then OR with the bitmap for bit_1 = 1 (“bitmap merge” is effectively the same as “bitmap or” except all the bitmaps come from the same index). The result of this is that when we convert to rowids the rowids are in table order. You can see the consequences in the ordering of the result set or, more importantly for my demo, in the autotrace statistics:


For the original query:
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        604  consistent gets
          0  physical reads
          0  redo size
      27153  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed


For the modified query
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        218  consistent gets
          0  physical reads
          0  redo size
      26714  bytes sent via SQL*Net to client
        777  bytes received via SQL*Net from client
         25  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        600  rows processed

Note, particularly, the change in the number of consistent gets. Each table block I visited held two or three rows that I needed, in the first query I visit the data in order of (bit_1, rowid) and get each table block 3 time; in the second case I visit the data in order of rowid and only get each table block once (with a “buffer is pinned count” for subsequent rows from the same block).

Here’s the starting output from each query, I’ve added the rowid to the original select statements so that you can see the block ordering:


Original query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         1         12 12         AAAmeCAAFAAAAECAAB
         1         22 22         AAAmeCAAFAAAAEDAAB
         1         32 32         AAAmeCAAFAAAAEEAAB
         1         42 42         AAAmeCAAFAAAAEFAAB
         1         52 52         AAAmeCAAFAAAAEGAAB

Modified query
     BIT_1         ID SMALL_VC   ROWID
---------- ---------- ---------- ------------------
         1          2 2          AAAmeCAAFAAAAEBAAB
         2          3 3          AAAmeCAAFAAAAEBAAC
         3          4 4          AAAmeCAAFAAAAEBAAD
         1         12 12         AAAmeCAAFAAAAECAAB
         2         13 13         AAAmeCAAFAAAAECAAC
         3         14 14         AAAmeCAAFAAAAECAAD
         1         22 22         AAAmeCAAFAAAAEDAAB
         2         23 23         AAAmeCAAFAAAAEDAAC
         3         24 24         AAAmeCAAFAAAAEDAAD

By rewriting the query I’ve managed to force a “cluster by rowid” on the data access. Of course, the simpler solution would be to add the /*+ cluster_by_rowid() */ hint to the original query – but it doesn’t work for bitmap indexes, and when I found that it worked for B-tree indexes the next test I did was to try a single bitmap index, which resulted in my writing this note.

Footnote: I don’t really expect Oracle Corp. to modify their code to make the hint work with bitmaps, after all it’s only relevant in the special case of using a bitmap index with a range scan and no subsequent bitmap AND/OR/MINUS operations where it would be needed – and you’re not really expected to use a single bitmap index to access a table, we engineer bitmaps to take advantage of combinations.