Subquery Factoring (10)

What prompted me to write my previous note about subquery factoring was an upgrade to 12c, and a check that a few critical queries would not do something nasty on the upgrade. As ever, it’s interesting how many little oddities you can discover while looking closely at some little detail of how the optimizer works. Here’s an oddity that came up in the course of my investigation in 12.1.0.2. First, some sample data:


create table t1
nologging
as
select * from all_objects;

create index t1_i1 on t1(owner) compress nologging;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1 for columns owner size 254'
        );
end;
/
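
To confirm what the sample data looks like, a quick check along the following lines should show how the rows are spread across owner and whether the histogram requested in method_opt was actually created. This is only a sketch: the column aliases are illustrative, and the output will differ from system to system because all_objects varies with version and installed options.

```sql
-- Rows per owner, largest first, with each owner's share of the table
select
        owner,
        count(*)                                            as row_count,
        round(100 * ratio_to_report(count(*)) over (), 2)   as pct_of_rows
from
        t1
group by
        owner
order by
        count(*) desc
;

-- Column statistics for owner: distinct values and histogram type
select
        column_name, num_distinct, num_buckets, histogram
from
        user_tab_col_statistics
where
        table_name  = 'T1'
and     column_name = 'OWNER'
;
```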

The all_objects view is convenient as a tool for modelling what I wanted to do since it has a column with a small number of distinct values and an extreme skew across those values. Here’s a slightly weird query that shows an odd costing effect:


with v1 as (
        select /*+ inline */ owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

Since the query uses the factored subquery twice and there’s a predicate on the subquery definition, I expect to see materialization, and that’s what happens when the /*+ inline */ hint is taken out (even though I’ve engineered the query so that materialization is more expensive than executing inline). Here are the two plans from 12.1.0.2, the first without the hint and the second with it (the same pattern appears in 11.2.0.4, though the costs are a little less across the board):


=======================
Unhinted (materializes)
=======================

---------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |     2 |   132 |    25  (20)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |            |          |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D661B_876C2CB |       |       |            |          |
|*  3 |    INDEX FAST FULL SCAN    | T1_I1                      | 85084 |   498K|    21  (15)| 00:00:01 |
|   4 |   UNION-ALL                |                            |       |       |            |          |
|   5 |    SORT AGGREGATE          |                            |     1 |    66 |            |          |
|*  6 |     VIEW                   |                            | 85084 |  5483K|    13  (24)| 00:00:01 |
|   7 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D661B_876C2CB | 85084 |   498K|    13  (24)| 00:00:01 |
|   8 |    SORT AGGREGATE          |                            |     1 |    66 |            |          |
|*  9 |     VIEW                   |                            | 85084 |  5483K|    13  (24)| 00:00:01 |
|  10 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D661B_876C2CB | 85084 |   498K|    13  (24)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

=============
Forced inline
=============

--------------------------------------------------------------------------------
| Id  | Operation              | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |       |     2 |    12 |    22  (14)| 00:00:01 |
|   1 |  UNION-ALL             |       |       |       |            |          |
|   2 |   SORT AGGREGATE       |       |     1 |     6 |            |          |
|*  3 |    INDEX FAST FULL SCAN| T1_I1 | 38784 |   227K|    21  (15)| 00:00:01 |
|   4 |   SORT AGGREGATE       |       |     1 |     6 |            |          |
|*  5 |    INDEX RANGE SCAN    | T1_I1 |   551 |  3306 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
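
The note does not say how the two plans were produced. One common way, sketched here as an assumption rather than as the method actually used, is explain plan followed by a call to dbms_xplan.display, run once without the hint and once with it:

```sql
-- Unhinted version: expected to show the TEMP TABLE TRANSFORMATION
explain plan for
with v1 as (
        select owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

select * from table(dbms_xplan.display);

-- Hinted version: the inline hint stops the subquery being materialized
explain plan for
with v1 as (
        select /*+ inline */ owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

select * from table(dbms_xplan.display);
```

The row estimates and costs will depend on the contents of all_objects on the system where this is run, so the figures are unlikely to match the plans above exactly.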

I’m not surprised that the optimizer materialized the subquery: as I pointed out in my previous article, the choice seems to be rule-based (heuristic) rather than cost-based. What surprises me is that the cost for the default plan is not self-consistent; the optimizer seems to have lost the cost of generating the temporary table. The cost of the materialized query plan looks as if it ought to be 21 + 13 + 13 = 47. Even if the optimizer were smart enough to assume that the temporary table would be in the cache for the second scan (and therefore virtually free to access) we ought to see a cost of 21 + 13 = 34. As it is we have a cost of 25, which is just the two scans of the temporary table: nominally 13 + 13 or, if you check the 10053 trace file, 12.65 + 12.65 = 25.3, which rounds to 25.
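
For anyone who wants to see the unrounded figures, the following is a minimal sketch of how a 10053 trace could be captured for the unhinted query, followed by the arithmetic from the paragraph above. The tracefile_identifier value is an arbitrary label chosen for illustration, and the session needs the privileges to set events.

```sql
-- Tag the trace file so it is easy to find (the label is arbitrary)
alter session set tracefile_identifier = 'subq_cost';

-- Enable the optimizer (10053) trace for this session
alter session set events '10053 trace name context forever, level 1';

explain plan for
with v1 as (
        select owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

-- Switch the trace off again
alter session set events '10053 trace name context off';

-- The three candidate totals discussed above: 47, 34 and 25
select
        21 + 13 + 13            as load_plus_two_scans,
        21 + 13                 as load_plus_one_scan,
        round(12.65 + 12.65)    as two_scans_only
from
        dual
;
```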

Since the choice to materialize doesn’t seem to be cost-based (at present) this doesn’t really matter – but it’s always nice to see, and be able to understand, self-consistent figures in an execution plan.

Footnote

It is worth pointing out as a side note that materialization can actually be more expensive than running inline, even for very simple examples. Over recent releases subquery factoring seems to have become more robust, producing more consistent execution plans when the subqueries are put back inline, but you still need to think a little before rewriting a query for cosmetic (i.e. totally valid “readability”) reasons, and check whether the resulting query is going to produce an unexpected, and unexpectedly expensive, materialization.
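
Since the optimizer's cost for the materialized plan is not self-consistent, one way to compare the two strategies is to look at the work actually done at run time. The sketch below assumes a SQL*Plus session and uses the materialize and inline hints to force each strategy in turn; it is an illustration of the approach, not a measurement taken from this note.

```sql
-- SQL*Plus setting so that display_cursor reports the query just run
set serveroutput off

-- Collect row-source execution statistics for this session
alter session set statistics_level = all;

-- Forced materialization
with v1 as (
        select /*+ materialize */ owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));

-- Forced inline
with v1 as (
        select /*+ inline */ owner from t1 where owner > 'A'
)
select count(*) from v1 where owner = 'SYS'
union all
select count(*) from v1 where owner = 'SYSTEM'
;

select * from table(dbms_xplan.display_cursor(null, null, 'allstats last'));
```

Comparing the Buffers, Reads, Writes and A-Time columns of the two reports should show whether writing and re-reading the temporary table costs more than scanning the index twice on a given system.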