With Subquery()

Here’s a little oddity that came up recently on the OTN database forum – an example where a “with” subquery (common table expression / factored subquery) produced a different execution plan from the equivalent statement with the subquery moved to an inline view; tested in 12.1.0.2 and 12.2.0.1. Here are the two variations:

with  tbl as (
          select 1 col1, 'a'  col2 from dual
union all select 2 , 'a' from dual
union all select 3 , 'b' from dual
union all select 4 , 'a' from dual
union all select 5 , 'a' from dual
union all select 6 , 'b' from dual
union all select 7 , 'b' from dual
),
lag_data as (
        select col1, col2, lag(col2) over (order by col1) col2a from tbl
)
select  col1, col2
from    lag_data
where   col2a is null or col2a <> col2
order by col1
;

with  tbl as (
          select 1 col1, 'a'  col2 from dual
union all select 2 , 'a' from dual
union all select 3 , 'b' from dual
union all select 4 , 'a' from dual
union all select 5 , 'a' from dual
union all select 6 , 'b' from dual
union all select 7 , 'b' from dual
)
select  col1, col2
from    (
        select col1, col2, lag(col2) over (order by col1) col2a from tbl
        )
where   col2a is null or col2a <> col2
order by col1
;

You’ll notice that there’s an explicit “order by” clause at the end of both queries. If you want the result set to appear in a specific order you should always specify the order and not assume that it will appear as a side effect; but in this case the ordering for the “order by” clause seems to match the ordering needed for the analytic function, so we might hope that the optimizer would take advantage of the analytic “window sort” and not bother with a “sort order by” clause. But here are the two plans – first with subquery factoring, then with the inline view:


-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    16 (100)|          |
|   1 |  SORT ORDER BY   |      |     7 |    56 |    16  (13)| 00:00:01 |
|*  2 |   VIEW           |      |     7 |    56 |    15   (7)| 00:00:01 |
|   3 |    WINDOW SORT   |      |     7 |    42 |    15   (7)| 00:00:01 |
|   4 |     VIEW         |      |     7 |    42 |    14   (0)| 00:00:01 |
|   5 |      UNION-ALL   |      |       |       |            |          |
|   6 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  11 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
|  12 |       FAST DUAL  |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("COL2A" IS NULL OR "COL2A"<>"COL2"



-------------------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |       |    15 (100)|          |
|*  1 |  VIEW            |      |     7 |    56 |    15   (7)| 00:00:01 |
|   2 |   WINDOW SORT    |      |     7 |    42 |    15   (7)| 00:00:01 |
|   3 |    VIEW          |      |     7 |    42 |    14   (0)| 00:00:01 |
|   4 |     UNION-ALL    |      |       |       |            |          |
|   5 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   6 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   7 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   8 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|   9 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|  10 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
|  11 |      FAST DUAL   |      |     1 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(("COL2A" IS NULL OR "COL2A"<>"COL2"))

The two plans are different, and the difference is an extra “sort order by” operation even though the optimizer has moved the subquery with the analtyic function inline so that in principle both statements are the technically the same and merely cosmetically different.

It’s been some time since I’ve noticed subquery factoring resulting in a change in plan when the expected effect is purely cosmetic. Interestingly, though, the “unparsed query” in the 10053 (CBO) trace file is the same for the two cases with the only difference being the name of a generated view:


SELECT  
        "LAG_DATA"."COL1" "COL1","LAG_DATA"."COL2" "COL2" 
FROM    (SELECT 
                "TBL"."COL1" "COL1","TBL"."COL2" "COL2",
                DECODE(
                        COUNT(*) OVER ( ORDER BY "TBL"."COL1" ROWS  BETWEEN 1 PRECEDING  AND 1 PRECEDING ),
                        1, FIRST_VALUE("TBL"."COL2") OVER ( ORDER BY "TBL"."COL1" ROWS  BETWEEN 1 PRECEDING  AND 1 PRECEDING ),
                           NULL
                ) "COL2A" 
        FROM    (
                            (SELECT 1 "COL1",'a' "COL2" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 2 "2",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 3 "3",'b' "'B'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 4 "4",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 5 "5",'a' "'A'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 6 "6",'b' "'B'" FROM "SYS"."DUAL" "DUAL") 
                 UNION ALL  (SELECT 7 "7",'b' "'B'" FROM "SYS"."DUAL" "DUAL")
                ) "TBL"
        ) "LAG_DATA" 
WHERE 
        "LAG_DATA"."COL2A" IS NULL OR "LAG_DATA"."COL2A"<>"LAG_DATA"."COL2" 
ORDER BY "LAG_DATA"."COL1"

The above is the unparsed query for the query with two factored subqueries; the only difference in the unparsed query when I moved the analytic subquery inline was that the view name in the above text changed from “LAG_DATA” to “from$_subquery$_008”.

Footnote:

When I used a real table (with the same data) instead of a “union all” factored subquery for the driving data this anomaly disappeared. The union all is a convenient dirty trick for generating very small test data sets on OTN – it remains to be seen whether a more realistic example of multiple factored subqueries would still result in the optimizer losing an opportunity for eliminating a “sort order by” operation.

In passing – did you notice how the optimizer had managed to rewrite a “lag()” analytic function as a form of “first_value()” function with decode ?