When automatic reoptimization plan is less efficient

11gR2 started to have the optimizer react at execution time when a misestimate is encountered. Then the next executions are re-optimized with more accurate estimation, derived from the execution statistics. This was called cardinality feedback. Unfortunately, in rare cases we had a fast execution plan with bad estimations, and better estimations lead to worse execution plan. This is rare, but even when 9999 queries are faster, the one that takes too long will gives a bad perception of this optimizer feature.
This feature has been improved in 12cR1 with new names: auto-reoptimization and statistics feedback. I’m showing an example here in 12.1.0.2 without adaptive statistics (the 12.2 backport) and I’ve also disabled adaptive plan because they show the wrong numbers (similar to what I described in this post). I’ll show that at one point, the re-optimization can go back to the initial plan if it was the best in execution time.

V$SQL

Basically, here is what happened: first execution was fast, but with actual number of rows far from the estimated ones. Auto-reoptimisation kicks in for next execution and get a new plan, but with longer execution time. Third execution is another re-optimization, leading to same bad plan. Finally starting at 4th execution, the time is back to reasonable and we see the same as the first plan is used:

SQL> select sql_id,child_number,plan_hash_value,is_reoptimizable,is_resolved_adaptive_plan,parse_calls,executions,elapsed_time/1e6
from v$sql where sql_id='b4rhzfw7d6vdp';
 
SQL_ID CHILD_NUMBER PLAN_HASH_VALUE I I PARSE_CALLS EXECUTIONS ELAPSED_TIME/1E6
------------- ------------ --------------- - - ----------- ---------- ----------------
b4rhzfw7d6vdp 0 1894156093 Y 1 1 .346571
b4rhzfw7d6vdp 1 955499861 Y 1 1 5.173733
b4rhzfw7d6vdp 2 955499861 Y 1 1 4.772258
b4rhzfw7d6vdp 3 1894156093 N 7 7 .5008

The scope of statistic feedback is not to get optimal execution from the first execution. This requires accurate statistics, static or dynamic, and SQL Plan Directives is a try to get that. Statistics feedback goal is to try to get a better plan rather than re-use one that is based on misestimates. But sometimes the better is the enemy of the good and we have an example here in child cursors 1 and 2. But the good thing is that finally we are back to acceptable execution time, with a final plan that can be re-used without re-optimization.

What surprised me here is that the final plan has the same hash value than the initial one. Is it a coincidence that different estimations gives the same plan? Or did the optimizer finally gave up to try to find better?

V$SQL_REOPTIMIZATION_HINTS

In 12c the statistics feedback are exposed in V$SQL_REOPTIMIZATION_HINTS.

SQL> select sql_id,child_number,hint_text,client_id,reparse from v$sql_reoptimization_hints where sql_id='b4rhzfw7d6vdp';
 
SQL_ID CHILD_NUMBER HINT_TEXT CLIENT_ID REPARSE
------------- ------------ ---------------------------------------------------------------------------------------------------- ---------- ----------
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) 1 1
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 0 OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) 1 0
b4rhzfw7d6vdp 1 OPT_ESTIMATE (@"SEL$582FA660" QUERY_BLOCK ROWS=1491.000000 ) 1 1

The child cursor 0 was re-optimized to cursor 1 with different number of rows for “DM_FOLDER_R1″ and “DM_SYSOBJECT_R2″
The child cursor 1 has the same values, but an additional number of row correction for a query block.

But we don’t see anything about cursor 2. It was re-optimizable, and was actually re-optimized into cursor 3 but no statistics corrections are displayed here.

Trace

As it is a reproducible case, I’ve run the same while tracing 10046, 10053 and 10507 (level 512) to get all information about SQL execution, Optimiser compilation, and statistics feedback. For each child cursor, I’ll show the execution plan with estimated and actual number of rows (E-Rows and A-Rows) and then some interesting lines from the trace, mainly those returned by:
grep -E "KKSMEC|^atom_hint|^@"

Child cursor 0 – plan 1894156093 – 0.34 seconds

Plan hash value: 1894156093
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 171 (100)| 1 |00:00:00.04 | 17679 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 17679 |
| 2 | NESTED LOOPS | | 1 | 1 | 116 (0)| 1491 |00:00:00.04 | 17679 |
| 3 | NESTED LOOPS | | 1 | 1 | 115 (0)| 1491 |00:00:00.04 | 17456 |
| 4 | NESTED LOOPS | | 1 | 49 | 17 (0)| 5648 |00:00:00.01 | 537 |
|* 5 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 3 | 3 (0)| 1517 |00:00:00.01 | 13 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED | DM_SYSOBJECT_R | 1517 | 16 | 10 (0)| 5648 |00:00:00.01 | 524 |
|* 7 | INDEX RANGE SCAN | D_1F0049A880000010 | 1517 | 71 | 2 (0)| 5648 |00:00:00.01 | 249 |
|* 8 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 5648 | 1 | 2 (0)| 1491 |00:00:00.03 | 16919 |
|* 9 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 5648 | 1 | 1 (0)| 1491 |00:00:00.03 | 15428 |
| 10 | NESTED LOOPS SEMI | | 5648 | 2 | 25 (0)| 1491 |00:00:00.02 | 14828 |
| 11 | NESTED LOOPS | | 5648 | 7 | 18 (0)| 2981 |00:00:00.02 | 12869 |
| 12 | TABLE ACCESS BY INDEX ROWID BATCHED| DM_SYSOBJECT_R | 5648 | 71 | 4 (0)| 2981 |00:00:00.01 | 7747 |
|* 13 | INDEX RANGE SCAN | D_1F0049A880000010 | 5648 | 16 | 3 (0)| 2981 |00:00:00.01 | 6145 |
|* 14 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 2981 | 1 | 2 (0)| 2981 |00:00:00.01 | 5122 |
|* 15 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 2981 | 1 | 1 (0)| 2981 |00:00:00.01 | 2140 |
|* 16 | INDEX UNIQUE SCAN | D_1F0049A880000145 | 2981 | 52759 | 1 (0)| 1491 |00:00:00.01 | 1959 |
|* 17 | INDEX UNIQUE SCAN | D_1F0049A880000142 | 1491 | 1 | 1 (0)| 1491 |00:00:00.01 | 223 |
----------------------------------------------------------------------------------------------------------------------------------------

Because of low cardinality estimation of DM_SYSOBJECT_R predicate (E-Rows=3) the optimizer goes to NESTED LOOP. This plan has good execution time here because all blocks are in buffer cache. Reading 17679 blocks from buffer cache takes less than one second. It would have been much longer if those were physical I/O.

This is a case where the optimizer detects misestimate at execution time. Here is what is recorded in the trace:

Reparsing due to card est...
@=0x63a56820 type=3 nodeid=5 monitor=Y halias="DM_FOLDER_R1" loc="SEL$1" oname="SEL$F5BB74E1" act=1517 min=0 est=3 next=(nil)
Reparsing due to card est...
@=0x638fe2b0 type=5 nodeid=4 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" act=5648 min=0 est=49 next=0x638fe250
Reparsing due to card est...
@=0x638fe4c0 type=5 nodeid=3 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" "SEL$3"@"SEL$1" act=1491 min=0 est=1 next=0x638fe460
Reparsing due to card est...
@=0x638fe688 type=5 nodeid=2 monitor=Y halias="" loc="SEL$F5BB74E1" onames="SEL$07BDC5B4"@"SEL$5" "SEL$2"@"SEL$5" "SEL$3"@"SEL$1" "R_OBJECT_ID"@"SEL$1" act=1491 min=0 est=1 next=0x638fe5f8
kkocfbCheckCardEst [sql_id=b4rhzfw7d6vdp] reparse=y ecs=n efb=n ost=n fbs=n

Those are the misestimates which triggers re-optimization.

And here are all statistics feedback.

*********** Begin Dump Context (kkocfbCheckCardEst) [sql_id=b4rhzfw7d6vdp cpcnt=0] ***********
@=0x638fe688 type=5 nodeid=2 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" "TE_"@"SEL$2" "LJ_"@"SEL$2" act=1491 min=0 est=1 next=0x638fe5f8
@=0x638fe5f8 type=3 nodeid=17 monitor=Y halias="LJ_" loc="SEL$2" oname="D_1F0049A880000142" act=0 min=1 est=1 next=0x638fe4c0
@=0x638fe4c0 type=5 nodeid=3 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" "TE_"@"SEL$2" act=1491 min=0 est=1 next=0x638fe460
@=0x638fe460 type=1 nodeid=8 monitor=Y halias="TE_" loc="SEL$2" act=0 min=1 est=1 next=0x638fe3d0
@=0x638fe3d0 type=3 nodeid=9 monitor=Y halias="TE_" loc="SEL$2" oname="D_1F0049A880000108" act=0 min=1 est=1 next=0x638fe2b0
@=0x638fe2b0 type=5 nodeid=4 monitor=Y halias="" loc="SEL$F5BB74E1" onames="DM_FOLDER_R1"@"SEL$1" "DM_SYSOBJECT_R2"@"SEL$1" act=5648 min=0 est=49 next=0x638fe250
@=0x638fe250 type=1 nodeid=6 monitor=Y halias="DM_SYSOBJECT_R2" loc="SEL$1" act=3 min=1 est=16 next=0x638fe1c0
@=0x638fe1c0 type=3 nodeid=7 monitor=Y halias="DM_SYSOBJECT_R2" loc="SEL$1" oname="D_1F0049A880000010" act=3 min=1 est=71 next=0x63a56820
@=0x63a56820 type=3 nodeid=5 monitor=Y halias="DM_FOLDER_R1" loc="SEL$1" oname="D_1F0049A880000016" act=1517 min=0 est=3 next=(nil)
*********** End Dump Context ***********

We also see some information about execution performance:

kkoarCopyCtx: [sql_id=b4rhzfw7d6vdp] origin=CFB old=0x63a565d0 new=0x7fe74e2153f0 copyCnt=1 copyClient=y
**************************************************************
kkocfbCopyBestEst: Best Stats
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
*********** Begin Dump Context (kkocfbCopyBestEst) **********
*********** End Dump Context ***********

They are labeled as ‘Best Stats’ because we had only one execution at that time.

Finally, the hints are dumped:

******** Begin CFB Hints [sql_id=b4rhzfw7d6vdp] xsc=0x7fe74e215748 ********
Dumping Hints
=============
atom_hint=(@=0x7fe74e21ebf0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21e758 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21e3f0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21dfd0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21dc68 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) )
atom_hint=(@=0x7fe74e21d8c8 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) )
********** End CFB Hints **********

Those are exactly what we see in V$SQL_REOPTIMIZATION_HINTS

This is all what we see for this first execution. The next execution starts with:

KKSMEC: Invalidating old cursor 0 with hash val = 1894156093
KKSMEC: Produced New cursor 1 with hash val = 955499861

As a consequence of child cursor 0 marked as reoptimizable, the next execution invalidates it and creates a new child cursor 1.

Child cursor 1 – new plan 955499861 – 5.17 seconds

Here is the new plan we see after that second execution:

Plan hash value: 955499861
------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 30996 (100)| 1 |00:00:04.58 | 102K| 101K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:04.58 | 102K| 101K| | | |
| 2 | VIEW | VM_NWVW_2 | 1 | 12039 | 30996 (1)| 1491 |00:00:04.58 | 102K| 101K| | | |
| 3 | HASH UNIQUE | | 1 | 12039 | 30996 (1)| 1491 |00:00:04.58 | 102K| 101K| 941K| 941K| 2597K (0)|
|* 4 | HASH JOIN RIGHT SEMI | | 1 | 12039 | 30490 (1)| 4132 |00:00:04.57 | 102K| 101K| 12M| 3867K| 14M (0)|
| 5 | TABLE ACCESS FULL | DM_DOCUMENT_S | 1 | 213K| 210 (1)| 213K|00:00:00.01 | 741 | 0 | | | |
|* 6 | HASH JOIN | | 1 | 36463 | 29665 (1)| 5622 |00:00:04.51 | 101K| 101K| 1405K| 1183K| 2026K (0)|
|* 7 | HASH JOIN | | 1 | 36463 | 18397 (1)| 5622 |00:00:02.23 | 65103 | 65050 | 940K| 940K| 1339K (0)|
|* 8 | HASH JOIN | | 1 | 2222 | 14489 (1)| 1499 |00:00:01.58 | 51413 | 51369 | 992K| 992K| 1377K (0)|
|* 9 | HASH JOIN | | 1 | 2222 | 14120 (1)| 1499 |00:00:01.46 | 50088 | 50057 | 3494K| 1598K| 4145K (0)|
|* 10 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 39235 | 10003 (1)| 39235 |00:00:00.83 | 36385 | 36376 | | | |
|* 11 | HASH JOIN | | 1 | 24899 | 3920 (1)| 5648 |00:00:00.62 | 13703 | 13681 | 1199K| 1199K| 1344K (0)|
|* 12 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 1517 | 12 (0)| 1517 |00:00:00.01 | 13 | 0 | | | |
|* 13 | TABLE ACCESS FULL| DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.50 | 13690 | 13681 | | | |
| 14 | TABLE ACCESS FULL | DM_FOLDER_S | 1 | 431K| 367 (1)| 431K|00:00:00.04 | 1325 | 1312 | | | |
|* 15 | TABLE ACCESS FULL | DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.51 | 13690 | 13681 | | | |
|* 16 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 646K| 10000 (1)| 646K|00:00:02.14 | 36385 | 36376 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- statistics feedback used for this statement

The notes makes it clear that the estimations comes from previous run (statistics feedback) and we see that for most operations E-Rows = A-Rows. With those a new plan has been chosen, with complex view merging: VM_NWWM. You can find clues about those internal view names on Jonathan Lewis blog. Here probably because the estimated number of rows is high, the subquery has been unnested. It is an ‘EXISTS’ subquery, which is transformed to semi join and merged to apply a distinct at the end.

So, we have a different plan, which is supposed to be better because it has been costed with more accurate cardinalities. .The goal of this post is not to detail the reason why the execution time is longer with a ‘better’ plan. If you look at ‘Reads’ column you can see that the first one has read all blocks from buffer cache but second one had to do physical I/O for all. With nothing from buffer cache, reading 101K blocks in multiblock reads may be faster than reading 17679 so the optimizer decision was not bad. I’ll have to estimate if it is expected to have most of the blocks in buffer cache in real production life as behavior in UAT is different. Some people will stop here, say that cardinality feedback is bad, disable it or even set optimizer_cost_adj to get the nested loop, but things are more complex than that.

The important thing is that the optimizer doesn’t stop there and compares the new execution statistics with the previous one.

**************************************************************
kkocfbCompareExecStats : Current
Exec count: 1
CR gets: 102226
CU gets: 3
Disk Reads: 101426
Disk Writes: 0
IO Read Requests: 1633
IO Write Requests: 0
Bytes Read: 830881792
Bytes Written: 0
Bytes Exchanged with Storage: 830881792
Bytes Exchanged with Disk: 830881792
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 4586 (ms)
CPU Time: 1305 (ms)
User I/O Time: 3040 (ms)
**************************************************************
kkocfbCompareExecStats : Best
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
kkocfbCompareExecStats: improvement BG: 0.172935 CPU: 0.039555

The first execution, with ‘bad’ statistics, is still the best one and this new execution has an improvement of 0.17, which means 5 times slower.

Then in the trace we see again that re-optimisation (reparsing) is considered:

Reparsing due to card est...
@=0x6a368338 type=5 nodeid=11 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" act=5648 min=0 est=24899 next=0x6a3682d8
Reparsing due to card est...
@=0x6a3687b0 type=5 nodeid=7 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" act=5622 min=0 est=36463 next=0x6a368750
Reparsing due to card est...
@=0x6a368990 type=5 nodeid=6 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" "SEL$FB0FE72C"@"SEL$4" act=5622 min=0 est=36463 next=0x6a368930
Reparsing due to card est...
@=0x6a368b90 type=5 nodeid=4 monitor=Y halias="" loc="SEL$582FA660" onames="SEL$608EC1F7"@"SEL$582FA660" "SEL$04458B50"@"SEL$582FA660" "SEL$FB0FE72C"@"SEL$33802F1B" "SEL$5"@"SEL$33802F1B" "SEL$07BDC5B4"@"SEL$636B5685" "SEL$FB0FE72C"@"SEL$4" "SEL$F5BB74E1"
@"SEL$4" act=4132 min=0 est=12039 next=0x6a368b30
Reparsing due to card est...
@=0x6a368d60 type=4 nodeid=3 monitor=Y halias="" loc="SEL$582FA660" act=1491 min=0 est=12039 next=0x6a368b90

An additional OPT_ESTIMATE hint is generated for the complext view merging view query block:

atom_hint=(@=0x7fe74e21eb90 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21e7b0 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_FOLDER_R1"@"SEL$1" "D_1F0049A880000016" MIN=1517.000000 ) )
atom_hint=(@=0x7fe74e21e470 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_FOLDER_R1"@"SEL$1" ROWS=1517.000000 ) )
atom_hint=(@=0x7fe74e21e050 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_FILTER "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21dce8 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" INDEX_SCAN "DM_SYSOBJECT_R2"@"SEL$1" "D_1F0049A880000010" MIN=3.000000 ) )
atom_hint=(@=0x7fe74e21da38 err=0 resol=0 used=0 token=1018 org=6 lvl=2 txt=OPT_ESTIMATE (@"SEL$582FA660" QUERY_BLOCK ROWS=1491.000000 ) )
atom_hint=(@=0x7fe74e21d600 err=0 resol=0 used=0 token=1018 org=6 lvl=3 txt=OPT_ESTIMATE (@"SEL$1" TABLE "DM_SYSOBJECT_R2"@"SEL$1" MIN=3.000000 ) )

Whith this new cardinality estimation, the next execution will try to get a better execution, but it doesn’t change the optimizer choice and the new child cursor gets the same execution plan:
KKSMEC: Invalidating old cursor 1 with hash val = 955499861
KKSMEC: Produced New cursor 2 with hash val = 955499861

Child cursor 2 – plan 955499861 again – 4.77 seconds

This the third execution:

Plan hash value: 955499861
------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 30996 (100)| 1 |00:00:04.19 | 102K| 101K| | | |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:04.19 | 102K| 101K| | | |
| 2 | VIEW | VM_NWVW_2 | 1 | 1491 | 30996 (1)| 1491 |00:00:04.19 | 102K| 101K| | | |
| 3 | HASH UNIQUE | | 1 | 1491 | 30996 (1)| 1491 |00:00:04.19 | 102K| 101K| 941K| 941K| 1355K (0)|
|* 4 | HASH JOIN RIGHT SEMI | | 1 | 12039 | 30490 (1)| 4132 |00:00:04.19 | 102K| 101K| 12M| 3867K| 14M (0)|
| 5 | TABLE ACCESS FULL | DM_DOCUMENT_S | 1 | 213K| 210 (1)| 213K|00:00:00.01 | 740 | 0 | | | |
|* 6 | HASH JOIN | | 1 | 36463 | 29665 (1)| 5622 |00:00:04.12 | 101K| 101K| 1405K| 1183K| 2021K (0)|
|* 7 | HASH JOIN | | 1 | 36463 | 18397 (1)| 5622 |00:00:03.39 | 65102 | 65050 | 940K| 940K| 1359K (0)|
|* 8 | HASH JOIN | | 1 | 2222 | 14489 (1)| 1499 |00:00:02.94 | 51412 | 51369 | 992K| 992K| 1331K (0)|
|* 9 | HASH JOIN | | 1 | 2222 | 14120 (1)| 1499 |00:00:01.04 | 50088 | 50057 | 3494K| 1598K| 4145K (0)|
|* 10 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 39235 | 10003 (1)| 39235 |00:00:00.47 | 36385 | 36376 | | | |
|* 11 | HASH JOIN | | 1 | 24899 | 3920 (1)| 5648 |00:00:00.55 | 13703 | 13681 | 1199K| 1199K| 1344K (0)|
|* 12 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 1517 | 12 (0)| 1517 |00:00:00.01 | 13 | 0 | | | |
|* 13 | TABLE ACCESS FULL| DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.43 | 13690 | 13681 | | | |
| 14 | TABLE ACCESS FULL | DM_FOLDER_S | 1 | 431K| 367 (1)| 431K|00:00:01.82 | 1324 | 1312 | | | |
|* 15 | TABLE ACCESS FULL | DM_SYSOBJECT_R | 1 | 646K| 3906 (1)| 646K|00:00:00.33 | 13690 | 13681 | | | |
|* 16 | TABLE ACCESS FULL | DM_SYSOBJECT_S | 1 | 646K| 10000 (1)| 646K|00:00:00.60 | 36385 | 36376 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- statistics feedback used for this statement

Same plan and same execution time here. Tables are large and SGA is small here.

*********** Begin Dump Context: best estimates ***********
 
**************************************************************
kkocfbCompareExecStats : Current
Exec count: 1
CR gets: 102224
CU gets: 3
Disk Reads: 101426
Disk Writes: 0
IO Read Requests: 1633
IO Write Requests: 0
Bytes Read: 830881792
Bytes Written: 0
Bytes Exchanged with Storage: 830881792
Bytes Exchanged with Disk: 830881792
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 4206 (ms)
CPU Time: 1279 (ms)
User I/O Time: 3084 (ms)
**************************************************************
kkocfbCompareExecStats : Best
Exec count: 1
CR gets: 17679
CU gets: 0
Disk Reads: 0
Disk Writes: 0
IO Read Requests: 0
IO Write Requests: 0
Bytes Read: 0
Bytes Written: 0
Bytes Exchanged with Storage: 0
Bytes Exchanged with Disk: 0
Bytes Simulated Read: 0
Bytes Simulated Returned: 0
Elapsed Time: 51 (ms)
CPU Time: 51 (ms)
User I/O Time: 15 (us)
kkocfbCompareExecStats: improvement BG: 0.172939 CPU: 0.040363

So where we are here? We had an execution which was based on bad estimations. Then two tries on good estimations, but because of different buffer cache behavior they are finally 5 times slower. Nothing else to try.

The good thing is that the optimizer admits it cannot do better and falls back to the best execution time, now considered as the best estimate:

kkocfbCheckCardEst: reparse using best estimates
...
kkocfbCopyCardCtx: No best stats found

We see no OPT_ESTIMATE hints here, reason why there was noting in V$SQL_REOPTIMIZATION_HINTS for cursor 2, but this cursor is still marked as re-optimizable and next execution invalidates it:

KKSMEC: Invalidating old cursor 2 with hash val = 955499861
KKSMEC: Produced New cursor 3 with hash val = 1894156093

We see that we are back to the original plan, which is expected because the static statistics have not changed, and there are no statistics feedback this time.

Child cursor 3 – back to plan 1894156093 – 0.5 seconds

This is the plan that si used for all subsequent executions now.

Plan hash value: 1894156093
----------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 171 (100)| 1 |00:00:00.04 | 17677 |
| 1 | SORT AGGREGATE | | 1 | 1 | | 1 |00:00:00.04 | 17677 |
| 2 | NESTED LOOPS | | 1 | 1 | 116 (0)| 1491 |00:00:00.04 | 17677 |
| 3 | NESTED LOOPS | | 1 | 1 | 115 (0)| 1491 |00:00:00.04 | 17454 |
| 4 | NESTED LOOPS | | 1 | 49 | 17 (0)| 5648 |00:00:00.01 | 536 |
|* 5 | INDEX RANGE SCAN | D_1F0049A880000016 | 1 | 3 | 3 (0)| 1517 |00:00:00.01 | 13 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED | DM_SYSOBJECT_R | 1517 | 16 | 10 (0)| 5648 |00:00:00.01 | 523 |
|* 7 | INDEX RANGE SCAN | D_1F0049A880000010 | 1517 | 71 | 2 (0)| 5648 |00:00:00.01 | 249 |
|* 8 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 5648 | 1 | 2 (0)| 1491 |00:00:00.03 | 16918 |
|* 9 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 5648 | 1 | 1 (0)| 1491 |00:00:00.03 | 15427 |
| 10 | NESTED LOOPS SEMI | | 5648 | 2 | 25 (0)| 1491 |00:00:00.02 | 14827 |
| 11 | NESTED LOOPS | | 5648 | 7 | 18 (0)| 2981 |00:00:00.02 | 12868 |
| 12 | TABLE ACCESS BY INDEX ROWID BATCHED| DM_SYSOBJECT_R | 5648 | 71 | 4 (0)| 2981 |00:00:00.01 | 7747 |
|* 13 | INDEX RANGE SCAN | D_1F0049A880000010 | 5648 | 16 | 3 (0)| 2981 |00:00:00.01 | 6145 |
|* 14 | TABLE ACCESS BY INDEX ROWID | DM_SYSOBJECT_S | 2981 | 1 | 2 (0)| 2981 |00:00:00.01 | 5121 |
|* 15 | INDEX UNIQUE SCAN | D_1F0049A880000108 | 2981 | 1 | 1 (0)| 2981 |00:00:00.01 | 2140 |
|* 16 | INDEX UNIQUE SCAN | D_1F0049A880000145 | 2981 | 52759 | 1 (0)| 1491 |00:00:00.01 | 1959 |
|* 17 | INDEX UNIQUE SCAN | D_1F0049A880000142 | 1491 | 1 | 1 (0)| 1491 |00:00:00.01 | 223 |
----------------------------------------------------------------------------------------------------------------------------------------

After a few tries to get a better plan, the optimizer finally switched back to the first one because it was the best in term of response time (I don’t know exactly which execution statistics are used for this decision, elapsed time is just my guess here).

The interesting point here is to understand that you can see a reoptimized cursor without statistics feedback:

  • No rows for the previous cursor in V$SQL_REOPTIMIZATION_HINTS
  • No ‘statistics feedback’ not in the new cursor plan
  • Difference between E-Rows and A-Rows in the new plan

So what?

SQL optimization is a complex task and there is nothing like an execution is ‘fast’ or ‘slow’, an execution plan is ‘good’ or ‘bad’, an optimizer decision is ‘right’ or ‘wrong’. What is fast after several similar executions can be slow on a busy system because less blocks remains in cache. What is slow at a time where the storage is busy may be fast at another time of the day. What is fast with one single user may raise more concurrency contention on a busy system. Cardinality feedback is a reactive attempt to improve an execution plan. On average, things go better with it, but it is not abnormal that few cases can go wrong for a few executions. You can’t blame the optimizer for that, and fast conclusions or optimizer parameter tweaking are not sustainable solutions. And don’t forget that if your data model is well designed, then the critical queries should have one clear optimal access path which will not depend on a small difference in estimated number of rows.

The only thing I can always conclude when I see cardinality feedback going wrong is that there is something to fix in the design of data model, the statistics gathering and/or the query design. When statistics feedback gives a worse execution plan, it is the consequence of the combination of:

  • mis-estimation of cardinalities: bad, insufficient, or stale statistics
  • mis-estimation of response time: bad system statistics, untypical memory sizing, unrepresentative execution context
  • no clear optimal access path: sub-optimal indexing, lack of partitioning,…

It is a good thing to have the auto-reoptimization coming back to the initial plan when nothing better has been observed. I would love to see more control about it. For example, a hint that sets a threshold of execution time where the optimizer should not try to find better. I filled this idea in https://community.oracle.com/ideas/17514 and you can vote for it.

Update 2-APR-2017

I was not clear in this post, but this is the first time I observed this behavior (multiple reoptimization and then back to original plan), so I’m not sure about the reasons and the different conditions required. This was on 12.1.0.2 with JAN17 PSU and the two Adaptive Statistics backport from 12cR2, adaptive plans set to false and no bind variables.

 

Cet article When automatic reoptimization plan is less efficient est apparu en premier sur Blog dbi services.