When doing updates with statements that use correlated subqueries, it’s important to make sure that your brackets (parentheses) are in the right place. Here are two statements that look fairly similar but have remarkably different results – and sometimes people don’t realise how different the statements are:
update t1 set padding = ( select t2.padding from t2 where t2.id = t1.id ) where t1.small_vc <= lpad(20,10,'0') ; update t1 set padding = ( select t2.padding from t2 where t2.id = t1.id and t1.small_vc <= lpad(20,10,'0') ) ;
The previous post reminded me of another (fairly special) case where the order of operations in an execution plan seems to be wrong according to the “traditional” strategy for reading execution plans. Here’s a simple select statement with its execution plan to demonstrate the point:
I’ve copied the following question up from a recent comment because it’s a useful generic type of question, and I think I can answer it in the few minutes I have to spare.
Hi Jonathan. I have a query plan where I cannot explain how time adds up. I did the ALTER SESSION trick but it changed nothing. I ran this, and got the following plan. I have two questions (I appoligize for not being able to format this code and plan but I saw not formatting buttons on the insert box).
1) @ step #8 in the plan, the query jumps to 3 and 1/2 minutes. This step says VIEW but gives no indication of what it did that actually took 3 and 1/2 minutes. Can you explain or give me some idea how to find out what is being done on this line that takes that long. Especially with so few rows.
One of my recent assignments involved a company that had run into some performance problems after upgrading from 10.2.0.3 to 11.2.0.2. We had spent half an hour on the phone discussing the system before I had arrived, and I’d made a couple of suggestions that had solved most of their problems before I got on site – but they still wanted me to come in and give them some specific ideas about why the critical part of the solution had helped.
The most critical piece of advice I had given them (after listening very carefully to their description of the system) was to get rid of ALL the histograms they had on their system, and then watch very carefully for any signs that they might need to re-introduce a handful of histograms over the next few weeks.
I see that Christian Antognini posted a note about an interesting little defect in Enterprise Manager a little while ago - it doesn’t always know how to interpret execution plans. The problem appears in Christians’ example when a filter subquery predicate is applied during an index range scan – it’s a topic I wrote about a few months ago with the title “filter bug” because the plan shows (or, rather, fails to show) a “missing” filter operation, which has been subsumed into the predicate section of the thing that would otherwise have been the first child of the filter operation – the rule of recursive descent through the plan breaks, and the ordering that OEM gives for the operations goes wrong.
Here’s a quick tutorial in hinting, promped by a question on the OTN database forum.
The OP has a hash semi-join and Oracle appears to be ignoring a hint to use a nested loop:
> select /*+ NO_USE_HASH(C2) USE_NL(C2) */ > SC.SID, SC.MID, SC.INDATE, SC.EXDATE, SC.AUDATE > FROM SSCLASS SC > WHERE SC.SID = 0 > AND SC.CID = 0 > AND SC.MID = 1 > AND SC.INDATE <= SC.EXDATE > AND EXISTS ( > SELECT SSCID FROM SSCLASS C2 > WHERE C2.SSCID = SC.SSCID > AND C2.AUDATE >= to_date('2009-01-01','yyyy-MM-dd') > ) > ORDER BY > SSCID, INDATE, EXDATE > > PLAN_TABLE_OUTPUT > Plan hash value: 1476588646 > > ------------------------------------------------------------------------------------------------------ > | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | > ------------------------------------------------------------------------------------------------------ > | 0 | SELECT STATEMENT | | 204K| 10M| | 35799 (1)| 00:07:10 | > | 1 | SORT ORDER BY | | 204K| 10M| 25M| 35799 (1)| 00:07:10 | > |* 2 | HASH JOIN SEMI | | 204K| 10M| 10M| 33077 (1)| 00:06:37 | > |* 3 | TABLE ACCESS BY INDEX ROWID| SSCLASS | 204K| 7983K| | 9110 (1)| 00:01:50 | > |* 4 | INDEX RANGE SCAN | X5_SSCLASS | 204K| | | 582 (1)| 00:00:07 | > |* 5 | INDEX RANGE SCAN | X6_SSCLASS | 4955K| 66M| | 17276 (1)| 00:03:28 | > ------------------------------------------------------------------------------------------------------
I’m not going to argue about what plans might be good or bad, and I’m going to assume the OP simply wants a nested loop semi join using a “good” index into the table aliased as C2; so I’m just going to demonstrate on this simple example how to approach that specific problem. The critical error the OP has made is that the join he’s trying to affect doesn’t exist in the query block where he’s put his hint – so he needs to find out what query will exist after the subquery has been nested and the optimizer is looking at the semi-join.
Here’s initial query, with default execution plan, I’ll point out that there is an index on the n1 column that I’m using in the existence test:
select * from t2 where t2.n2 = 15 and exists ( select null from t1 where t1.n1 = t2.n1 ) ; ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 15 | 2865 | 26 (4)| 00:00:01 | |* 1 | HASH JOIN SEMI | | 15 | 2865 | 26 (4)| 00:00:01 | |* 2 | TABLE ACCESS FULL | T2 | 15 | 2805 | 22 (0)| 00:00:01 | | 3 | INDEX FAST FULL SCAN| T1_I1 | 3000 | 12000 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T1"."N1"="T2"."N1") 2 - filter("T2"."N2"=15)
So I’ve emulated the hash semi-join into the second table that the OP wants to get rid of, and I’m not using the target index in a “precision” fashion.
I happen to know that there is a hint that I can use to make the subquery operate as a nested loop semijoin. It’s /*+ nl_sj */ and it has to go in the subquery. Unfortunately it’s a hint that’s deprecated in 10g, but never mind that for the moment. I’m also going to adopt “sensible practice” and give each of my query blocks a name. Let’s see what we get from dbms_xplan with the hint.
explain plan set statement_id = 'sj_hinted' for select /*+ qb_name(main) */ * from t2 where t2.n2 = 15 and exists ( select /*+ qb_name(subq) nl_sj */ null from t1 where t1.n1 = t2.n1 ) ; select * from table(dbms_xplan.display(null,'sj_hinted','outline')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ Plan hash value: 635111780 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 15 | 2865 | 37 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 15 | 2865 | 37 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| T2 | 15 | 2805 | 22 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | T1_I1 | 3000 | 12000 | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA USE_NL(@"SEL$A93AFAED" "T1"@"SUBQ") LEADING(@"SEL$A93AFAED" "T2"@"MAIN" "T1"@"SUBQ") INDEX(@"SEL$A93AFAED" "T1"@"SUBQ" ("T1"."N1")) FULL(@"SEL$A93AFAED" "T2"@"MAIN") OUTLINE(@"SUBQ") OUTLINE(@"MAIN") UNNEST(@"SUBQ") OUTLINE_LEAF(@"SEL$A93AFAED") ALL_ROWS OPTIMIZER_FEATURES_ENABLE('10.2.0.3') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("T2"."N2"=15) 3 - access("T1"."N1"="T2"."N1")
Note how I’ve used a statement_id to label my plan, and I’ve added the extra predicate ‘outline’ to the call to dbms_xplan. The outline shows me the complete set of hints I need to reproduce the execution plan; technically it’s the information that would be stored by Oracle as an outline or an SQL Baseline.
There are a few session-level parameter settings I don’t really need included, and a couple of things which can’t qualify as “legal” SQL hints, though, and I’m going to ignore those. (Don’t you love the “ignore the hints” hint, though!)
So let’s take the minimum set of hints back into the SQL:
explain plan set statement_id = 'full_hints' for select /*+ qb_name(main) unnest(@subq) leading(@sel$a93afaed t2@main t1@subq) use_nl(@sel$a93afaed t1@subq) full(@sel$a93afaed t2@main) index(@sel$a93afaed t1@subq(t1.n1)) */ * from t2 where t2.n2 = 15 and exists ( select /*+ qb_name(subq) */ null from t1 where t1.n1 = t2.n1 ) ; select * from table(dbms_xplan.display(null,'full_hints','outline')); ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 15 | 2865 | 37 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 15 | 2865 | 37 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL| T2 | 15 | 2805 | 22 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | T1_I1 | 3000 | 12000 | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------- Outline Data ------------- /*+ BEGIN_OUTLINE_DATA USE_NL(@"SEL$A93AFAED" "T1"@"SUBQ") LEADING(@"SEL$A93AFAED" "T2"@"MAIN" "T1"@"SUBQ") INDEX(@"SEL$A93AFAED" "T1"@"SUBQ" ("T1"."N1")) FULL(@"SEL$A93AFAED" "T2"@"MAIN") OUTLINE(@"SUBQ") OUTLINE(@"MAIN") UNNEST(@"SUBQ") OUTLINE_LEAF(@"SEL$A93AFAED") ALL_ROWS OPTIMIZER_FEATURES_ENABLE('10.2.0.3') IGNORE_OPTIM_EMBEDDED_HINTS END_OUTLINE_DATA */ Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("T2"."N2"=15) 3 - access("T1"."N1"="T2"."N1")
Job done – we used a bit of hackery to get the plan we wanted, then used the legal hints to reproduce the plan.
It is important to name your query blocks as this helps you to identify what transformations apply when, and how to label your tables correctly in your code; and you have to remember that the “strange” query block names that appear (such as @”SEL$A93AFAED”) are dependent on the query block names you originally supplied.
The method isn’t perfect since (a) sometimes hints that are needed don’t get into the outline, and (b) sometimes the outline actually doesn’t reproduce the plan if all you use are the “legal” hints – but it may help you in most cases.
After publishing my note about emulating star transformations when there was something blocking the optimizer’s ability to do them automatically I ended up having an interesting email conversation with Mark Farnham about variations on the theme, concluding with a completely different (and possibly more intuitive) SQL statement to achieve the same aim.
You will recall that I started off with the following statement:
select ord.* from products prd, customers cst, orders ord where prd.grp = 50 and cst.grp = 50 and ord.id_prd = prd.id and ord.id_cst = cst.id ;
Although this specific example is so simple that the optimizer could turn it into a star transformation automatically, I used it to demonstrate a “minimum work” version of the statement as follows (with the execution plan from 11.1.0.6):
select ord.* from ( select /*+ leading(prd ord) use_nl(ord) no_merge */ ord.rowid prid from products prd, orders ord where prd.grp = 50 and ord.id_prd = prd.id ) prid, ( select /*+ leading(cst ord) use_nl(ord) no_merge */ ord.rowid crid from customers cst, orders ord where cst.grp = 50 and ord.id_cst = cst.id ) crid, orders ord where prid.prid = crid.crid and ord.rowid = prid.prid ; -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 1650 | 570 | | 1 | NESTED LOOPS | | 11 | 1650 | 570 | |* 2 | HASH JOIN | | 11 | 264 | 559 | | 3 | VIEW | | 3361 | 40332 | 277 | | 4 | NESTED LOOPS | | 3361 | 87386 | 277 | |* 5 | TABLE ACCESS FULL | CUSTOMERS | 98 | 882 | 81 | |* 6 | INDEX RANGE SCAN | ORD_CST_FK | 34 | 578 | 2 | | 7 | VIEW | | 3390 | 40680 | 281 | | 8 | NESTED LOOPS | | 3390 | 88140 | 281 | |* 9 | TABLE ACCESS FULL | PRODUCTS | 100 | 900 | 81 | |* 10 | INDEX RANGE SCAN | ORD_PRD_FK | 34 | 578 | 2 | | 11 | TABLE ACCESS BY USER ROWID| ORDERS | 1 | 126 | 1 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("PRID"."PRID"="CRID"."CRID") 5 - filter("CST"."GRP"=50) 6 - access("ORD"."ID_CST"="CST"."ID") 9 - filter("PRD"."GRP"=50) 10 - access("ORD"."ID_PRD"="PRD"."ID")
Go back to the original SQL statement, though. The select list contains columns from just the orders table, and there’s a fairly well-known suggestion (or possibly guideline) that “tables that are not in the select list should not appear in the from clause, they should appear in subqueries in the where clause”. It’s not difficult to come up with examples where this strategy is a bad idea – but it’s often worth thinking through the consequences of trying to apply it. (Of course, you often find that after you’ve rewritten your SQL to match the strategy the optimizer transforms it back into the join that you had been avoiding.)
Since the products and customers tables don’t appear in the select list, can I find a way of rewriting my statement with subqueries ? The answer is yes. Here’s the SQL, with the execution plan I got.
select ord.* from orders ord where ord.rowid in ( select /*+ no_use_hash_aggregation */ prid.prid from ( select /*+ no_merge */ ord.rowid prid from products prd, orders ord where prd.grp = 50 and ord.id_prd = prd.id ) prid, ( select /*+ no_merge */ ord.rowid crid from customers cst, orders ord where cst.grp = 50 and ord.id_cst = cst.id ) crid where prid.prid = crid.crid ) ; ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 138 | 591 (1)| 00:00:08 | | 1 | NESTED LOOPS | | 1 | 138 | 591 (1)| 00:00:08 | | 2 | VIEW | VW_NSO_1 | 11 | 132 | 589 (1)| 00:00:08 | | 3 | SORT UNIQUE | | 1 | 264 | | | |* 4 | HASH JOIN | | 11 | 264 | 589 (1)| 00:00:08 | | 5 | VIEW | | 3314 | 39768 | 294 (1)| 00:00:04 | | 6 | NESTED LOOPS | | 3314 | 86164 | 294 (1)| 00:00:04 | |* 7 | TABLE ACCESS FULL | PRODUCTS | 100 | 900 | 94 (2)| 00:00:02 | |* 8 | INDEX RANGE SCAN | ORD_PRD_FK | 33 | 561 | 2 (0)| 00:00:01 | | 9 | VIEW | | 3314 | 39768 | 294 (1)| 00:00:04 | | 10 | NESTED LOOPS | | 3314 | 86164 | 294 (1)| 00:00:04 | |* 11 | TABLE ACCESS FULL | CUSTOMERS | 100 | 900 | 94 (2)| 00:00:02 | |* 12 | INDEX RANGE SCAN | ORD_CST_FK | 33 | 561 | 2 (0)| 00:00:01 | | 13 | TABLE ACCESS BY USER ROWID| ORDERS | 1 | 126 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("PRID"."PRID"="CRID"."CRID") 7 - filter("PRD"."GRP"=50) 8 - access("ORD"."ID_PRD"="PRD"."ID") 11 - filter("CST"."GRP"=50) 12 - access("ORD"."ID_CST"="CST"."ID")
You’ll notice that this plan is remarkably similar to the plan I got from the joins with inline views – although the new plan has one extra “sort unique” operation at line 3. The optimizer has transformed my query by unnesting the subqueries and turning them into inline views – adding in a “distinct” operation because that’s what happens when you turn an “IN subquery” into a join when there isn’t a suitable uniqueness constraint on the join column.
There are two reasons for adopting this subquery approach. There’s the (highly subjective) argument that the code is a little easier to understand in this form, especially if you then want to join the orders table onwards to other tables. There’s also an objective argument relating to the “sort unique”. Note that I included the hint “no_use_hash_aggregation” to stop Oracle from using a “hash unique” operation at this point. By forcing Oracle to sort for uniqueness I know that the rowids will appear in (at least) an “extent-based” order rather than a completely random order. It’s possible that walking the fact table in physical order will be a little more efficient than accessing it in a completely randomised order. The former might find multiple rows in a block or benefit from some form of O/S or SAN read-ahead; the latter is more likely to turn one row into one random physical read.
One final thought – I said that you might want to take this type of approach for queries where the optimizer can’t do a star transformation automatically. There’s one very important case where this is always true – when you’re running Standard Edition, which doesn’t support bitmap indexes. So if you’re running SE and want to do star transformations – you now know how.
Footnote: I haven’t spent much time testing the limits of variations of this code although I did spend 30 minutes extending the example to use three dimension tables. If you think that this strategy might solve a production problem, think about using the sample code to create and test a better model of your production requirement before rewriting the production code – and if you go to production, check very carefully that the resulting code does produce the same results.
While googling I found a fresh article from Oracle for the VLDB journal: Enhanced Subquery Optimization in Oracle. It primarily discusses subqueries – how Oracle deals with them on optimization and at run-time. Paper’s topics overview: subquery coalescing – here I’ve done some testing of the feature partially available in 11gR2 (parallel) group-by pushdown – [...]
Recent comments
3 years 5 weeks ago
3 years 17 weeks ago
3 years 21 weeks ago
3 years 22 weeks ago
3 years 27 weeks ago
3 years 48 weeks ago
4 years 16 weeks ago
4 years 46 weeks ago
5 years 30 weeks ago
5 years 31 weeks ago