This is possibly my longest title to date – I try to keep them short enough to fit the right hand column of the blog without wrapping – but I couldn’t think of a good way to shorten it (Personally I prefer to use the expression CTE – common table expression – over “factored subquery” or “subquery factoring” or “with subquery”, and that would have achieved my goal, but might not have meant anything to most people.)
If you haven’t come across them before, recursive CTEs appeared in 11.2, are in the ANSI standard, and are (probably) viewed by Oracle as the strategic replacement for “connect by” queries. Here’s a simple (and silly) example:
From time to time I’ve posted a reminder that subquery factoring (“with subquery”) can give you changes in execution plans even if the subquery that you’ve taken out of line is written back inline by Oracle rather than being materialized. This can still happen in 12c – here’s a sample query in the two forms with the result sets and execution plans. First, the “factored” version:
Here’s a little script to demonstrate an interesting observation that appeared in my email this morning (that’s morning Denver time):
I was involved in a thread on Oracle-L recently started with the question: “How many LIOs is too many LIOs”. Rather than rewrite the whole story, I’ve supplied a list of links to the contributions I made, in order – the final “answer” is actually the answer to a different question – but travels an interesting path to get there.#
Here’s an example I saw a few months ago of the confusion caused by NULL. As the owner of the problem put it: the first query, run from SQL*Plus for testing purposes, takes no time to complete; but when “put into a pl/sql cursor” (as shown in the second query) it takes ages to complete.
What’s going on ?
select id , description , inventory from the_table where category = nvl(null, category); -- open c_results for select id , description , inventory from the_table where category = nvl(p_user_category, category); fetch c_results into v_id, v_description, v_inventory; close c_results;
After pondering the problem for a while the OP managed to get the required level performance by changing the predicate to this:
How not to write subqueries:
AND sal.ticket_airline || sal.ticket_number NOT IN ( SELECT sub.dsd_airline || sub.dsd_ticket_number FROM ... )
If there had been any opportunity for the optimizer to do something clever with the NOT IN, you’ve just made sure it can’t happen. On top of that you may find that you don’t get the right results – consider the following cut-n-paste:
SQL> select user from dual where 1 || 23 = 12 || 3; USER ------------------------------ TEST_USER 1 row selected.
Sometimes people simply forget that you can have multiple columns in subqueries (or in “IN Lists”) – so it’s perfectly valid to write the subquery as:
I’ve written notes about the different joins in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.
There are a couple of posts on the blog describing problems with updateable join views or, to be more precise, join views which were key-preserved but which the optimizer did not recognize as key-preserved. Both scenarios are addressed in 12c:
I’ve written a few notes about anomalies in subquery factoring (with subquery) in the past, principally making a fuss about the fact that moving an inline view into a “with subquery” can cause a plan to change even when the internal code moves the subquery back in line. With the arrival of 12c one of my first sets of tests was to rerun all the examples to see how many of them had been addressed. I hadn’t written about as many examples as I had thought, and some of them had been fixed before 12c, but here are few references to a couple of outstanding items that I thought worth a mention:
There have been a couple of nice posts about the “Top N” (or First N / Next N) syntax that has appeared in 12c, here and here, for example. I particularly like the first set of examples because they include some execution plans that give you a good idea of what’s going on under the covers. “Under the covers” is important, because if you don’t actually have a large data set to test on you might not realise what impact a “Top N” query might have on a production data set.