I’ve just responded to the call for items for the “IOUG Quick Tips” booklet for 2015 – so it’s probably about time to post the quick tip that I put into the 2014 issue. It’s probably nothing new to most readers of the blog, but sometimes an old thing presented in a new way offers fresh insights or better comprehension.
A histogram, created in the right way, at the right time, and supported by the correct client-side code, can be a huge benefit to the optimizer; but if you don’t create and use them wisely they can easily become a source of inconsistent performance, and the automatic statistics gathering can introduce an undesirable overhead during the overnight batch. This note explains how you can create histograms very cheaply on the few columns where they are most likely to have a beneficial effect.
Here’s a simple data set – I’m only interested in three of the columns in the work that follows, but it’s a data set that I use for a number of different models:
Here’s a thought that came to me while I was writing up a note about identifying redundant indexes a few minutes ago. Sometimes you end up supporting applications with unexpected duplication of data and indexes and need to find ways to reduce overheads. Here’s some code modelling a scenario that I’ve seen more often than I like (actually, just once would be more often than I’d like):
It’s interesting to watch the CBO evolving and see how an enhancement in one piece of code doesn’t necessarily echo through to all the other places it seems to fit. Here’s an example of an enhancement that spoiled (or, rather, made slightly more complicated) a little demonstration I had been running for about the last 15 years – but (in a fashion akin to another partitioning limitation) doesn’t always work in exactly the way you might expect.
What prompted me to write my previous note about subquerying was an upgrade to 12c, and a check that a few critical queries would not do something nasty on the upgrade. As ever it’s always interesting how many little oddities you can discover while looking closely as some little detail of how the optimizer works. Here’s an oddity that came up in the course of my
playing around investigation in 184.108.40.206 – first some sample data:
Several years go (eight to be precise) I wrote a note suggesting that Oracle will not materialize a factored subquery unless it is used at least twice in the main query. I based this conclusion on a logical argument about the cost of creating and using a factored subquery and, at the time, I left it at that. A couple of years ago I came across an example where even with two uses of a factored subquery Oracle still didn’t materialize even though the cost of doing so would reduce the cost of the query – but I never got around to writing up the example, so here it is:
No, not really – but sometimes the optimizer gets better and gives you worse performance as a side effect when you upgrade. Here’s an example where 220.127.116.11 recognised (with a few hints) the case for a nested loop semi-join and 12c went a bit further and recognised the opportunity for doing a cunning “semi_to_inner” transformation … which just happened to do more work than the 11g plan.
Here’s a data set to get things going, I’ve got “parent” and “child” tables, but in this particular demonstration I won’t be invoking referential integrity:
I’ve written about optimizer defects with descending indexes before now but a problem came up on the OTN database forum a few days ago that made me decide to look very closely at an example where the arithmetic was clearly defective. The problem revolves around a table with two indexes, one on a date column (TH_UPDATE_TIMESTAMP) and the other a compound index which starts with the same column in descending order (TH_UPDATE_TIMESTAMP DESC, TH_TXN_CODE). The optimizer was picking the “descending” index in cases where it was clearly the wrong index (even after the statistics had been refreshed and all the usual errors relating to date-based indexes had been discounted). Here’s an execution plan from the system which shows that there’s something wrong with the optimizer:
Here’s a little surprise that came up on the OTN database forum a few days ago. Rather than describe it, I’m just going to create a data set to demonstrate it, initially using 18.104.22.168 although the same thing happens on 22.214.171.124. The target is a query that joins to a range/hash composite partitioned table and uses a Bloom filter to do partition pruning at the subpartition level. (Note to self: is it possible to see Bloom filters that operate at both the partition and subpartition level from a single join? I suspect not.). Here’s my code to create and populate both the partitioned table and a driving table for the query:
Update 9th July 2015: part 4 now published.
I’ve changed the catalogue from a post to a page so that it gets a static address: https://jonathanlewis.wordpress.com/cbo-series/
I’ll leave this posting here for a while, but will probably remember to remove it some time in the future.