Towards the end of last year I used a query with a couple of “constant” subqueries as a focal point for a blog note on reading parallel execution plans. One of the comments on that note raised a question about cardinality estimates and, coincidentally, I received an email about the cost calculations for a similar query a few days later.
Unfortunately there are all sorts of anomalies, special cases, and changes that show up across versions when subqueries come into play – it’s only in recent versions of 11.2, for example, that a very simple example I’ve got of three equivalent statements that produce the same execution plan report the same costs and cardinality. (The queries are: table with IN subquery, table with EXISTS subquery, table joined to “manually unnested” subquery – the three plans take the unnested subquery shape.)
The Oracle database has all sorts of little details built into it to help it deal with multi-national companies, but since they’re not commonly used you can find all sorts of odd “buggy” bits of behaviour when you start to look closely. I have to put “buggy” in quotes because some of the reported oddities are the inevitable consequences of (for example) how multi-byte character sets have to work; but some of the oddities look as if they simply wouldn’t be there if the programmer writing the relevant bit of code had remembered that they also had to cater for some NLS feature.
I think column groups can be amazingly useful in helping the optimizer to generate good execution plans because of the way they supply better details about cardinality; unfortunately we’ve already seen a few cases (don’t forget to check the updates and comments) where the feature is disabled, and another example of this appeared on OTN very recently.
Modifying the example from OTN to make a more convincing demonstration of the issue, here’s some SQL to prepare a demonstration:
Here’s a live one from OTN – here are a couple of extracts from the problem statement:
We’re experiencing an issue where it seems that the query plan changes from day to day for a particular procedure that runs once a night.
It’s resulting in a performance variance of 10 second completion time vs 20 minutes (nothing in between).
It started occurring about 2 months ago and now it’s becoming more prevalent where the bad query plan is coming up more often.
I noticed that the query plans vary for a simple query.
We do run gather statistics every night. (DBMS_STATS.GATHER_SCHEMA_STATS (ownname=>sys_context( ‘userenv’, ‘current_schema’ ), estimate_percent => 1);)
The query and two execution plans look like this:
I’ve just responded to the call for items for the “IOUG Quick Tips” booklet for 2015 – so it’s probably about time to post the quick tip that I put into the 2014 issue. It’s probably nothing new to most readers of the blog, but sometimes an old thing presented in a new way offers fresh insights or better comprehension.
A histogram, created in the right way, at the right time, and supported by the correct client-side code, can be a huge benefit to the optimizer; but if you don’t create and use them wisely they can easily become a source of inconsistent performance, and the automatic statistics gathering can introduce an undesirable overhead during the overnight batch. This note explains how you can create histograms very cheaply on the few columns where they are most likely to have a beneficial effect.
Here’s a simple data set – I’m only interested in three of the columns in the work that follows, but it’s a data set that I use for a number of different models:
Here’s a thought that came to me while I was writing up a note about identifying redundant indexes a few minutes ago. Sometimes you end up supporting applications with unexpected duplication of data and indexes and need to find ways to reduce overheads. Here’s some code modelling a scenario that I’ve seen more often than I like (actually, just once would be more often than I’d like):
The idea for this blog post started a few weeks ago when i had to troubleshoot some Oracle database / SQL performance issues at client site. The SQL itself included several views and so placing hints (for testing purpose) into the views was not possible, especially as the views were used widely and not only by the SQL with the performance issue. In consequence this blog post is about the difference between embedded global and local hints and how to use them.
It’s interesting to watch the CBO evolving and see how an enhancement in one piece of code doesn’t necessarily echo through to all the other places it seems to fit. Here’s an example of an enhancement that spoiled (or, rather, made slightly more complicated) a little demonstration I had been running for about the last 15 years – but (in a fashion akin to another partitioning limitation) doesn’t always work in exactly the way you might expect.
What prompted me to write my previous note about subquerying was an upgrade to 12c, and a check that a few critical queries would not do something nasty on the upgrade. As ever it’s always interesting how many little oddities you can discover while looking closely as some little detail of how the optimizer works. Here’s an oddity that came up in the course of my
playing around investigation in 220.127.116.11 – first some sample data: