Search

Top 60 Oracle Blogs

Recent comments

CBO Oddities – 1

I’ve decided to do a little rewriting and collating so that I can catalogue related ideas in an order that makes for a better narrative. So this is the first in a series of notes designed to help you understand why the optimizer has made a particular choice and why that choice is (from your perspective) a bad one, and what you can do either to help the optimizer find a better plan, or subvert the optimizer and force a better plan.

If you’re wondering why I choose to differentiate between “help the optimizer” and “subvert the optimizer” consider the following examples.

  • A query is joining two tables in the wrong order with a hash join when you know that a nested loop join in the opposite order would far better because you know that the data you want is very nicely clustered and there’s a really good index that would make access to that data very efficient. You check the table preferences and discover that the table_cached_blocks preference (see end notes) is at its default value of 1, so you set it to 16 and gather fresh stats on the indexes on the table. Oracle now recognises the effectiveness of this index and changes plan accordingly.
  • The optimizer has done a surprising transformation of a query, aggregating a table before joining to a couple of other tables when you were expecting it to use the joins to eliminate a huge fraction of the data before aggregating it.  After a little investigation you find that setting hidden parameter _optimizer_distinct_placement to false stops this happening.

You may find the distinction unnecessarily fussy, but I’d call the first example “helping the optimzier” – it gives the optimizer some truthful information about your data that is potentially going to result in better decisions in many different statements – and the second example “subverting the optimizer” – you’ve brute-forced it into not taking a path you didn’t like but at the same time you may have stopped that feature from appearing in other ways or in other queries. Of course, you might have minimised the impact of setting the parameter by using the opt_param() hint to apply the restriction to just this one query, nevertheless it’s possible that there is a better plan for the query that would have used the feature at some other point in the query if you’d managed to do something to help the optimizer rather than constraining it.

What’s up with the Optimizer

It’s likely that most of the articles will be based around interpreting execution plans since those are the things that tell us what the optimizer thinks will happen when it executes a statement, and within execution plans there are three critical aspects to consider –

  1. the numbers (most particularly Cost and Rows),
  2. the shape of the plan,
  3. the Predicate Information.

I want to use this note to make a couple of points about just the first of the three.

  • First – the estimates on any one line of an execution plan are “per start” of the line; some lines of an execution plan will be called many times in the course of a statement. In many cases the Rows estimate from one line of a plan will dictate the number of times that some other line of the plan will be executed – so a bad estimate of “how much data” can double up as a bad estimate of “how many times”, leading to a plan that looks efficient on paper but does far too much work at run-time. A line in a plan that looks a little inefficient may be fine if it executes only one, a line that looks very efficient may be a disaster if it executes a million time. Being able to read a plan and spot the places where the optimizer has produced a poor estimate of Rows is a critical skill – and there are many reasons why the optimizer produces poor estimates. Being able to spot poor estimates depends fairly heavily on knowing the data, but if you know the generic reasons for the optimizer producing poor estimates you’ve got a head start for recognising and addressing the errors when they appear.
  • Second – Cost is synonymous with Time. For a given instance at a given moment there is a simple, linear, relationship between the figure that the optimizer reports for the Cost of a statement (or subsection of a statement) and the Time that the optimizer reports. For many systems (those that have not run the calibrate_io procedure) the Time is simply the Cost multiplied by the time the optimizer thinks it will take to satisfy a single block read request, and the Cost is the optimizer’s estimate of the I/O requirement to satisfy the statement – with a fudge factor introduced to recognise the fact that a “single block” read request ought to complete in less time than a “multiblock” read request. Generally speaking the optimizer will consider many possible plans for a statement and pick the plan with the lowest estimated cost – but there is at least one exception to this rule, and it is an unfortunate weakness in the optimizer that there are many valid reasons why its estimates of Cost/Time are poor. Of course, you will note that the values that Oracle reports for the Time column are only accurate to the second – which isn’t particularly helpful when a single block read typically operates in the range of a few milliseconds.

To a large degree the optimizer’s task boils down to:

  • What’s the volume and scatter of the data I need
  • What access paths, with what wastage, are available to get to that data
  • How much time will I spend on I/O reading (and possibly discarding) data to extract the bit I want

Of course there are other considerations like the amount of CPU needed for a sort, the potential for I/O as sorts or hash joins, the time to handle a round-trip to a remote system, and RAC variations on the basic theme. But for many statements the driving issue is that any bad estimates of “how much data” and “how much (real) I/O” will lead to bad, potentially catastrophic, choices of execution plan. In the next article I’ll list all the different reasons (that I can think of at the time) why the optimizer can produce bad estimates of volume and time.

References for Cost vs. Time

References for table_cached_blocks: