Here’s a funny little problem I came across some time ago when setting up some materialized views. I have two tables, orders and order_lines, and I’ve set up materialized view logs for them that allow a join materialized view (called orders_join) to be fast refreshable. Watch what happens if I refresh this view just before gathering stats on the order_lines table.
I have a little script that start with “set echo on”, then calls two packaged procedures, one to refresh the join view, the other to collect stats on the order_lines table; here’s the output from that script:
I posted a couple of examples some time ago of oddities and boundary cases for frequency histograms on character columns. Part of the process of playing around with the 12c Beta was to re-run such cases to see if newer code made any difference. Looking at these examples, one was fixed (or improved, at least) the other wasn’t, so I’ve added a footnote to each and produced this little note to highlight the changes:
In part 2 of this mini-series I’ll be describing the new mechanism for the simple frequency histogram and the logic of the Top-N frequency histogram. In part 3 I’ll be looking at the new hybrid histogram. You need to know about the approximate NDV before you start – but there’s a thumbnail sketch at the end of the posting if you need a quick reminder.
To allow for collection of simple frequency histogram – record the first rowid for each hash value generated and count the number of times the hash value is generated. If, by the end of the table you have no more than the requested (default 254, max 2,000) distinct hash values you can look up the actual values with a query by rowid.
I’ve mentioned “linear decay” in several posts when explaining a problem that someone has seen with an execution path – but I’ve recently realised that I don’t have a post describing what it is and how it works – although it’s in Cost Based Oracle – Fundamentals, of course, if you want some detail – so here’s a brief introduction (based on simple stats with no histograms).
There are a few enhancements in 12c that might make a big difference to performance for a small investment in effort. One of the important enhancements comes from changes in histograms – which improve speed of collection with accuracy of results. The changes are so significant that I chose the topic as my presentation at OpenWorld last year.
Here’s a couple of extracts from a trace file after I’ve set optimizer_dynamic_sampling to level 3. I’ve run two, very similar, SQL statements that both require dynamic sampling according to the rules for the parameter – but take a look at the different ways that sampling has happened, and ask yourself what’s going on:
Statement 1 produced this sampling code:
One of the sad things about trying to keep on top of Oracle is that there are so many little things that could go wrong and take a long time to identify. In part this is why I try to accumulate test cases for all the oddities and anomalies I come across as I travel around the world – if I’ve spent the time recreating a problem I’ll probably remember it the next time I see the symptoms.
Here’s a little threat that comes into play when a couple of events occur simultaneously, in this case: automatically selected indexes being rebuilt combined with an unfortunate choice of index definitions. Here’s a demonstration (running 184.108.40.206, 1MB uniform extents, 8KB block size, freelist management – first the symptoms, script, followed by results:
In part 1 of this mini-series we looked at the effects of costing a tablescan serially and then parallel when the maxthr and slavethr statistics had not been set.
In part 2 we looked at the effect of setting just the maxthr - and this can happen if you don’t happen to do any parallel execution while the stats collection is going on.
In part 3 we’re going to look at the two variations the optimizer displays when both statistics have been set. So here are the starting system stats:
Actually, there hasn’t been a “maxthr – 1″, I called the first part of this series“System Stats”. If you look back at it you’ll see that I set up some system statistics, excluding the maxthr and slavethr values, and described how the optimizer would calculate the cost of a serial tablescan, then I followed this up with a brief description of how the calculations changed if I hinted the optimizer into a parallel tablescan.