Usually the Cost-Based Optimizer arrives at a reasonable execution plan if it gets the estimates regarding cardinality and data scattering / clustering right (if you want to learn more about that, why not watch my Webinar available at "AllThingsOracle.com"?).
Here is an example I've recently come across where this wasn't the case - the optimizer clearly preferred plans with a significantly higher cost.
The setup to reproduce the issue is simple:
Oracle 11g added Extended Statistics support for column groups in order to detect correlated columns in filter predicates that use equality comparisons.
Note that Oracle 11g also added the ability to use the number of distinct keys of a composite index as an upper limit for the cardinality estimate of matching column predicates, which means that in such cases the optimizer can detect correlated columns without the explicit addition of Extended Statistics / Column Groups.
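As a quick reference, here is a minimal sketch of both variants, using an invented table T1 with correlated columns COL1 and COL2 (all names and predicates are just placeholders for illustration):

    -- Variant 1: explicit Extended Statistics / Column Group
    select dbms_stats.create_extended_stats(
             ownname   => user,
             tabname   => 'T1',
             extension => '(COL1, COL2)'
           ) as extension_name
    from   dual;

    -- Re-gather statistics so the column group statistics get populated
    exec dbms_stats.gather_table_stats(user, 'T1')

    -- Variant 2: a composite index on the correlated columns - its number
    -- of distinct keys can act as an upper limit for the combined estimate
    create index t1_col1_col2_idx on t1 (col1, col2);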
Oracle 11.2 introduced a set of new Query Transformations, among others the ability to coalesce subqueries, which means that multiple correlated subqueries can be merged into fewer subqueries.
Timur Akhmadeev already demonstrated the basic principles in a blog entry, but when I was recently involved in supporting a TPC-H benchmark for a particular storage vendor I saw quite an impressive application of this optimization that I would like to share here.
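To give a rough idea of the kind of transformation meant here (this is just an invented example, not the benchmark query), two correlated EXISTS subqueries against the same table can be coalesced into a single one:

    -- Two correlated subqueries against T2 ...
    select *
    from   t1
    where  exists (select null from t2
                   where  t2.t1_id = t1.id and t2.status = 'A')
    or     exists (select null from t2
                   where  t2.t1_id = t1.id and t2.status = 'B');

    -- ... which 11.2 can rewrite into something along these lines
    select *
    from   t1
    where  exists (select null from t2
                   where  t2.t1_id = t1.id
                   and    t2.status in ('A', 'B'));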
Here is an odd little bug that was discussed a couple of weeks ago on the OTN forums.
It's about queries on join views by ROWID that fail with "ORA-01410: invalid ROWID" under certain circumstances. The bug can only be reproduced when using the 11.2 code base. In fact the same setup will cause an internal error in 11.1, but 10.2 will return correct results.
It's probably not a very common scenario but it is an interesting example of how features that work fine by themselves can cause problems when used together.
First of all, some of you may (hopefully) ask: how is it possible to query a join view by ROWID in the first place, since the view is based on multiple objects and hence doesn't have a simple one-to-one mapping to the ROWID of a single table?
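Before getting to the bug itself, here is a simplified sketch of the principle (object names invented): for a key-preserved join view the ROWID exposed by the view is the ROWID of the key-preserved table, and that value can be used to query the view again.

    -- EMP is key-preserved: the join to DEPT is on DEPT's primary key,
    -- so each EMP row appears at most once in the view
    create table dept (deptno number primary key, dname varchar2(30));
    create table emp  (empno  number primary key,
                       deptno number references dept);

    create view emp_dept_v as
    select e.empno, e.deptno, d.dname
    from   emp e, dept d
    where  e.deptno = d.deptno;

    -- The view exposes the ROWID of the key-preserved table EMP ...
    select rowid, empno, dname from emp_dept_v;

    -- ... and that ROWID can be fed back into a query against the view
    select * from emp_dept_v where rowid = :r;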
There is a lot more to say about Dynamic Sampling and indexes, and I'll try to cover these basics in my Dynamic Sampling series on AllThingsOracle.com, but two recent discussions on the OTN forums and on Charles Hooper's blog prompted me to publish this blog post.
These discussions revolved around the following issues with Dynamic Sampling and indexes:
1. CREATE INDEX On Empty Table
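A minimal sketch of the first scenario (names invented, see the linked discussions for the complete test cases): creating an index while the table is still empty leaves index statistics behind, and these interact with Dynamic Sampling once the table has been populated without gathering table statistics.

    -- Table created without gathering any table statistics
    create table t_empty (id number, pad varchar2(100));

    -- CREATE INDEX implicitly computes index statistics - here for an
    -- empty segment, i.e. NUM_ROWS = 0 and DISTINCT_KEYS = 0
    create index t_empty_idx on t_empty (id);

    -- Populate the table afterwards, still without gathering statistics
    insert into t_empty
    select rownum, rpad('x', 100, 'x')
    from   dual
    connect by level <= 10000;
    commit;

    -- A query like this triggers Dynamic Sampling for the missing table
    -- statistics while the index statistics still describe an empty index
    select * from t_empty where id = 42;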
This is just a short note that Oracle has added several nice details to 11.2.0.1 and 11.2.0.2 respectively that can be helpful for troubleshooting.
ASH, PGA Memory And TEMP Consumption
Since 11.2.0.1 the V$ACTIVE_SESSION_HISTORY view (which requires Enterprise Edition plus the Diagnostics Pack license) contains the PGA_ALLOCATED and TEMP_SPACE_ALLOCATED columns.
In particular the latter closes an instrumentation gap that has always bothered me: so far it wasn't easy to tell which session allocated TEMP space at some point in the past. Of course it is easy to answer while the TEMP allocation is still held by a session, by looking at the corresponding V$ views like V$SORT_USAGE, but once the allocation was released, a question like "why was my TEMP space exhausted three hours ago?" couldn't be answered from the information provided by Oracle.
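With these columns such a question can at least be approximated along these lines (a rough sketch; the time window is arbitrary, and for windows beyond what V$ACTIVE_SESSION_HISTORY retains the AWR view DBA_HIST_ACTIVE_SESS_HISTORY can be queried instead):

    -- Peak TEMP space per session over the last three hours
    select session_id
         , session_serial#
         , round(max(temp_space_allocated) / 1024 / 1024) as max_temp_mb
    from   v$active_session_history
    where  sample_time > systimestamp - interval '3' hour
    group  by session_id, session_serial#
    order  by max_temp_mb desc;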
Here is a summary of the findings while evaluating Incremental Partition Statistics that have been introduced in Oracle 11g.
The most important point to understand is that Incremental Partition Statistics are not "cost-free", so anyone who is telling you that you can gather statistics on the lowest level (partition or sub-partition in case of composite partitioning) without any noticeable overhead in comparison to non-incremental statistics (on the lowest level) is not telling you the truth.
Although this might be obvious, I've already personally heard someone making such claims, so it's probably worth mentioning.
In principle you need to test on your individual system whether the overhead that is added to each statistics update at the lowest level outweighs the overhead of actually gathering statistics at the higher levels, in particular at the global level.
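For reference, this is roughly what enabling Incremental Partition Statistics looks like (the table name is just a placeholder); the per-partition synopses that get maintained as part of this are exactly where the additional overhead mentioned above comes from:

    -- Enable incremental statistics maintenance for a partitioned table
    exec dbms_stats.set_table_prefs(user, 'PART_TAB', 'INCREMENTAL', 'TRUE')

    -- Gather on the lowest level; global statistics are then derived from
    -- the stored partition synopses instead of re-scanning the whole table
    exec dbms_stats.gather_table_stats(user, 'PART_TAB', granularity => 'AUTO')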
This post is about a potential pitfall when migrating from 11.2.0.x to the next point release. I stumbled over this problem on a two-node cluster.
The operating system is Oracle Linux 5.5 running 11.2.0.2.3, and I wanted to go to 11.2.0.3.0. As you know, Grid Infrastructure upgrades are out-of-place, in other words they require a separate Oracle home. This is also one of the reasons I wouldn’t want less than 20G for the Grid Infrastructure mount points in any non-lab environment …
Now when you are upgrading from 11.2.0.x to 11.2.0.3 you need to apply a one-off patch, but the correct one! Search for patch number 12539000 (11203:ASM UPGRADE FAILED ON FIRST NODE WITH ORA-03113) and apply the one that matches your version - and pay attention to these PSUs! There is also the obligatory opatch update to be performed beforehand.
In a recent OTN thread I've been reminded of two facts about Dynamic Sampling that I already knew but had forgotten in the meantime:
1. The table level dynamic sampling hint uses a different number of blocks for sampling than the session / cursor level dynamic sampling. So even if, for example, level 5 is used in both cases, the number of sampled blocks will be different; this holds for most of the 10 levels available (obviously levels 0 and 10 are exceptions)
2. The Dynamic Sampling code uses a different approach for partitioned objects if it is faced with the situation that there are more partitions than blocks to sample according to the level (and type table/cursor/session) of Dynamic Sampling
Note that all of this applies to the case where no statistics have been gathered for the table - I don't cover the case where Dynamic Sampling gets used on top of existing statistics.
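For clarity, these are the two hint variants referred to above (T stands for any table without statistics):

    -- Cursor / session level: the level applies to the statement as a
    -- whole and uses the cursor-level block counts per level
    select /*+ dynamic_sampling(5) */ count(*) from t;

    -- Table level: the hint names a specific table (alias) and samples a
    -- different number of blocks for the same level
    select /*+ dynamic_sampling(t 5) */ count(*) from t;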
If you consider using Table Functions then you should be aware of some limitations in the optimizer's calculations, in particular when a join between a Table Function and other row sources is involved.
As outlined in one of my previous posts, you can and should help the optimizer to arrive at a reasonable cardinality estimate when dealing with table functions; however, doing so doesn't provide all the inputs to the join cardinality calculation that are useful and available from the statistics when dealing with regular tables.
Therefore, even when following the recommended practice regarding the cardinality estimates, it is possible to end up with some inaccuracies. This post will explain why.
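As a reminder of what "helping the optimizer" typically looks like in this context, here is a rough sketch (function name and figures are invented; the CARDINALITY hint is undocumented, and OPT_ESTIMATE or the extensible optimizer interface are alternatives):

    -- Left alone, the optimizer uses a block-size based default guess
    -- (8168 rows for an 8K block size) for the table function
    select *
    from   table(my_pipelined_func(:x)) t;

    -- Supplying a cardinality estimate for the table function row source
    select /*+ cardinality(t 100) */
           *
    from   table(my_pipelined_func(:x)) t
         , other_table o
    where  o.id = t.id;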
Join Cardinality Basics