This is the second part of the video tutorial "Analysing Parallel Execution Skew". In this part I introduce the concept of "Data Flow Operations (DFOs)" and "DFO Trees", which is what a Parallel Execution plan is made of. DFOs / DFO Trees are specific to Parallel Execution and don't have any counterpart in a serial execution plan.
Understanding the implications of DFOs / DFO Trees is important as prerequisite for understanding some of the effects shown in the later parts of the video tutorial, hence I covered this as a separate topic.
Note that this tutorial also demonstrates some new 12c features regarding Parallel Execution, in particular how Oracle 12c now lifts many of the previous limitations that lead to the generation of multiple DFO Trees.
Here is the video:
A minor update 4.01 to the XPLAN_ASH utility is available for download.
These are the notes from the change log:
- More info for RAC Cross Instance Parallel Execution: Many sections now show a GLOBAL aggregate info in addition to instance-specific data
- The Parallel Execution Server Set detection and ASSUMED_DEGREE info now makes use of the undocumented PX_STEP_ID and PX_STEPS_ARG info (bit mask part of the PX_FLAGS column) on 22.214.171.124+
- Since version 4.0 added from 126.96.36.199 on the PX *MAX* DOP in the "SQL statement execution ASH Summary" based on the new PX_FLAGS column of ASH it makes sense to add a PX *MIN* DOP in the summary to see at one glance if different DOPs were used or not
Today, while tuning a fairly complex query experiencing wrong cardinality estimates, I noticed something I was not aware of. Hence, I thought to write this short post to illustrate how to reproduce the problem I experienced…
A new version of the XPLAN_ASH tool (detailed analysis of a single SQL statement execution) is available for download. The previous post includes links to video tutorials explaining what the tool is about.
The new version comes with numerous improvements and new features. The most important ones are:
When the optimizer has to estimate the data volume (the BYTES column in the plan output), it usually bases this information on the column statistics, if applicable and available (think of complex expressions).However, whenever there is a VIEW operator in an execution plan, that represents an unmerged view, the optimizer obviously "loses" this information and starts applying defaults that are based on the column definition.Depending on the actual content of the columns this can lead to dramatic differences in data volume estimates.Both, under- and overestimates are possible, because for character based columns these defaults seem to be based on an assumed 50% fill grade, so a VARCHAR2(100 BYTE) column counts as 50 bytes data volume.For multi-byte character sets the same rule applies based on the maximum width of a column using the "char" semantics, so a VARCHAR2(1000 CHAR) column counts as 2000 byte
The aim of this post isn’t to explain what the “exadata mode” is. Hence, if you don’t know what it is, before continuing reading have a look to this post published on Kerry Osborne’s blog. The only thing I would like to add is that the “exadata mode” is available as of 188.8.131.52 or when a patch implementing the enhancement associated to bug 10248538 is installed.
The key information I would like to share with you is that, in some situations, gathering system statistics in “exadata mode” is pointless. Let me explain why… But, before doing so, it’s important to review how the query optimizer computes the cost of full scans.
The key formula used by the query optimizer to compute the I/O cost of a full scan is the following:
io_cost = ceil ( blocks / mbrc * mreadtim / sreadtim ) + 1
The values provided by the “parse count (total)” and “session cursor cache hits” statistics are subject to several bugs. And, what’s worse, for years Oracle didn’t care to fix it. This is my impression, at least.
Then, when few weeks ago I read in the Oracle Support note 13837105.8 (Bug 13837105 – statistics “parse count (total)” and “session cursor cache hits” miscounted) that the bug introduced in 184.108.40.206 was fixed, I hoped that others bugs in this area were fixed as well.
Unfortunately, it’s not the case. What a disappointment!
The TIMESTAMP WITH TIME ZONE data type that got introduced a long time ago is known for some oddities, for example Tony Hasler has a nice summary of some of them here.Here is another oddity that shows up when trying to aggregate on such a data type. Have a look at the following simple example:
create table t
rownum as id
, date '2000-01-01' + rownum - 1 as some_date
, cast(date '2000-01-01' + rownum - 1 as timestamp) as some_timestamp
, cast(date '2000-01-01' + rownum - 1 as timestamp with local time zone) as some_timestamp_with_local_tz
, cast(date '2000-01-01' + rownum - 1 as timestamp with time zone) as some_timestamp_with_timezone
This is a short post about a strange behavior I discovered today…
The fact is that it is not really possible to use the named notation with the SEM_MATCH table function. In fact, even though the parameter’s names can be specified, the order of the parameters overrides the specification done with the named notation!?!
Here is an example:
A new major release (version 3.0) of my XPLAN_ASH tool is available for download.
In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also supports S-ASH, the free ASH implementation.
If you run XPLAN_ASH in a S-ASH repository owner schema, it will automatically detect that and adjust accordingly.
XPLAN_ASH was tested against the latest stable version of S-ASH (2.3). There are some minor changes required to that S-ASH release in order to function properly with XPLAN_ASH. Most of them will be included in the next S-ASH release as they really are only minor and don't influence the general S-ASH functionality at all.