In the second part of this post (go to part 1) I want to focus on the hybrid distribution for skewed join expressions.
The HYBRID HASH distribution allows to some degree addressing data distribution skew in case of HASH distributions, which I've described in detail already in the past.
In this blog post I want to cover some aspects of the the new HYBRID HASH adaptive distribution method that I haven't covered yet in my other posts.As far as I know it serves two purposes for parallel HASH and MERGE JOINs, adaptive broadcast distribution and hybrid distribution for skewed join expressions. In the first part of this post I want to focus on former one (goto part 2).
It allows the PX SEND / RECEIVE operation for the left (smaller estimated row source) of the hash join to decide dynamically at runtime, actually at each execution, if it should use either a BROADCAST or HASH distribution, and correspondingly for the other row source to use then either a ROUND-ROBIN or a HASH distribution, too.
Now that I've shown in the previous post in general that sometimes Parallel Execution plans might end up with unnecessary BUFFER SORT operations, let's have a look what particular side effects are possible due to this.
What would you say if someone tells you that (s)he just did a simple, straightforward "SELECT * FROM TABLE" that took several minutes to execute without returning, only to then error out with "ORA-01652 unable to extend temp segment", and the TABLE in question is actually nothing but a simple, partitioned heap table, so no special tricks, no views, synonyms, VPD etc. involved, it's really just a plain simple table?
When using Parallel Execution, depending on the plan shape and the operations used, Oracle sometimes needs to turn non-blocking operations into blocking operations, which means in this case that the row source no longer passes its output data directly to the parent operation but buffers some data temporarily in PGA memory / TEMP. This is either accomplished via the special HASH JOIN BUFFERED operation, or simply by adding BUFFER SORT operations to the plan.The reason for such a behaviour in parallel plans is the limitation of Oracle Parallel Execution that allows only a single data redistribution to be active concurrently.
This is just an addendum to the previous post demonstrating one example (out of many possible) where the join skew handling feature fails. The test case setup is the same as in the previous post.As mentioned in the AllThingsOracle.com article and in the introduction of the previous post, the feature at present only applies to a rather limited number of scenarios. To wrap things up and to give an idea what can happen with that new feature, here's a three table join that actually makes use of the feature for one join, only to suffer from the skew problem in the next join that uses the same join expression, but doesn't qualify (yet) for the skew handling feature:
Besides the officially available information about new optimizer features in 12c it is always a good idea to have a look at the internal optimizer parameters that show what features are enabled when running with OPTIMIZER_FEATURES_ENABLE = 12.1.0.1. Here is the list of internal optimizer parameters and fix controls that are different between 11.2.0.4 and 12.1.0.1:
Optimizer parameters:
#eeeeee; border: 1px dashed rgb(204, 204, 204); overflow: auto;">_optimizer_partial_join_eval partial join evaluation parameter
_optimizer_unnest_scalar_sq enables unnesting of of scalar subquery
_optimizer_ansi_join_lateral_enhance optimization of left/full ansi-joins and lateral views
_optimizer_multi_table_outerjoin allows multiple tables on the left of outerjoin
Oracle 12c introduces several new features in the area of Parallel Execution. Over the next couple of weeks I attempt to publish more about them - Jonathan Lewis for example already published a note about the new "PQ Replication" feature that applies to the BROADCAST distribution of small tables.One important new feature is the automatic skew handling for parallel joins. I've already given an overview of the feature in my mini-series "Parallel Execution Skew" at "AllThingsOracle.com", so if all you want is a high-level overview I recommend reading the article there.The purpose of this note here is to provide a few more internals and details about that feature.First, just a short summary of the prerequisites of the feature to work:1.
When the optimizer has to estimate the data volume (the BYTES column in the plan output), it usually bases this information on the column statistics, if applicable and available (think of complex expressions).However, whenever there is a VIEW operator in an execution plan, that represents an unmerged view, the optimizer obviously "loses" this information and starts applying defaults that are based on the column definition.Depending on the actual content of the columns this can lead to dramatic differences in data volume estimates.Both, under- and overestimates are possible, because for character based columns these defaults seem to be based on an assumed 50% fill grade, so a VARCHAR2(100 BYTE) column counts as 50 bytes data volume.For multi-byte character sets the same rule applies based on the maximum width of a column using the "char" semantics, so a VARCHAR2(1000 CHAR) column counts as 2000 byte
Recent comments
3 years 5 weeks ago
3 years 17 weeks ago
3 years 22 weeks ago
3 years 23 weeks ago
3 years 27 weeks ago
3 years 48 weeks ago
4 years 16 weeks ago
4 years 46 weeks ago
5 years 30 weeks ago
5 years 31 weeks ago