This is the second part of the video tutorial "Analysing Parallel Execution Skew". In this part I introduce the concept of "Data Flow Operations (DFOs)" and "DFO Trees", which is what a Parallel Execution plan is made of. DFOs / DFO Trees are specific to Parallel Execution and don't have any counterpart in a serial execution plan.
Understanding the implications of DFOs / DFO Trees is important as a prerequisite for understanding some of the effects shown in the later parts of the video tutorial, hence I cover this as a separate topic.
Note that this tutorial also demonstrates some new 12c features regarding Parallel Execution, in particular how Oracle 12c now lifts many of the previous limitations that led to the generation of multiple DFO Trees.
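To make this concrete, here is a minimal sketch of one construct that is known to spawn more than one DFO Tree, a parallel scalar subquery in the select list. The example is my own (table names t1 / t2 are stand-ins), not taken from the video:

    explain plan for
    select  /*+ parallel(t1 2) */
            t1.id
          , (select /*+ parallel(t2 2) */ max(t2.id) from t2) as max_id
    from    t1
    ;

    select * from table(dbms_xplan.display);

In the resulting plan each DFO Tree shows up with its own PX COORDINATOR operation, and the table queue names in the Name column identify the tree: :TQ10000 belongs to DFO Tree 1, :TQ20000 to DFO Tree 2.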
Here is the video:
A minor update 4.01 to the XPLAN_ASH utility is available for download.
These are the notes from the change log:
- More info for RAC Cross Instance Parallel Execution: Many sections now show GLOBAL aggregate info in addition to instance-specific data
- The Parallel Execution Server Set detection and ASSUMED_DEGREE info now makes use of the undocumented PX_STEP_ID and PX_STEPS_ARG info (bit mask part of the PX_FLAGS column) on 11.2.0.2+
- Since version 4.0 added (from 11.2.0.2 on) the PX *MAX* DOP to the "SQL statement execution ASH Summary", based on the new PX_FLAGS column of ASH, it makes sense to also show a PX *MIN* DOP in the summary, so you can see at a glance whether different DOPs were used or not
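For illustration, the gist of that DOP check can be expressed as a stand-alone query. This is a simplified sketch of the idea, not the code from XPLAN_ASH itself, and it relies on the undocumented encoding where TRUNC(px_flags / 2097152) appears to correspond to the actual DOP of a sample:

    -- Minimal sketch (not XPLAN_ASH itself): derive PX MIN / MAX DOP for one
    -- SQL execution from ASH, assuming the undocumented encoding where
    -- TRUNC(px_flags / 2097152) corresponds to the actual DOP (11.2.0.2+)
    select
            sql_id
          , min(trunc(px_flags / 2097152)) as px_min_dop
          , max(trunc(px_flags / 2097152)) as px_max_dop
    from
            v$active_session_history
    where
            sql_id      = '&sql_id'
    and     sql_exec_id = &sql_exec_id
    and     px_flags    > 0
    group by
            sql_id
    ;

If PX_MIN_DOP and PX_MAX_DOP differ you know immediately that not all samples ran at the same DOP, for example because different DFO Trees used different DOPs.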
Alongside the new mini-series "Parallel Execution Skew" at "AllThingsOracle.com", which provides some information about what you can do if you happen to have a parallel SQL execution affected by work distribution problems, I'm publishing a series of video tutorials that explain how you can actually detect and measure whether a SQL execution is affected by skew or not.
Originally this tutorial was planned as one part (Part 5, actually) of the XPLAN_ASH video tutorial series; however, so far I've only managed to publish the initial two parts of that series, and these are already a bit outdated as I've released new versions of the XPLAN_ASH tool with significant changes and enhancements since then.
I’ve popped this note to the top of the stack because I’ve added an index to Randolf Geist’s series on parallel execution skew, and included a reference to his recent update to the XPLAN_ASH utility.
This is the directory for a short series I wrote discussing how to interpret parallel execution plans in newer versions of Oracle.
In the last article (I hope) of this series I want to look at what happens when I change the parallel distribution method on the query that I’ve been using in my previous demonstrations. This was a query first introduced in a note on Bloom Filters (opens in a separate window) where I show two versions of a four-table parallel hash join, one using the broadcast distribution mechanism throughout, the other using the hash distribution method. For reference you can review the table definitions and plan (with execution stats) for the serial join in this posting (also opens in a separate window).
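The shape of that change is easy to show in isolation. The following sketch is my own minimal stand-in (table and column names are not the ones from the original articles), showing the join to t1 hinted with the broadcast distribution; switching to the hash distribution means changing just the pq_distribute hint:

    -- broadcast: only t1's rows are sent across to the slaves scanning t4
    select  /*+
                leading(t4 t1) full(t4) parallel(t4, 2)
                use_hash(t1) swap_join_inputs(t1)
                full(t1) parallel(t1, 2)
                pq_distribute(t1 none broadcast)
            */
            count(*)
    from    t4, t1
    where   t1.id = t4.id1
    ;

    -- hash: both row sources are redistributed on the join key
    --          pq_distribute(t1 hash hash)

With the broadcast mechanism only one of the two row sources travels between the slave sets; with the hash distribution both row sources are redistributed on the join key.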
Another day, another airport lounge – another quick note: one of the changes that appeared in 12c was a tweak to the “broadcast” distribution option of parallel queries. I mentioned this in a footnote to a longer article a couple of months ago; this note simply expands on that brief comment with an example. We’ll start with a simple two-table hash join – which I’ll first construct and demonstrate in 11.2.0.4:
A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would still be executed even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider viewpoint – if the query were to run in parallel they would need a mechanism that allowed some synchronisation between slaves, so that every slave could find out that none of the slaves had received any rows from the first subquery, and this was going to lead to hanging problems.
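As a minimal illustration of the case in question (my own example, not taken from the original note): even though the first branch clearly cannot return any rows, the second branch still runs, so t2 is still scanned:

    select id from t1 where 1 = 2
    minus
    select id from t2
    ;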
I’m aware that in the previous article in this series I said I’d continue “in a few days” and it has now been more like 11 weeks – but finally I’ve got the time. In this article I’m going to talk primarily about Bloom filters and their impact on performance, but I’ll need to say something about the “virtual tables” and “parallel execution message size” before I begin. Take a look at this fragment of a parallel execution plan:
It’s finally time to take a close look at the parallel versions of the execution plan I produced a little while ago for a four-table hash join. In this note I’ll examine the broadcast parallel distribution. First, here’s a list of the hints I’m going to use to get the effect I want:
/*+
    leading(t4 t1 t2 t3)
    full(t4) parallel(t4, 2)
    use_hash(t1) swap_join_inputs(t1) pq_distribute(t1 none broadcast)
    full(t1) parallel(t1, 2)
    use_hash(t2) swap_join_inputs(t2) pq_distribute(t2 none broadcast)
    full(t2) parallel(t2, 2)
    use_hash(t3) swap_join_inputs(t3) pq_distribute(t3 none broadcast)
    full(t3) parallel(t3, 2)
    monitor
*/
If you compare this set of hints to the hints I used for the serial plan you’ll see that the change involves two features: