Continuing from the previous installment of this series I'll cover in this post some of the inevitable classics regarding Parallel Execution Control. So forgive me if you're bored by the repetition of known facts - however I still see these things too often used incorrectly, therefore I decided: This is worth to mention and remember!
- Up to and including version 10.2 PARALLEL (without any parameters) and NOLOGGING are valid keywords only in DDL commands
- Applies to all versions: NOLOGGING can not be used as a hint. It can only be specified as part of DDL, for example ALTER INDEX ... REBUILD PARALLEL NOLOGGING.
Motivated by Arup Nanda's great "100 Things You Probably Didn't Know About Oracle Database" and inspired by Jonathan Lewis' "Philosophy" series I'll start a new series about things worth to mention and remember, but in contrast to my usual posts I'll furthermore set a new challenge for myself - trying to write as short and concise as Jonathan does in his series. It is probably no coincidence that Charles Hooper and myself contributed by far the longest chapter to "Expert Oracle Practices" :-)
So don't expect any lengthy descriptions and testcases in this series - also I don't intend to provide here ground-breaking material. All this has probably been published several times in the past, nevertheless I believe it's worth to mention these things again (and again...).
Here we go, the first one worth to mention and remember is important for those that have an Enterprise Edition license and make use of the Parallel Execution feature:
If you haven't enabled Parallel DML on session level (ALTER SESSION ENABLE/FORCE PARALLEL DML) you are not able to force parallel DML execution on statement level using explicit PARALLEL hints for the DML operation. (This one is probably known)
If you disable Parallel Query on session level (ALTER SESSION DISABLE PARALLEL QUERY) explicit PARALLEL hints on statement level for a query operation will overrule this and therefore still use Parallel Query. The session setting only overrules the object level parallel settings, not the PARALLEL hint (This is probably less known)
Although there is probably a rationale behind this it is still obvious I think that there is a major inconsistency here.
In particular the latter is unfortunate if you have existing code with embedded PARALLEL hints and want to ensure that the execution is done serially without any code changes in order to not waste any parallel slaves processes meant to be used by other concurrent executions.
To me it would be much more intuitive and sensible if the setting on session level overruled anything else and therefore allowed to control the code using that general setting, as it does with parallel DML.
By the way, a parallel DML degree forced via ALTER SESSION FORCE PARALLEL DML PARALLEL N will then still be overruled by explicit PARALLEL hints applied to the DML operation, it's just that PARALLEL DML is simply not possible if not at least enabled at session level.
Actually I'm not aware of any officially supported way (which means without fiddling with any undocumented parameters) of preventing the Parallel Query execution of statements with embedded PARALLEL hints on session level besides using the Resource Manager and limiting the Parallel Degree to 1.
Note that this approach has again its own quirks and drawbacks, since obviously the execution plan generated based on the optimizer assumption of Parallel Execution will be effectively downgraded to serial execution at runtime rather than being re-optimized for serial execution. In many cases not a good idea - an execution plan generated for serial execution might look quite different and might be much more efficient when being executed serially.
As a final note, if you think you can work around this by using the undocumented "_optimizer_ignore_hints" parameter, you will be in for a surprise that apparently the APPEND, PARALLEL and NO_PARALLEL hints do not get ignored and still apply and therefore parallelize query execution.
Addendum March 2011: One further oddity should not be left unmentioned: In above mentioned scenario of parallel query being disabled on SESSION level via ALTER SESSION DISABLE PARALLEL QUERY and the usage of PARALLEL hints the following applies:
- Potentially a plan based on parallel execution is generated by the optimizer (as outlined)
- But the costing of the operations is based on serial execution, so the costs of parallel full table scans or index scans are not scaled down based on the degree of parallelism as usually
This means that the resulting execution plans are possibly different to a session where parallel query is enabled - it is more likely to get a serial execution plan favoured by the optimizer in the scenario with parallel query disabled due to the costing based on serial costs.
Nevertheless at runtime parallel execution (with the computed parallel degree) will be used if a parallel plan has been generated by the optimizer.
You’ve probably heard sayings like “sometimes things aren’t always what they seem” and “people lie”. Well, sometimes execution plans lie. It’s not really by intent, but it is sometimes difficult (or impossible) to represent everything in a query execution tree in nice tabular format like dbms_xplan gives. One of the optimizations that was introduced back in 10gR2 was the use of bloom filters. Bloom filters can be used in two ways: 1) for filtering or 2) for partition pruning (bloom pruning) starting with 11g. Frequently the data models used in data warehousing are dimensional models (star or snowflake) and most Oracle warehouses use simple range (or interval) partitioning on the fact table date key column as that is the filter that yields the largest I/O reduction from partition pruning (most queries in a time series star schema include a time window, right!). As a result, it is imperative that the join between the date dimension and the fact table results in partition pruning. Let’s consider a basic two table join between a date dimension and a fact table. For these examples I’m using STORE_SALES and DATE_DIM which are TPC-DS tables (I frequently use TPC-DS for experiments as it uses a [...]
[back to Introduction] Leveraging Oracle’s Parallel Execution (PX) in your Oracle data warehouse is probably the most important feature/technology one can use to speed up operations on large data sets. PX is not, however, “go fast” magic pixi dust for any old operation (if thats what you think, you probably don’t understand the parallel computing paradigm). With Oracle PX, a large task is broken up into smaller parts, sub-tasks if you will, and each sub-task is then worked on in parallel. The goal of Oracle PX: divide and conquer. This allows a significant amount of hardware resources to be engaged in solving a single problem and is what allows the Oracle database to scale up and out when working with large data sets. I though I’d touch on some basics and add my observations but this is by far not an exhaustive write up on Oracle’s Parallel Execution. There is an entire chapter in the Oracle Database documentation on PX as well as several white papers. I’ve listed all these in the Resources section at the bottom of this post. Read them, but as always, feel free to post questions/comments here. Discussion adds great value. A Basic Example of Parallel Execution [...]