
Parallel Execution

Parallel Execution – 1

When you read an execution plan you’re probably trying to identify the steps that Oracle went through to acquire the final result set so that you can decide whether or not there is a more efficient way of getting the same result.

Quiz Night

Here’s a little quiz about Bloom filtering. There seem to be at least three different classes of query where Bloom filters can come into play – all involving hash joins: partition elimination, aggregate reduction on non-mergeable aggregate views, and parallelism.

This quiz is about parallel queries – and all you have to do is work out how many Bloom filters were used in the following two execution plans (produced by 11.2.0.2), and where they were used.

I’ve got 4 tables, 3 very small dimensions and one large fact. I’ve joined the three dimensions to the fact on their primary keys, and filtered on each dimension. Stripping out the eighteen hints that I inserted to get the plans I wanted, the queries both looked like this:
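The queries themselves aren’t reproduced in this excerpt, but from the description their shape (hints stripped, and with purely hypothetical table and column names) would be roughly:

select
        count(ft.measure)
from
        dim1    d1,
        dim2    d2,
        dim3    d3,
        fact    ft
where
        d1.flag    = 'Y'            -- filter on each dimension
and     d2.flag    = 'Y'
and     d3.flag    = 'Y'
and     ft.dim1_id = d1.id          -- join to each dimension's primary key
and     ft.dim2_id = d2.id
and     ft.dim3_id = d3.id
;

The question is then how many Bloom filters appear in the two parallel plans for a query of this shape, and where they are used.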

Parallel Execution

While checking out potential scalability threats recently on a client system, I was directed to a time-critical task that was currently executing the same PL/SQL procedure 16 times (with different parameters) between 6:00 and 7:00 am; as the system went through its next phase of expansion, the number of executions of this procedure was likely to grow. An interesting detail, though, was that nothing else was going on while the task was running, so the machine (which had 6 cores) was running at 16% CPU.

An obvious strategy for handling the required growth target was to make sure that four (possibly five) copies of the procedure were allowed to run concurrently. Fortunately the different executions were completely independent of each other and didn’t interfere with each other’s data, so the solution simply required a mechanism to control the parallelism. Conveniently, 11gR2 gave us one.
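The excerpt doesn’t say which mechanism; one 11gR2 candidate that matches the description is dbms_parallel_execute, whose parallel_level parameter caps the number of concurrent executions. A minimal sketch, assuming a hypothetical parameter table param_list and procedure my_proc:

declare
        -- each chunk supplies one parameter value to the (hypothetical) procedure
        l_stmt varchar2(200) := 'begin my_proc(:start_id, :end_id); end;';
begin
        dbms_parallel_execute.create_task('BATCH_RUN');

        -- one chunk per row of the (hypothetical) parameter table
        dbms_parallel_execute.create_chunks_by_number_col(
                task_name    => 'BATCH_RUN',
                table_owner  => user,
                table_name   => 'PARAM_LIST',
                table_column => 'ID',
                chunk_size   => 1
        );

        -- parallel_level limits how many chunks run concurrently (four here)
        dbms_parallel_execute.run_task(
                task_name      => 'BATCH_RUN',
                sql_stmt       => l_stmt,
                language_flag  => dbms_sql.native,
                parallel_level => 4
        );

        dbms_parallel_execute.drop_task('BATCH_RUN');
end;
/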

Parallel to Serial

Here’s a little problem that came up on the Oracle-L listserver today:

I’m trying to write a query which reads the corresponding partition of the fact, extracts the list of join keys, materialises this result set, and finally joins the necessary dimensions. The key thing I’m trying to do is to run the initial query on the fact in parallel and then the rest of the query serially.
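A sketch of one way to express that, with hypothetical object names (the actual query from the Oracle-L thread isn’t shown here): the materialize hint forces the factored subquery holding the parallel scan of the fact partition into a temporary table, after which the dimension joins can be hinted to run serially.

with fact_keys as (
        select  /*+ materialize parallel(f 8) */
                f.dim1_id, f.dim2_id
        from    fact_table partition (p_current) f
)
select  /*+ no_parallel(d1) no_parallel(d2) */
        fk.dim1_id, d1.description,
        fk.dim2_id, d2.description
from
        fact_keys  fk,
        dim1       d1,
        dim2       d2
where
        d1.id = fk.dim1_id
and     d2.id = fk.dim2_id
;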

12c Top N (px)

A comment from Greg Rahn in response to my posting yesterday prompted me to do a quick follow-up (test time ca. 3 minutes, write-up time ca. 50 minutes – thanks for the temptation, Greg ;). Greg asked if the “Top N” would push down for a parallel query, so all I had to do was re-run my script with a parallel hint in place. Here’s the resulting execution plan (from explain plan, but v$sql_plan showed the same structure):
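The plan itself isn’t reproduced in this excerpt; the query under test is simply a Top-N query with a parallel hint added, which in 12c syntax (hypothetical table and column names) looks something like:

select  /*+ parallel(t1 2) */
        id, padding
from    t1
order by
        id
fetch first 10 rows only
;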

maxthr – 3

In part 1 of this mini-series we looked at the effects of costing a tablescan serially and then in parallel when the maxthr and slavethr statistics had not been set.

In part 2 we looked at the effect of setting just the maxthr statistic – which can happen if you don’t do any parallel execution while the stats collection is going on.

In part 3 we’re going to look at the two variations the optimizer displays when both statistics have been set. So here are the starting system stats:
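The starting values themselves aren’t shown in this excerpt, but for anyone reproducing the test, both statistics can be set explicitly with dbms_stats.set_system_stats – the figures below are purely illustrative:

begin
        -- values are illustrative only (bytes per second)
        dbms_stats.set_system_stats('MAXTHR',   131072000);  -- total throughput observed
        dbms_stats.set_system_stats('SLAVETHR',  13107200);  -- throughput per PX slave
end;
/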

maxthr – 2

Actually, there hasn’t been a “maxthr – 1”; I called the first part of this series “System Stats”. If you look back at it you’ll see that I set up some system statistics, excluding the maxthr and slavethr values, and described how the optimizer would calculate the cost of a serial tablescan; then I followed this up with a brief description of how the calculations changed if I hinted the optimizer into a parallel tablescan.
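For reference, the workload statistics – including maxthr and slavethr once they have been set – can be inspected directly in the underlying dictionary table:

select  pname, pval1
from    sys.aux_stats$
where   sname = 'SYSSTATS_MAIN'
order by
        pname
;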

System Stats

Several years ago I wrote the following in “Cost Based Oracle – Fundamentals” (p.47):

The maxthr and slavethr figures relate to throughput for parallel execution slaves. I believe that the figures somehow control the maximum degree of parallelism that any given query may operate at by recording the maximum rate at which slaves have historically been able to operate—but I have not been able to verify this.

Browsing the internet recently, I discovered that no-one else seems to have published anything to verify my comment, so I decided it was about time I did so myself. I’m going to work up to it in two blog notes, so if you do happen to know of any document that describes the impact of maxthr and slavethr on the optimizer’s costing algorithms please give me a reference in the comments – that way I might not have to write the second note.

TIMESTAMP WITH TIME ZONE Aggregation

The TIMESTAMP WITH TIME ZONE data type, which was introduced a long time ago, is known for some oddities – Tony Hasler, for example, has a nice summary of some of them here. Here is another oddity that shows up when trying to aggregate on such a data type. Have a look at the following simple example:


create table t
as
select
        rownum as id
      , date '2000-01-01' + rownum - 1 as some_date
      , cast(date '2000-01-01' + rownum - 1 as timestamp) as some_timestamp
      , cast(date '2000-01-01' + rownum - 1 as timestamp with local time zone) as some_timestamp_with_local_tz
      , cast(date '2000-01-01' + rownum - 1 as timestamp with time zone) as some_timestamp_with_timezone
from
        dual
connect by
        level <= 1000   -- row count assumed; the original listing is truncated at "connect by"
;

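The oddity appears when you aggregate on the TIMESTAMP WITH TIME ZONE column. A minimal aggregate of the kind the note examines (using the table created above) would be:

select
        some_timestamp_with_timezone
      , count(*)
from
        t
group by
        some_timestamp_with_timezone
;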
Parallel DML

Parallel DML is not enabled by default, and it’s easy to forget this and think you’ve made an operation “as parallel as possible” by declaring objects parallel or putting in hints for parallelism.

A recent question on OTN asked about speeding up a materialized view refresh that seemed to be spending a lot of its time waiting on “PX Deq Credit: send blkd”. The manuals describe this as an “idle event”, but that’s not always true. The OP had supplied the tkprof output for one of the sessions, showing the “insert as select” that performed the (complete) refresh, and it was clear that the select was running in parallel but the insert wasn’t – and that’s one case in which “PX Deq Credit: send blkd” (with its timeout of 2 seconds) is arguably an “idle” wait. It’s possible that the refresh could go faster if the OP enabled parallel DML.
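For completeness, a minimal sketch of what “enabling parallel DML” involves (hypothetical table names); without the alter session, the hinted insert below would still load serially even though its select could run in parallel:

alter session enable parallel dml;

insert  /*+ append parallel(target_table 4) */
into    target_table
select  /*+ parallel(source_table 4) */
        *
from    source_table
;

commit;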