Note: This blog post actually serves three purposes:
Just a pointer to two blog posts that I find worth mentioning:
1. Christo Kutrovsky from Pythian writes about some quirks he found regarding Parallel Distribution of aggregation and analytic functions. In particular the lower part of the post (which is not about the initial Interval Partitioning issue) gives a lot of food for thought about how the chosen Parallel Distribution can influence the performance of operations.
A few years ago (2007) I wrote about a problem that could appear when you mixed parallel execution with system managed extent allocation. A couple of years later I added a note that Christian Antognini had observed a patch in 18.104.22.168 that addressed the specific issue I had raised. Today, thanks to an email exchange with Christo Kutrovsky of Pythian, I can report that there is a variation of this issue still available even in 22.214.171.124.
The basic problem is that you can end up with a very large number of very small extents, leading to poor performance in parallel queries and a significant waste of space in a data segment. Here’s a simple, though not completely realistic, way to demonstrate the problem.
[This post was originally published on 2012/02/29 and was hidden shortly thereafter. I’m un-hiding it as of 2012/05/30 with some minor edits.]
Many Oracle Database users like GUI tools because they add features and functionality that are not easily available from command-line interfaces like SQL*Plus. One of the more popular tools, in my experience, is Oracle SQL Developer, in part because it’s a free tool from Oracle. Given SQL Developer’s current design (as of version 3.1.07.42), some issues frequently show up when using it with Oracle Databases with Parallel Execution. SQL Developer also contains a bug that exacerbates these issues.
This is just a short note about one of the potential side-effects of the new Auto Degree Of Parallelism (DOP) feature introduced in 11.2.
If you happen to have Parallel DML enabled in your session along with Auto DOP (and here I refer to the PARALLEL_DEGREE_POLICY = AUTO setting, not LIMITED) then it might take you by surprise that INSERT statements that are neither decorated with a parallel hint nor use any parallel enabled objects can be turned into direct-path inserts.
A few weeks back one of the Vertica developers put up a blog post on counting triangles in an undirected graph with reciprocal edges. The author was comparing the size of the data and the elapsed times to run this calculation on Hadoop and Vertica and put up the work on github and encouraged others: “do try this at home.” So I did.
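For readers unfamiliar with the problem, here is a minimal Python sketch of what "counting triangles in an undirected graph with reciprocal edges" means. It is not the Vertica or Hadoop implementation from the original post; the function name `count_triangles` and the sample edge list are purely illustrative. The idea is to store each undirected edge once (smaller vertex id first) so that every triangle a < b < c is counted exactly once.

```python
from collections import defaultdict


def count_triangles(edges):
    """Count distinct triangles in an undirected graph.

    `edges` is an iterable of (source, dest) pairs. Reciprocal pairs
    such as (1, 2) and (2, 1) are treated as one undirected edge.
    """
    # For each vertex keep only the neighbours with a larger id, so
    # each undirected edge is stored once.
    higher = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        lo, hi = (a, b) if a < b else (b, a)
        higher[lo].add(hi)

    total = 0
    for a, nbrs in list(higher.items()):
        for b in nbrs:
            # Any c > b adjacent to both a and b closes a triangle a < b < c.
            total += len(nbrs & higher.get(b, set()))
    return total


if __name__ == "__main__":
    # Hypothetical sample: one triangle (1, 2, 3) plus a dangling edge 3-4,
    # with every edge listed in both directions.
    sample = [(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1), (3, 4), (4, 3)]
    print(count_triangles(sample))
```

In SQL terms the same ordering trick corresponds to a three-way self-join of the edge table restricted to source < dest, which is what keeps reciprocal edges from inflating the count.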
Vertica draws attention to the fact that their compression brought the size of the 86,220,856 tuples down to 560MB in size, from a flat file size of 1,263,234,543 bytes resulting in around a 2.25X compression ratio. My first task was to load the data and see how Oracle’s Hybrid Columnar Compression would compare. Below is a graph of the sizes.
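As a quick sanity check of the quoted ratio, the arithmetic can be sketched in a few lines of Python. The only inputs are the two sizes quoted above; whether "560MB" means decimal megabytes or binary mebibytes is my assumption, so both are shown.

```python
# Figures quoted in the text above.
FLAT_FILE_BYTES = 1_263_234_543
COMPRESSED_MB = 560


def compression_ratio(raw_bytes, compressed_bytes):
    """Return how many times smaller the compressed data is."""
    return raw_bytes / compressed_bytes


# Assumption: "MB" read as 10**6 bytes versus 2**20 bytes.
ratio_decimal = compression_ratio(FLAT_FILE_BYTES, COMPRESSED_MB * 10**6)
ratio_binary = compression_ratio(FLAT_FILE_BYTES, COMPRESSED_MB * 2**20)

if __name__ == "__main__":
    print(f"decimal MB: {ratio_decimal:.2f}x")
    print(f"binary MiB: {ratio_binary:.2f}x")
```

With decimal megabytes the ratio comes out at about 2.26, in line with the quoted "around 2.25X"; with binary units it is closer to 2.15.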
There are many reasons why a parallel execution might not run with the expected degree of parallelism (DOP), beginning with running out of parallel slaves (PARALLEL_MAX_SERVERS or PROCESSES reached), PARALLEL_ADAPTIVE_MULTI_USER, downgrades at execution time via the Resource Manager, or the more recent features like PARALLEL_DEGREE_LIMIT or the Auto DOP introduced in Oracle 11.2.
I’ve said in the past that one of the best new features, in my view, in 11g was the appearance of proper virtual columns; and I’ve also been very keen on the new “approximate NDV” that makes it viable to collect stats with the “auto_sample_size”.
Who’d have guessed that if you put them both together, then ran a parallel stats collection, it would break?
The bug number Karen quotes (10013177.8) doesn’t (appear to) mention extended stats – but since virtual columns, function-based indexes, and extended stats share a number of implementation details I’d guess that they might be affected as well.
Continuing from the previous part of this series I'll cover in this post some further basics about parallel execution control:
- Keep in mind that there are two classes of parallel hints: PARALLEL and PARALLEL_INDEX. One is about the costing of parallel full table / index fast full scans, the other one about costing (driving) parallel index scans, which are only possible with partitioned indexes (PX PARTITION granule vs. PX BLOCK granule)
This is just a short heads-up for those who come across an execution plan showing the PX COORDINATOR FORCED SERIAL operation. I don't have official confirmation, but according to my tests a plan with such an operation effectively means: cost for parallel execution but execute serially. (You might remember that I've recently mentioned in another post that there is the possibility to have a plan executed in parallel but costed serially. Weird, isn't it?) Why such an operation exists is not clear to me; obviously it would make much more sense to cost straight away for serial execution in such a case. Probably there is a good reason, otherwise such an operation wouldn't exist, but I think at least the costing is questionable in current versions.