From time to time someone publishes a query on the OTN database forum and asks how to make it go faster, and you look at it and think it’s a nice example to explain a couple of principles because it’s short, easy to understand, obvious what sort of things might be wrong, and easy to fix. Then, after you’ve made a couple of suggestions and explained a couple of ideas the provider simply fades into the distance and doesn’t tell you any more about the query, or whether they’ve taken advantage of your advice, or found some other way to address the problem.
Such a query, with its execution plan, appeared a couple of weeks ago:
This is another little reference list I should have created some time ago. It covers a series of posts on interpreting parallel execution plans and understanding where the work happens.
There’s a live example on OTN at the moment of an interesting class of problem that can require some imaginative thinking. It revolves around a design that uses a row in one table to hold the low and high values for a range of values in another table. The problem is then simply to count the number of rows in the second table that fall into the range given by the first table. There’s an obvious query you can write (a join with inequality) but if you have to join each row in the first table to several million rows in the second table, then aggregate to count them, that’s an expensive strategy. Here’s the query (with numbers of rows involved) that showed up on OTN; it’s an insert statement, and the problem is that it takes 7 hours to insert 37,600 rows:
It’s easy to make mistakes, or overlook defects, when constructing parallel queries – especially if you’re a developer who hasn’t been given the right tools to make it easy to test your code. Here’s a little trap I came across recently that’s probably documented somewhere, which could be spotted easily if you had access to the OEM SQL Monitoring screen, but would be very easy to miss if you didn’t check the execution plan very carefully. I’ll start with a little script to generate some data:
A couple of days ago I highlighted an optimizer anomaly caused by the presence of an index with a descending column. This was a minor (unrelated) detail that appeared in a problem on OTN where the optimizer was using an index FULL scan when someone was expecting to see an index RANGE scan. My earlier posting supplies the SQL to create the table and indexes I used to model the problem – and in this posting I’ll explain the problem and answer the central question.
I’ve just had cause to resurrect a blog note I wrote three years ago. The note says that an anomaly I discovered in 220.127.116.11 wasfixed in 10.2.0.3 – and this is true for the simple example in the posting; but a recent question on the OTN database forum has shown that the bug still appears in more complex cases. Here’s some code to create a table and two indexes:
There’s a thread on the OTN database forum at present asking for advice on optimising a query that’s trying to find “the most recent price” for a transaction given that each transaction is for a stock item on a given date, and each item has a history of prices where each historic price has an effective start date. This means the price for a transaction is the price as at the most recent date prior to the transaction date.
I’ve probably seen a transformation like the following before and I may even have written about it (though if I have I can’t the article), but since it surprised me when I was experimenting with a little problem a few days ago I thought I’d pass it on as an example of how sophisticated the optimizer can be with query transformation. I’ll be talking about the actual problem that I was working on in a later post so I won’t give you the table and data definitions in this post, I’ll just show some SQL and its plan:
As I write, there’s an ongoing thread on Oracle-L that started with the (paraphrased) question: “I’ve got this query that returns 7 million rows; when I change it to ‘select count(*)’ it returns in 4 seconds but when we display the full result set on screen it takes hours, and every second or two the screen pauses; how do I make it go faster.”
The general rapid response was: “You shouldn’t be running 7M rows to a screen – the time is the time for the network traffic and display.”