It’s easy to make mistakes, or overlook defects, when constructing parallel queries – especially if you’re a developer who hasn’t been given the right tools to make it easy to test your code. Here’s a little trap I came across recently that’s probably documented somewhere, which could be spotted easily if you had access to the OEM SQL Monitoring screen, but would be very easy to miss if you didn’t check the execution plan very carefully. I’ll start with a little script to generate some data:
A couple of days ago I highlighted an optimizer anomaly caused by the presence of an index with a descending column. This was a minor (unrelated) detail that appeared in a problem on OTN where the optimizer was using an index FULL scan when someone was expecting to see an index RANGE scan. My earlier posting supplies the SQL to create the table and indexes I used to model the problem – and in this posting I’ll explain the problem and answer the central question.
I’ve just had cause to resurrect a blog note I wrote three years ago. The note says that an anomaly I discovered in 22.214.171.124 wasfixed in 10.2.0.3 – and this is true for the simple example in the posting; but a recent question on the OTN database forum has shown that the bug still appears in more complex cases. Here’s some code to create a table and two indexes:
There’s a thread on the OTN database forum at present asking for advice on optimising a query that’s trying to find “the most recent price” for a transaction given that each transaction is for a stock item on a given date, and each item has a history of prices where each historic price has an effective start date. This means the price for a transaction is the price as at the most recent date prior to the transaction date.
I’ve probably seen a transformation like the following before and I may even have written about it (though if I have I can’t the article), but since it surprised me when I was experimenting with a little problem a few days ago I thought I’d pass it on as an example of how sophisticated the optimizer can be with query transformation. I’ll be talking about the actual problem that I was working on in a later post so I won’t give you the table and data definitions in this post, I’ll just show some SQL and its plan:
As I write, there’s an ongoing thread on Oracle-L that started with the (paraphrased) question: “I’ve got this query that returns 7 million rows; when I change it to ‘select count(*)’ it returns in 4 seconds but when we display the full result set on screen it takes hours, and every second or two the screen pauses; how do I make it go faster.”
The general rapid response was: “You shouldn’t be running 7M rows to a screen – the time is the time for the network traffic and display.”
One of the worst problems with upgrades is that things sometimes stop working. A particular nuisance is the execution plan that suddenly stops appearing, to be replaced by an alternative plan that is much less efficient.
Apart from the nuisance of the time spent trying to force the old plan to re-appear, plus the time spent working out a way of rewriting the query when you finally decide the old plan simply isn’t going to re-appear, there’s also the worry about WHY the old plan won’t appear. Is it some sort of bug, is it that some new optimizer feature has disabled some older optimizer feature, or is it that someone in the optimizer group realised that the old plan was capable of producing the wrong results in some circumstances … it’s that last possibility that I find most worrying.
No, not the 10th posting about first_rows() this week – whatever it may seem like – just an example that happens to use the “calculate costs for fetching the first 10 rows” optimizer strategy and does it badly. I think it’s a bug, but it’s certainly a defect that is a poster case for the inherent risk of using anything other than all_rows optimisation. Here’s some code to build a couple of sample tables:
A recent posting on OTN reminded me that I haven’t been poking Oracle 12c very hard to see which defects in reporting execution plans have been fixed. The last time I wrote something about the problem was about 20 months ago referencing 126.96.36.199; but there are still oddities and irritations that make the nice easy “first child first” algorithm fail because the depth calculated by Oracle doesn’t match the level that you would get from a connect-by query on the underlying plan table. Here’s a simple fail in 12c:
It’s been about 8 months since I posted a little note about a “notable change in behaviour” of the optimizer when dealing with subqueries in the where clause that could be used to return a constant, e.g.:
select * from t1 where id between (select 10001 from dual) and (select 90000 from dual) ;
There’s been a note at the start of the script ever since saying: Check if this is also true for any table with ‘select fixed_value from table where primary = constant’ I finally had a few minutes this morning (San Francisco time) to check – and it does, in both 188.8.131.52 and 184.108.40.206. With the t1 table from the previous article run the following: