Here’s an interesting (and potentially very useful) observation from an OTN database forum thread that appeared at the end of last week. It concerns the problems of pulling data from remote systems, and I’ll start by building some data:
Here’s a surprising (to me) execution plan from 126.96.36.199 – parallel execution to find one row in a table using a unique scan of a unique index – produced by running the following script (data creation SQL to follow):
In an earlier post I’ve described how a distributed query can operate at a remote site if it’s a simple select but has to operate at the local site if it’s a CTAS (create as select) or insert as select. There’s (at least) one special case where this turns out to be untrue … provided you write the query in the correct fashion. I discovered this only as a result of doing a few experiments in response to a question on the OTN database forum.
Here’s a little demonstration, cut-n-pasted with a little cosmetic editing from an 11gR1 SQL*Plus session:
A comment I’ve made many times in the past about distributed queries is that Oracle doesn’t try to retrieve histogram information from remote databases when optimizing a query. Checking back through previous posts, though, I don’t think I’ve ever mentioned it on the blog – so here’s an example demonstrating the point.
I’ll start with 188.8.131.52, with some fixed system stats (as they could make a difference to repeatability), and the following data set.
I have often said that the optimizer “forgets” that it is dealing with a distributed query once it has collected the stats that it can about the objects in the query, and that as a consequence the driving site for a distributed query will be the local database unless you use the /*+ driving_site */ hint to change it.
While investigating an oddity with a distributed query between two 184.108.40.206 databases a few days, I noticed something in the 10053 trace file that made me change my mind, and go back to look at earlier versions of Oracle.
Here are two sections extracted from a 10053 trace file running under 10.2.0.3 with CPU costing (system statistics) enabled:
SINGLE TABLE ACCESS PATH Table: T1 Alias: AWAY Card: Original: 3240 Rounded: 41 Computed: 40.50 Non Adjusted: 40.50 Access Path: TableScan Cost: 53.22 Resp: 53.22 Degree: 0 Cost_io: 53.00 Cost_cpu: 2073815 Resp_io: 53.00 Resp_cpu: 2073815 Access Path: index (AllEqRange) Index: 0 resc_io: 4.00 resc_cpu: 29536 ix_sel: 0.0125 ix_sel_with_filters: 0.0125 Cost: 4.00 Resp: 4.00 Degree: 1 Remote table cost added, new values: cost 4.00 resc 4.00 resp .2f Best:: AccessPath: IndexRange Index: 0 <<=== Cost: 4.00 Degree: 1 Resp: 4.00 Card: 40.50 Bytes: 0 ... HA cost: 50.54 resc: 50.54 resc_io: 50.00 resc_cpu: 5187262 resp: 50.54 resp_io: 50.00 resp_cpu: 5187262 Cost adjustment for NL join with remote table: 0.72 <<=== Join order aborted: cost > best plan cost ***********************
Note the two lines with the reference to “remote” (I’d highlight them properly, but you can’t do highlighing and code in the same text). Notice, also that one of the programmers made a bit of a mistake with their printf() call in the first of the lines – a bug that is still there in 220.127.116.11
Clearly Oracle is doing some arithmetic relating to the costs of accessing distributed data from at least 10.2.0.3 (there was nothing similar in the equivalent trace file for 18.104.22.168, and I don’t have a 10.1 available for testing). Unfortunately I have yet to see a single distributed execution plan where it does the right thing – but that might be a problem related to histograms (and the failure to use them) rather than a defect in the algorithms for distributed cost.
I’ll have to spend some time looking at what it does before I can write any more about it – but given the number of times I’ve said the optimizer doesn’t do any arithmetic I thought it was important to point out that I was wrong as soon as I discovered the change.
Footnote: I have added a category “distributed” to my list of categories – and added a link to it at the bottom of every article I’ve written about distributed SQL. That’s a pattern that I may copy across other articles in the future – especially if I can find out how to order the articles by date (ascending).