Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:
When the optimizer has to estimate the data volume (the BYTES column in the plan output), it usually bases this information on the column statistics, if applicable and available (think of complex expressions).However, whenever there is a VIEW operator in an execution plan, that represents an unmerged view, the optimizer obviously "loses" this information and starts applying defaults that are based on the column definition.Depending on the actual content of the columns this can lead to dramatic differences in data volume estimates.Both, under- and overestimates are possible, because for character based columns these defaults seem to be based on an assumed 50% fill grade, so a VARCHAR2(100 BYTE) column counts as 50 bytes data volume.For multi-byte character sets the same rule applies based on the maximum width of a column using the "char" semantics, so a VARCHAR2(1000 CHAR) column counts as 2000 byte
It probably won’t surprise many people to hear me say that the decode() function can be a bit of a nuisance; and I’ll bet that quite a lot of people have had trouble occasionally trying to get function-based indexes that use this function to behave properly. So (to put it all together and support the general directives that case is probably a better choice than decode() and that the cast() operator is an important thing to learn) here’s an example of how function-based indexes don’t always allow you to work around bad design/code. (Note: this is a model of a problem I picked up at a client site, stripped to a minimum – you have to pretend that I’m not allowed to fix the problem by changing code).
First we create some data and indexes, and gather all relevant stats:
Or – to be more accurate – real statistics on a virtual column.
This is one of the “10 top tips” that I came up with for my session with Maria Colgan at OOW13. A method of giving more information that might improve execution plans when you can change the code. I’ll start with a small data set including a virtual column (running 18.104.22.168), and a couple of problem queries:
Here’s a little demo cut-n-pasted from a session running Oracle 22.214.171.124 (it works on 11g, too). All it does is create a table by copying from a well-known table, gather extended stats on a column group, then show you the resulting column names by querying view user_tab_cols.
I’ve written notes about the different joins in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.
Here’s a funny little problem I came across some time ago when setting up some materialized views. I have two tables, orders and order_lines, and I’ve set up materialized view logs for them that allow a join materialized view (called orders_join) to be fast refreshable. Watch what happens if I refresh this view just before gathering stats on the order_lines table.
I have a little script that start with “set echo on”, then calls two packaged procedures, one to refresh the join view, the other to collect stats on the order_lines table; here’s the output from that script:
Here’s a little detail that I hadn’t noticed before (and it goes back to at least 8i). This is running on 126.96.36.199, and table t1 is just all_objects where rownum <= 20000:
I’ve posted this note as a quick way of passing on an example prompted by a twitter conversation with Timur and Maria about Bloom filters:
— Jonathan Lewis (@JLOracle) August 5, 2013
The Bloom filter (capital B because it’s named after a person) is not supposed to appear in Oracle plans unless the query is executing in parallel but here’s an example which seems to use a serial Bloom filter. Running in 188.8.131.52 and 184.108.40.206 (the results shown are the latter – the numbers are slightly different between versions):
Continuing from the previous post, here is one more case when adjusting optimizer_index_cost_adj may hurt you.
drop table t1 cascade constraints purge; create table t1 (id, x, pad, constraint t1_pk primary key(id, x)) as select trunc(rownum/10) , mod(rownum, 10) , s1.text from all_source s1, all_source s2 where rownum <= 1e6; exec dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1', cascade=>true, no_invalidate=>false) alter session set optimizer_index_cost_adj = 100; alter session set optimizer_index_caching = 0; explain plan for select * from t1 where x = :1; @xp
Here’s the plan: