WordPress 3.0.5…

WordPress 3.0.5 has been released. Happy upgrading.

Cheers

Tim…

GTTs

That’s Global Temporary Table, of course.

A request appeared on the OTN database forum recently asking whether it was possible to allocate a global temporary table to a particular temporary tablespace: the programmer didn’t want typical users to be at risk of running out of temporary space when the global temporary table was suddenly loaded with millions of rows.

If you’re running 11.1 or above, go straight to the footnote, otherwise read on.

At first sight the answer seems to be no – but an important point in the question was the comment: “I have a procedure which …”. This may make it possible to isolate this table in the required way.

When someone inserts data into a global temporary table the tablespace used to instantiate their version of the table is the temporary tablespace of the schema that is inserting the data. But remember, if you call a procedure to insert the data it’s (usually) the schema that owns the procedure that is doing the insert. (I say “usually” because you could, of course, have a procedure defined as “authid current_user”.)

Consider the following (running on 11.1.0.6):

select
	username, temporary_tablespace
from
	dba_users
where
	username in ('U1','TEST_USER')
;

USERNAME        TEMPORARY_TABLESPACE
--------------- --------------------
U1              TEMP
TEST_USER       TEMP2

I have two users with different temporary tablespaces. In schema test_user we do the following:

create global temporary table t
as
select	*
from	all_objects
where	rownum = 0;

grant all on t to u1;

create or replace procedure pt
as
begin
	insert into t select * from all_objects where rownum <= 1000;
end;
/

grant execute on pt to u1;

Now we connect as user u1 and run the following:


SQL> insert into test_user.t select * from all_objects where rownum <= 10;

10 rows created.

SQL> commit;

Commit complete.

SQL> execute test_user.pt

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

The first commit is very important, by the way. In another session, logged in through a suitably privileged account, execute the following before each commit:

select username, tablespace from v$sort_usage;

The results after the direct call to insert into the GTT will be:

USERNAME TABLESPACE
-------- ----------------
U1       TEMP

The results from the call to the procedure will be:

USERNAME TABLESPACE
-------- ----------------
U1       TEMP2

The direct insert has used the temporary tablespace of schema U1, but the insert from the procedure call has used the temporary tablespace of schema test_user. So there are ways to allocate a global temporary table to a particular temporary tablespace if you always use a procedure to populate it. I did point out, though, that the first commit in the demonstration was important. If you don’t do this commit the global temporary table will have been instantiated in the “wrong” temporary tablespace at the point you call the procedure, so the data from the procedure call would be added to the existing data – i.e. where you didn’t want it to go.

Footnote: Everything changes – you’ll see in a response to the thread from Uwe Hesse that from 11g onwards you can specify a tablespace for a global temporary table, and that’s where the data will go. (But I’d drafted this note before I got back to the thread – and it’s still valid for earlier versions of Oracle.)
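As a minimal sketch of the 11g syntax (the table, column, and tablespace names here are just for illustration):

create global temporary table gtt_demo (
	n1	number
)
on commit preserve rows
tablespace temp2	-- 11g onwards: data goes to TEMP2 regardless of the user's temporary tablespace
;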

Footnote 2: I had to create a spare temporary tablespace for this demo. It’s a funny little detail that although you “create TEMPORARY tablespace … TEMPfile…”, when it comes to dropping it’s: “drop tablespace … including … datafiles”.
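In case anyone wants to reproduce that asymmetry (file name and size are just for illustration):

create temporary tablespace temp2
	tempfile 'temp2_01.dbf' size 100m;

drop tablespace temp2 including contents and datafiles;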

SQL Performance Problem, AWR Reports Query is #1 in Elapsed Time, #2 in CPU Time – How Would You Help?

February 6, 2011 (Updated February 25, 2011) I occasionally read various forums on the Internet including a couple of Usenet groups (comp.databases.oracle.server, comp.databases.oracle.misc), OTN forums (Database – General, SQL and PL/SQL), Oracle-L, AskTom, and a handful of other forums.  I don’t participate on all of the forums, but I try to help when possible on [...]

Things worth to mention and remember (I) - Parallel Execution Control

Continue to Part II

Motivated by Arup Nanda's great "100 Things You Probably Didn't Know About Oracle Database" and inspired by Jonathan Lewis' "Philosophy" series, I'll start a new series about things worth mentioning and remembering. In contrast to my usual posts I'll also set myself a new challenge - trying to write as short and concisely as Jonathan does in his series. It is probably no coincidence that Charles Hooper and I contributed by far the longest chapter to "Expert Oracle Practices" :-)

So don't expect any lengthy descriptions and test cases in this series - nor do I intend to provide ground-breaking material here. All of this has probably been published several times in the past; nevertheless I believe it's worth mentioning these things again (and again...).

Here we go; the first one worth mentioning and remembering is important for those who have an Enterprise Edition license and make use of the Parallel Execution feature:

If you haven't enabled Parallel DML at session level (ALTER SESSION ENABLE/FORCE PARALLEL DML) you are not able to force parallel DML execution at statement level using explicit PARALLEL hints for the DML operation. (This one is probably well known.)

If you disable Parallel Query at session level (ALTER SESSION DISABLE PARALLEL QUERY), explicit PARALLEL hints at statement level for a query operation will overrule this and therefore still use Parallel Query. The session setting only overrules the object-level parallel settings, not the PARALLEL hint. (This is probably less well known.)
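As a minimal sketch of both points (the table names and the degree are just for illustration):

alter session enable parallel dml;

-- without the ENABLE above, the PARALLEL hint alone would not
-- result in parallel DML
insert /*+ parallel(t 4) */ into t
select * from big_table;

alter session disable parallel query;

-- despite the session setting, the explicit hint still results
-- in a parallel query
select /*+ parallel(t 4) */ count(*) from t;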

Although there is probably a rationale behind this, I think it is obvious that there is a major inconsistency here.

In particular the latter is unfortunate if you have existing code with embedded PARALLEL hints and want to ensure that execution is done serially - without any code changes - so as not to waste parallel slave processes meant to be used by other concurrent executions.

To me it would be much more intuitive and sensible if the session-level setting overruled everything else, and therefore allowed the code to be controlled via that general setting, as it does with parallel DML.

By the way, a parallel DML degree forced via ALTER SESSION FORCE PARALLEL DML PARALLEL N will still be overruled by explicit PARALLEL hints applied to the DML operation; it's just that parallel DML is simply not possible unless it is at least enabled at session level.

Actually I'm not aware of any officially supported way (which means without fiddling with undocumented parameters) of preventing, at session level, the parallel query execution of statements with embedded PARALLEL hints, other than using the Resource Manager and limiting the parallel degree to 1.
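A minimal sketch of that Resource Manager approach (the plan name and comments are mine, just for illustration; in a real system you would normally target a specific consumer group rather than OTHER_GROUPS):

begin
	dbms_resource_manager.create_pending_area;
	dbms_resource_manager.create_plan(
		plan    => 'SERIAL_PLAN',
		comment => 'Limit parallel degree to 1'
	);
	dbms_resource_manager.create_plan_directive(
		plan                     => 'SERIAL_PLAN',
		group_or_subplan         => 'OTHER_GROUPS',
		comment                  => 'Serial execution only',
		parallel_degree_limit_p1 => 1
	);
	dbms_resource_manager.submit_pending_area;
end;
/

alter system set resource_manager_plan = 'SERIAL_PLAN';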

Note that this approach again has its own quirks and drawbacks: the execution plan generated under the optimizer's assumption of parallel execution will effectively be downgraded to serial execution at runtime rather than being re-optimized for serial execution. In many cases that is not a good idea - an execution plan generated for serial execution might look quite different, and might be much more efficient when executed serially.

As a final note, if you think you can work around this by using the undocumented "_optimizer_ignore_hints" parameter, you will be in for a surprise: apparently the APPEND, PARALLEL and NO_PARALLEL hints do not get ignored, and therefore still parallelize query execution.
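Just to be explicit about what that (undocumented, unsupported - shown here so you can recognise it, not as a recommendation) experiment looks like:

alter session set "_optimizer_ignore_hints" = true;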

Addendum March 2011: One further oddity should not be left unmentioned. In the above-mentioned scenario of parallel query being disabled at SESSION level via ALTER SESSION DISABLE PARALLEL QUERY combined with the use of PARALLEL hints, the following applies:

- Potentially a plan based on parallel execution is generated by the optimizer (as outlined)
- But the costing of the operations is based on serial execution, so the costs of parallel full table scans or index scans are not scaled down according to the degree of parallelism as usual

This means that the resulting execution plans can differ from those generated in a session where parallel query is enabled - with parallel query disabled, the serial costing makes it more likely that the optimizer will favour a serial execution plan.

Nevertheless at runtime parallel execution (with the computed parallel degree) will be used if a parallel plan has been generated by the optimizer.

Dynamic Views

People forget from time to time that when you query the dynamic performance views (v$ views) you shouldn’t expect to get a read-consistent result set – and this can lead to some very strange results, especially when you start writing joins between views (for example, you may find that a session is both “not waiting” and “waiting” if you join v$session and v$session_wait).
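As a minimal illustration of the kind of join meant here (each v$ view is read at a slightly different moment, so the two state columns need not agree):

select
	s.sid, s.status, w.state, w.event
from
	v$session s,
	v$session_wait w
where
	s.sid = w.sid
;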

Here’s a visual analogy to help you remember what’s going on: “Australian Propellors?”

And a graphic to help explain the phenomenon: “Rolling Shutter Effect.”

And here, for stunning effect only, is another clip that appeared in the “related topics”.

Cunning way to thwart spammers…

I wrote the other day about the level of spam I was having on my forum. I did a little playing about and all seemed to calm down. In fact the forum got very quiet indeed. It was quite pleasant not having to worry about it.

Anyway, today I got a little suspicious as I had no posts at all. I tried to write a post myself and got a server error. It seems the last switch I flicked on the control panel tried to do something my hosting provider doesn’t allow and blocked all posts.

So my cunning plan turned out to be not so cunning after all… :(

Let’s see if the deluge starts again…

Cheers

Tim…

Getting up and running with Universal Connection Pool – Take 2

In yesterday’s post (which I actually didn’t want to post that day) I wrote about the Universal Connection Pool feature. You should be able to get started with the information I gave you, but it didn’t include any hints on how to have a look under the covers of UCP. This can be changed… Oracle includes very fine-grained logging with UCP, but experiments show that you have to use log level FINE or FINEST to get to the real information about what’s going on.

LOGGING

Tomcat defines its own logging through the java.util.logging framework (its JULI implementation), as shown in the catalina.{bat,sh} file:

set LOGGING_CONFIG=-Djava.util.logging.config.file="%CATALINA_BASE%\conf\logging.properties"

One thing you probably don’t want to do is include your own logging configuration there - any change to your application’s configuration would then require a restart of Tomcat. Think of a production environment and it becomes clear why such an approach is neither desirable nor practical.

Instead, you can put a file called logging.properties into the src directory within your application, and define logging in it to your heart’s delight. I have taken the following from the documentation and from Pas Apicella’s blog. It really opens the floodgates for very detailed logging, so don’t use this anywhere other than in a development environment…

handlers = org.apache.juli.FileHandler, java.util.logging.ConsoleHandler

org.apache.juli.FileHandler.level = ALL
org.apache.juli.FileHandler.directory = ${catalina.base}/logs
org.apache.juli.FileHandler.prefix = ucp.
org.apache.juli.FileHandler.formatter = oracle.ucp.util.logging.UCPFormatter

java.util.logging.FileHandler.level = WARNING

.level = FINEST

# FCF classes
oracle.ucp.common.FailoverEventHandlerThreadBase.level = ALL

oracle.ucp.jdbc.oracle.ONSDatabaseFailoverEvent.level = ALL
oracle.ucp.jdbc.oracle.ONSRuntimeLBEventHandlerThread.level = ALL
oracle.ucp.jdbc.oracle.ONSOracleRuntimeLBEventSubscriber.level = ALL
oracle.ucp.jdbc.oracle.ONSRuntimeLoadBalancingEvent.level = ALL
oracle.ucp.jdbc.oracle.ONSOracleFailoverEventSubscriber.level = ALL
oracle.ucp.jdbc.oracle.ONSDatabaseEventHandlerThread.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverEvent.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverEventSubscriber.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverEventImpl.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverEventNotification.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverHandler.level = ALL
oracle.ucp.jdbc.oracle.OracleFailoverablePooledConnection.level = ALL

oracle.ucp.jdbc.oracle.OraclePooledConnectionConnectionPool.level = ALL
oracle.ucp.jdbc.oracle.OracleConnectionConnectionPool.level = ALL
oracle.ucp.jdbc.oracle.OracleXAConnectionConnectionPool.level = ALL
oracle.ucp.jdbc.oracle.OracleJDBCConnectionPool.level = ALL

oracle.ucp.jdbc.oracle.OracleDatabaseInstanceInfo.level = ALL
oracle.ucp.jdbc.oracle.OracleDatabaseInstanceInfoList.level = ALL

# RCLB classes
oracle.ucp.jdbc.oracle.ONSRuntimeLBEventHandlerThread.level = ALL
oracle.ucp.jdbc.oracle.ONSOracleRuntimeLBEventSubscriber.level = ALL
oracle.ucp.jdbc.oracle.OracleRuntimeLoadBalancingHandler.level = ALL
oracle.ucp.jdbc.oracle.ONSRuntimeLoadBalancingEvent.level = ALL

Now when you redeploy your application, a new file ucp.timestamp will be created in %CATALINA_HOME%\logs, listing in all its glory what’s going on. The log file is very verbose, though.

FAST CONNECTION FAILOVER

The previous source code didn’t take full advantage of the fact that the fastConnectionFailover functionality was enabled. For the following to work you might want to disable the validateConnectionOnBorrow setting in META-INF/context.xml - in fact, we’d like to validate the connection ourselves when we get it from the pool. The code of the Hello.java servlet has changed as follows:

// time to get UCP to start
writer.println("<h2>UCP</h2>");
Connection conn = null;
oracle.ucp.jdbc.PoolDataSource ds = null;
try {
	Context ctx = new InitialContext();
	Context envContext = (Context) ctx.lookup("java:/comp/env");
	ds = (oracle.ucp.jdbc.PoolDataSource) envContext.lookup("jdbc/UCPPool");
	writer.println("Got the datasource - ");
	writer.println("FCF enabled? " + ds.getFastConnectionFailoverEnabled());
	conn = ds.getConnection();
} catch (Exception e) {
	try {
		// here's where the FCF comes in
		if (conn == null || !((ValidConnection) conn).isValid()) {
			writer.println("<p>Have to retry connection (" + e.getMessage() + ")</p>");
			OracleJDBCConnectionPoolStatistics stats =
				(OracleJDBCConnectionPoolStatistics) ds.getStatistics();
			writer.println("Pool stats: " + stats.getFCFProcessingInfo());
			if (conn != null) {
				conn.close();
			}
		} else {
			writer.println("Unknown exception: " + e);
		}
	} catch (SQLException sqle) {
		e.printStackTrace();
		return;
	}
}

try {
	writer.println("<p>Connected to Oracle instance</p>");
	Statement stmt = conn.createStatement();
	ResultSet rs = stmt.executeQuery(
		"select 'Hello World from '||sys_context('userenv','instance_name') from dual");
	while (rs.next()) {
		writer.println("<p>" + rs.getString(1) + "</p>");
	}

	rs.close();
	stmt.close();
	conn.close();

} catch (Exception e) {
	writer.println("<p>" + e + "</p>");
}

writer.println("");
writer.println("");
}

I will probably attach the code to the post as the layout is pretty broken.

IS THE POOL INTELLIGENT?

To see if the runtime load balancing feature works at all, I shut down the second of my two database instances before restarting the application. I expect all sessions to be on the first instance (inevitably, since there is no second one). Is that so?

SQL> ed
Wrote file afiedt.buf

 1  select count(inst_id),inst_id,
 2  status from gv$session where username='SCOTT'
 3* group by inst_id, status
SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 58          1 INACTIVE

Now what happens when I start the second instance?

SQL> !srvctl start instance -d lnrut1d -i LNRUT1D_2

SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 69          1 INACTIVE

SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 57          1 INACTIVE
 19          2 INACTIVE

SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 25          1 INACTIVE
 45          2 INACTIVE

That’s neat - the second instance catches up, and I eventually ended up near equilibrium. The inverse is also true. Consider two instances up, as shown here:

SQL> select count(inst_id),inst_id,
 2  status from gv$session where username='SCOTT'
 3  group by inst_id, status
 4  /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 21          1 INACTIVE
 26          2 INACTIVE
 1          2 ACTIVE
 2          1 ACTIVE

SQL> !srvctl stop instance -d lnrut1d -i LNRUT1D_2 -o abort

SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 36          1 INACTIVE

With the second instance shut down, the number of sessions on the first node continues to go up:


SQL> /

COUNT(INST_ID)    INST_ID STATUS
-------------- ---------- --------
 65          1 INACTIVE

I like this!

A short podcast...

We've just posted a short podcast with more information about the Real World Performance days I'm participating in. We are starting on the west coast of the US and will be presenting across the country over the next few months - eventually ending up in the UK.

For more information or to sign up - go to http://www.ioug.org/ and select Events -> A day of Real World Performance from the menu.
Hope to see you there!

A question came

A question came up on my blog entry http://arup.blogspot.com/2011/01/more-on-interested-transaction-lists.html. I think the question warrants a more comprehensive explanation rather than an answer of a few lines, so I decided to write another blog post.

Here was the question:

Could you please explain on the scenario when multiple transactions try to update the same row as well. Will there be any ITL allocated? Yes, I am talking about the real locking scenario.

Paraphrased differently, the reader wants to know what would happen when this series of events occurs:

Upgrade issues

Here’s an example of how a bug-fix can create problems. It’s a code change in 11.2.x.x and (I believe) 10.2.0.5 relating to the costing of queries involving (but perhaps not restricted to) composite partitioned tables. I first saw this change in an email from Doug Burns, who sent me the 10053 traces from a very simple query that had started using the wrong index after an upgrade from 10.2.0.4 to 11.2.0.2.

As part of his testing he had set the optimizer_features_enable parameter back to 10.2.0.4 and found that not only did the choice of index change back to the expected index, but the costs of the two indexes changed dramatically. (The cost of using the “right” index changed from 15 to something in excess of 9,000 on the upgrade!)

The example I’m going to show you demonstrates the principles of the cost change – but with the data set and statistics I’ve generated you won’t see a change of execution plan. This is typical of some of the models I create – it’s enough to establish a principle, after which it’s simple enough to recognise the problems that the principle can cause.

So here’s a data set. I created it in a default 11.2.0.2 install on Windows 32-bit, so it’s running with 8KB blocks, autoallocate extents, and ASSM – but I’ve disabled CPU costing:

create table t1(
	list_col	number,
	range_col	number,
	v100		varchar2(10),
	v10		varchar2(10),
	vid		varchar2(10),
	padding		varchar2(100)
)
partition by range(range_col)
subpartition by list (list_col)
subpartition template (
	subpartition s0		values(0),
	subpartition s1		values(1),
	subpartition s2		values(2),
	subpartition s3		values(3),
	subpartition s4		values(4),
	subpartition s5		values(5),
	subpartition s6		values(6),
	subpartition s7		values(7),
	subpartition s8		values(8),
	subpartition s9		values(9)
)
(
	partition p_10 values less than (10),
	partition p_20 values less than (20),
	partition p_30 values less than (30),
	partition p_40 values less than (40),
	partition p_50 values less than (50),
	partition p_60 values less than (60),
	partition p_70 values less than (70),
	partition p_80 values less than (80),
	partition p_90 values less than (90),
	partition p_100 values less than (100)
)
;

insert into t1
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 10000
)
select
	mod(rownum,10),
	mod(rownum,100),
	lpad(mod(rownum,10),10),
	lpad(mod(rownum,100),10),
	lpad(rownum,10),
	rpad('x',100)
from
	generator	v1,
	generator	v2
where
	rownum <= 1000000
;

alter table t1 add constraint t1_pk primary key(v10, vid, v100, range_col, list_col) using index local;

create index t1_one_col on t1(v100) local;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'

	);
end;
/

You’ll notice that one of the indexes I’ve created has only one column, while the other has five columns (including the single column of the smaller index). Since I’ve created only 1,000,000 rows spread across 100 partitions, every partition is small and the corresponding index partitions are also small, so I’m going to use dbms_stats.get_index_stats() and dbms_stats.set_index_stats() to make the indexes appear much larger. Specifically I will set the blevel on the single-column index to 2, and the blevel on the multi-column index to 3. (This difference in settings isn’t just whimsy; it helps to emulate Doug’s problem.) I’ve previously published the type of code to make this possible; in this case I only set the table-level stats because the queries I’m going to use will trigger Oracle to use just the table-level stats.
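Here, for reference, is a minimal sketch of that type of code for the single-column index (the variable names are mine, and the published version is more general):

declare
	m_numrows	number;
	m_numlblks	number;
	m_numdist	number;
	m_avglblk	number;
	m_avgdblk	number;
	m_clstfct	number;
	m_indlevel	number;
begin
	dbms_stats.get_index_stats(
		ownname	 => user,
		indname	 => 'T1_ONE_COL',
		numrows	 => m_numrows,
		numlblks => m_numlblks,
		numdist	 => m_numdist,
		avglblk	 => m_avglblk,
		avgdblk	 => m_avgdblk,
		clstfct	 => m_clstfct,
		indlevel => m_indlevel
	);
	dbms_stats.set_index_stats(
		ownname	 => user,
		indname	 => 'T1_ONE_COL',
		numrows	 => m_numrows,
		numlblks => m_numlblks,
		numdist	 => m_numdist,
		avglblk	 => m_avglblk,
		avgdblk	 => m_avgdblk,
		clstfct	 => m_clstfct,
		indlevel => 2		-- fake the blevel (use 3 for T1_PK)
	);
end;
/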

select
	/*+ index(t1(v100)) */
	count(*)
from
	t1
where
	v10 = '0000000005'
and	v100 = '0000000005'
;

select
	/*+ index(t1(v10, vid, v100)) */
	count(*)
from
	t1
where
	v10 = '0000000005'
and	v100 = '0000000005'
;

I’m going to show you two sets of execution plans: the first is where I’ve set optimizer_features_enable to '10.2.0.4', the second is where I’ve left it at its default.

10.2.0.4 execution plans:
===================
---------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |            |     1 |    22 |  2350 |       |       |
|   1 |  SORT AGGREGATE                      |            |     1 |    22 |       |       |       |
|   2 |   PARTITION RANGE ALL                |            |  1000 | 22000 |  2350 |     1 |    10 |
|   3 |    PARTITION LIST ALL                |            |  1000 | 22000 |  2350 |     1 |    10 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| T1         |  1000 | 22000 |  2350 |     1 |   100 |
|*  5 |      INDEX RANGE SCAN                | T1_ONE_COL |   100K|       |   330 |     1 |   100 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("V10"='0000000005')
   5 - access("V100"='0000000005')

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost  | Pstart| Pstop |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    22 |   100 |       |       |
|   1 |  SORT AGGREGATE      |       |     1 |    22 |       |       |       |
|   2 |   PARTITION RANGE ALL|       |  1000 | 22000 |   100 |     1 |    10 |
|   3 |    PARTITION LIST ALL|       |  1000 | 22000 |   100 |     1 |    10 |
|*  4 |     INDEX RANGE SCAN | T1_PK |  1000 | 22000 |   100 |     1 |   100 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("V10"='0000000005' AND "V100"='0000000005')
       filter("V100"='0000000005')

11.2.0.2 execution plans:
===================
Execution Plan
----------------------------------------------------------
Plan hash value: 3019183742

---------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name       | Rows  | Bytes | Cost  | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |            |     1 |    22 |  2530 |       |       |
|   1 |  SORT AGGREGATE                      |            |     1 |    22 |       |       |       |
|   2 |   PARTITION RANGE ALL                |            |  1000 | 22000 |  2530 |     1 |    10 |
|   3 |    PARTITION LIST ALL                |            |  1000 | 22000 |  2530 |     1 |    10 |
|*  4 |     TABLE ACCESS BY LOCAL INDEX ROWID| T1         |  1000 | 22000 |  2530 |     1 |   100 |
|*  5 |      INDEX RANGE SCAN                | T1_ONE_COL |   100K|       |   510 |     1 |   100 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter("V10"='0000000005')
   5 - access("V100"='0000000005')

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost  | Pstart| Pstop |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |       |     1 |    22 |   370 |       |       |
|   1 |  SORT AGGREGATE      |       |     1 |    22 |       |       |       |
|   2 |   PARTITION RANGE ALL|       |  1000 | 22000 |   370 |     1 |    10 |
|   3 |    PARTITION LIST ALL|       |  1000 | 22000 |   370 |     1 |    10 |
|*  4 |     INDEX RANGE SCAN | T1_PK |  1000 | 22000 |   370 |     1 |   100 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("V10"='0000000005' AND "V100"='0000000005')
       filter("V100"='0000000005')

Regardless of the setting of optimizer_features_enable, if I hint the single-column index I have a plan that visits each partition and subpartition in turn for the index then table; and if I hint the primary key I do that same walk through each partition and subpartition and visit just the index.

The big difference is in the cost. By default, both execution plans cost more in 11.2.0.2 – but the odd thing is that the change in costs is different. The cost of using the single-column index has gone up by 180, the cost of using the primary key index has gone up by 270; and the change in costs can be attributed completely to the indexes.

Is there a pattern in the change? Yes – it’s not immediately obvious, of course, and needs a few extra tests to confirm it, but the change in cost can be calculated as: (number of range partitions * (number of list partitions – 1) * blevel of index).

    For the single column index this is: 10 * 9 * 2 = 180
    For the primary key index this is: 10 * 9 * 3 = 270

Once you’ve seen this change in the arithmetic you don’t need to see an example of a change in plan actually happening. In certain circumstances the cost of using composite partitions goes up when you upgrade to 11.2 – and the change in cost is proportional to the number of data segments (subpartitions) and the blevel of the indexes.

In Doug’s case he had a table with roughly 4,000 subpartitions in total – and he had a single column index with a blevel of 2, and a multi-column index with a blevel of 3: so the cost of using the (smaller, but less efficient) index went up by about 8,000 and the cost of using the (larger but efficient) index went up by about 12,000. The difference of 4,000 between these two increases was far higher than the original cost of using the bad index – so Oracle switched from the good index to the bad index.

The nasty thing about this problem is that it’s correcting an error – 11.2 is allowing for the cost of probing every single index subpartition, which is what the extra multiples of the blevel represent; unfortunately the optimizer’s underlying paradigm of “every block visit is a disk I/O” makes this correction a threat.

Solutions and Workarounds

There is an /*+ optimizer_features_enable('x.x.x.x') */ hint that could be used if the problem applies to just a couple of queries.
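For example, a minimal sketch against the demo query (using the version that applies in this case):

select
	/*+ optimizer_features_enable('10.2.0.4') index(t1(v10, vid, v100)) */
	count(*)
from
	t1
where
	v10 = '0000000005'
and	v100 = '0000000005'
;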

If the problem applies to classes of queries involving a couple of big tables and indexes you could use dbms_stats.set_index_stats() to adjust the blevel of the critical indexes.

If the problem appears all over the place you could set the optimizer_features_enable parameter to '10.2.0.4' in the parameter file, or in a session logon trigger.
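A minimal sketch of the logon trigger approach (the trigger name is mine, and in practice you would probably restrict it to the affected application users rather than the whole database):

create or replace trigger set_ofe_on_logon
after logon on database
begin
	execute immediate
		q'[alter session set optimizer_features_enable = '10.2.0.4']';
end;
/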

If the problem appears all over the place, but there are other features of the 11.2.0.2 optimizer that are very helpful you could take advantage of the “_fix_control” parameter – after talking to Oracle support. This particular problem comes under bug fix 7132684, which appears in v$system_fix_control under 10.2.0.5, with the description “Fix costing for non prefixed local index”. If you’ve got the right symptoms, then the following statement will bypass the problem:

alter session set "_fix_control"='7132684:OFF';

Final Warning:

This note comes from 30 minutes looking at Doug’s trace files, one hour building a test case, and an hour spent writing this blog. It is not a complete determination of the problem, just an initial outline of what can happen and the symptoms to watch out for, posted as an early warning to help a few people save some time. Do not treat this note as a definitive reference.