Tuning

Video

Here’s a video of the presentation that I did as the keynote for the Turkish Oracle Users’ Group conference earlier on this year. Recording courtesy of the people at Formspider.

The topic is about tuning a two-table join, and the video is about an hour long (so there’s a lot more that could be  said on the topic, and one day I might do a half  day “masterclass” on it).

Optimisation

A question came up on Oracle-L recently about the difference in work done by the following two queries:

SELECT /*+ RULE */
	DOM_NAME
FROM
	DOMAINS,
	TABLE(CAST(:B1 AS DOMAIN_LIST)) DL
WHERE
	DOM_NAME = DL.COLUMN_VALUE
;

SELECT
	DOM_NAME
FROM
	DOMAINS
WHERE
	DOM_NAME IN (
		SELECT	COLUMN_VALUE
		FROM	TABLE(CAST(:B1 AS  DOMAIN_LIST))
	)
;

Before saying anything else, I should point out that these two queries are NOT logically equivalent unless you can guarantee that the table() operator returns a unique set of values – and Oracle doesn’t allow uniqueness to be enforced on collections.

However, the author of the question had ensured that the results from the table() operator contained no duplicates so the two versions of the query returned the same result set. The question is, why does one query do fewer buffer visits than the other – as evidenced by the results from sql_trace and tkprof.

Rule-based Join
===============

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.86       0.86          0     200047          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.87       0.87          0     200047          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS  (cr=200047 pr=0 pw=0 time=6355 us)
  99704   COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=284 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=200047 pr=0 pw=0 time=3191 us)(object id 54309)

CBO with subquery
=================

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.78       0.78          0     157986          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.80       0.80          0     157986          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS (cr=157986 pr=0 pw=0 time=4720 us cost=536 size=6141 card=267)
  99704   SORT UNIQUE (cr=0 pr=0 pw=0 time=417 us)
  99704    COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=259 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=157986 pr=0 pw=0 time=3353 us cost=2 size=21 card=1)(object id 54309)

Notice how the optimizer has unnested the subquery and introduced a “sort unique” – this demonstrates the optimizer’s recognition that the two queries are not logically identical. However, the number of rows from the collection is the same (99,704) and the number of rows after joining is the same (115,195) – the data sets have been rigged so that it is a fair comparison. So why does the explicit join take 200,047 buffer visits when the transformed  subquery approach take only 157,986 buffer visits.

The answer is that you can visit buffers without doing “logical I/Os” – and this benefit accrues most frequently to indexed access paths. Thanks to the way Oracle has sorted the collection before doing the join the nested loop follows the second index (dom_name_idx) in order – this increases the chance that index blocks that were used in the previous cycle of the loop will be re-used in the next cycle, which means that Oracle need not release the pins on the index blocks, but can revisit them incrementing the statitsic: “buffer is pinned count”.

I have to say that I haven’t proved that this is exactly what was happening, but when I suggested to the person who had asked the question that he re-run and check to see if the decrease in “session logical reads” was balanced by an increase in “buffer is pinned count” the answer seemed to confirm the hypothesis. [Correction (see comment #1 from Randolf): the results didn't confirm the hypothesis - but I think that' s because of the change in "pinning" that Randolf describes, so I'll have to find a way to confirm the hypothesis some other time.]

 

Star Transformation – 2

After publishing my note about emulating star transformations when there was something blocking the optimizer’s ability to do them automatically I ended up having an interesting email conversation with Mark Farnham about variations on the theme, concluding with a completely different (and possibly more intuitive) SQL statement to achieve the same aim.

You will recall that I started off with the following statement:

select
	ord.*
from
	products	prd,
	customers	cst,
	orders		ord
where
	prd.grp = 50
and	cst.grp = 50
and	ord.id_prd = prd.id
and	ord.id_cst = cst.id
;

Although this specific example is so simple that the optimizer could turn it into a star transformation automatically, I used it to demonstrate a “minimum work” version of the statement as follows (with the execution plan from 11.1.0.6):

select
	ord.*
from
	(
	select
		/*+
			leading(prd ord)
			use_nl(ord)
			no_merge
		*/
		ord.rowid 	prid
	from
		products	prd,
		orders		ord
		where
		prd.grp = 50
	and	ord.id_prd = prd.id
		)	prid,
	(
	select
		/*+
			leading(cst ord)
			use_nl(ord)
			no_merge
		*/
		ord.rowid 	crid
	from
		customers	cst,
		orders		ord
	where
		cst.grp = 50
	and	ord.id_cst = cst.id
	)	crid,
	orders	ord
where
	prid.prid = crid.crid
and	ord.rowid = prid.prid
;

--------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |    11 |  1650 |   570 |
|   1 |  NESTED LOOPS               |            |    11 |  1650 |   570 |
|*  2 |   HASH JOIN                 |            |    11 |   264 |   559 |
|   3 |    VIEW                     |            |  3361 | 40332 |   277 |
|   4 |     NESTED LOOPS            |            |  3361 | 87386 |   277 |
|*  5 |      TABLE ACCESS FULL      | CUSTOMERS  |    98 |   882 |    81 |
|*  6 |      INDEX RANGE SCAN       | ORD_CST_FK |    34 |   578 |     2 |
|   7 |    VIEW                     |            |  3390 | 40680 |   281 |
|   8 |     NESTED LOOPS            |            |  3390 | 88140 |   281 |
|*  9 |      TABLE ACCESS FULL      | PRODUCTS   |   100 |   900 |    81 |
|* 10 |      INDEX RANGE SCAN       | ORD_PRD_FK |    34 |   578 |     2 |
|  11 |   TABLE ACCESS BY USER ROWID| ORDERS     |     1 |   126 |     1 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PRID"."PRID"="CRID"."CRID")
   5 - filter("CST"."GRP"=50)
   6 - access("ORD"."ID_CST"="CST"."ID")
   9 - filter("PRD"."GRP"=50)
  10 - access("ORD"."ID_PRD"="PRD"."ID")

Go back to the original SQL statement, though. The select list contains columns from just the orders table, and there’s a fairly well-known suggestion (or possibly guideline) that “tables that are not in the select list should not appear in the from clause, they should appear in subqueries in the where clause”. It’s not difficult to come up with examples where this strategy is a bad idea – but it’s often worth thinking through the consequences of trying to apply it. (Of course, you often find that after you’ve rewritten your SQL to match the strategy the optimizer transforms it back into the join that you had been avoiding.)

Since the products and customers tables don’t appear in the select list, can I find a way of rewriting my statement with subqueries ? The answer is yes. Here’s the SQL, with the execution plan I got.

select
	ord.*
from
	orders ord
where
	ord.rowid in (
		select
			/*+
				no_use_hash_aggregation
			*/
			prid.prid
		from
			(
			select	/*+ no_merge */
				ord.rowid 	prid
			from
				products	prd,
				orders		ord
				where
				prd.grp = 50
			and	ord.id_prd = prd.id
			)	prid,
			(
			select	/*+ no_merge */
				ord.rowid 	crid
			from
				customers	cst,
				orders		ord
			where
				cst.grp = 50
			and	ord.id_cst = cst.id
			)	crid
		where
			prid.prid = crid.crid
	)
;

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     1 |   138 |   591   (1)| 00:00:08 |
|   1 |  NESTED LOOPS               |            |     1 |   138 |   591   (1)| 00:00:08 |
|   2 |   VIEW                      | VW_NSO_1   |    11 |   132 |   589   (1)| 00:00:08 |
|   3 |    SORT UNIQUE              |            |     1 |   264 |            |          |
|*  4 |     HASH JOIN               |            |    11 |   264 |   589   (1)| 00:00:08 |
|   5 |      VIEW                   |            |  3314 | 39768 |   294   (1)| 00:00:04 |
|   6 |       NESTED LOOPS          |            |  3314 | 86164 |   294   (1)| 00:00:04 |
|*  7 |        TABLE ACCESS FULL    | PRODUCTS   |   100 |   900 |    94   (2)| 00:00:02 |
|*  8 |        INDEX RANGE SCAN     | ORD_PRD_FK |    33 |   561 |     2   (0)| 00:00:01 |
|   9 |      VIEW                   |            |  3314 | 39768 |   294   (1)| 00:00:04 |
|  10 |       NESTED LOOPS          |            |  3314 | 86164 |   294   (1)| 00:00:04 |
|* 11 |        TABLE ACCESS FULL    | CUSTOMERS  |   100 |   900 |    94   (2)| 00:00:02 |
|* 12 |        INDEX RANGE SCAN     | ORD_CST_FK |    33 |   561 |     2   (0)| 00:00:01 |
|  13 |   TABLE ACCESS BY USER ROWID| ORDERS     |     1 |   126 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("PRID"."PRID"="CRID"."CRID")
   7 - filter("PRD"."GRP"=50)
   8 - access("ORD"."ID_PRD"="PRD"."ID")
  11 - filter("CST"."GRP"=50)
  12 - access("ORD"."ID_CST"="CST"."ID")

You’ll notice that this plan is remarkably similar to the plan I got from the joins with inline views – although the new plan has one extra “sort unique” operation at line 3. The optimizer has transformed my query by unnesting the subqueries and turning them into inline views – adding in a “distinct” operation because that’s what happens when you turn an “IN subquery” into a join when there isn’t a suitable uniqueness constraint on the join column.

There are two reasons for adopting this subquery approach. There’s the (highly subjective) argument that the code is a little easier to understand in this form, especially if you then want to join the orders table onwards to other tables. There’s also an objective argument relating to the “sort unique”. Note that I included the hint “no_use_hash_aggregation” to stop Oracle from using a “hash unique” operation at this point. By forcing Oracle to sort for uniqueness I know that the rowids will appear in (at least) an “extent-based” order rather than a completely random order. It’s possible that walking the fact table in physical order will be a little more efficient than accessing it in a completely randomised order. The former might find multiple rows in a block or benefit from some form of O/S or SAN read-ahead; the latter is more likely to turn one row into one random physical read.

One final thought – I said that you might want to take this type of approach for queries where the optimizer can’t do a star transformation automatically. There’s one very important case where this is always true – when you’re running Standard Edition, which doesn’t support bitmap indexes. So if you’re running SE and want to do star transformations – you now know how.

Footnote: I haven’t spent much time testing the limits of variations of this code although I did spend 30 minutes extending the example to use three dimension tables. If you think that this strategy might solve a production problem, think about using the sample code to create and test a better model of your production requirement before rewriting the production code – and if you go to production, check very carefully that the resulting code does produce the same results.

Star Transformation

A little while ago I published a note explaining how it was possible to find queries which ran faster if you manually de-coupled the index and table accesses. Here’s a further example that came up in discussion on a client site recently. The query looks something like this (at least in concept, although it was a little more complex, with some messy bits around the edges):

select
	ord.*
from
	products	prd,
	customers	cst,
	orders		ord
where
	prd.grp = 50
and	cst.grp = 50
and	ord.id_prd = prd.id
and	ord.id_cst = cst.id
;

There are indexes on products(id), customers(id) and orders(id), as well as indexes on orders(id_prd) and orders(id_cst). Basically it’s a collection of primary key and “foreign key” indexes – except there are no defined constraints and none of the indexes is unique.

The production “orders” table is very large (hundreds of millions of rows). There are lots of customers – and the orders for each customer are scattered throughout the entire orders table; similarly there are lots of products, and the orders for each product are scattered throughout the orders table. What execution plans (with the right indexes, of course) might you get for this query. There are two obvious possibilities:

-----------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes | Cost  |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |    11 |  1584 |   492 |
|*  1 |  HASH JOIN          |           |    11 |  1584 |   492 |
|*  2 |   TABLE ACCESS FULL | CUSTOMERS |   100 |   900 |    14 |
|*  3 |   HASH JOIN         |           |  3314 |   436K|   477 |
|*  4 |    TABLE ACCESS FULL| PRODUCTS  |   100 |   900 |    14 |
|   5 |    TABLE ACCESS FULL| ORDERS    |  1000K|   120M|   446 |
-----------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("ORD"."ID_CST"="CST"."ID")
   2 - filter("CST"."GRP"=50)
   3 - access("ORD"."ID_PRD"="PRD"."ID")
   4 - filter("PRD"."GRP"=50)

-----------------------------------------------------------------------------
| Id  | Operation                      | Name       | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |            |    11 |  1584 | 10223 |
|   1 |  NESTED LOOPS                  |            |       |       |       |
|   2 |   NESTED LOOPS                 |            |    11 |  1584 | 10223 |
|   3 |    NESTED LOOPS                |            |  3314 |   436K|  3614 |
|*  4 |     TABLE ACCESS FULL          | PRODUCTS   |   100 |   900 |    14 |
|   5 |     TABLE ACCESS BY INDEX ROWID| ORDERS     |    33 |  4158 |    36 |
|*  6 |      INDEX RANGE SCAN          | ORD_PRD_FK |    33 |       |     2 |
|*  7 |    INDEX RANGE SCAN            | CST_I1     |     1 |       |     1 |
|*  8 |   TABLE ACCESS BY INDEX ROWID  | CUSTOMERS  |     1 |     9 |     2 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("PRD"."GRP"=50)
   6 - access("ORD"."ID_PRD"="PRD"."ID")
   7 - access("ORD"."ID_CST"="CST"."ID")
   8 - filter("CST"."GRP"=50)

The first option is to hash the two dimension tables into memory then probe each in turn after a tablescan of the orders table – and this could be a very effective choice in some cases; the second option is to drive from one of the dimension tables into the orders table, picking up a relatively large number of rows, then discard a large fraction of those rows by indexing (or possibly doing a hash join) into the second dimension table. (Obviously the roles of products and customers could be reversed in the nested loop plan I’ve shown).

Both plans are likely to be rather expensive – a full scan on orders could be a huge amount of I/O in the form of “db file scattered read” waits; the indexed access path from one dimension to the orders table could be a huge amount of I/O in the form of “db file sequential read” wait collecting rows we won’t eventually need. In the correct circumstances (with the right data patterns) then, we might like to see a star transformation. But there are two possible problems:

  • It is “common knowledge” that you need primary keys on the dimension tables to do a star transformation. (In fact I was sure of this until about 30 minutes ago but then I discovered I was wrong.)
  • It is also “common knowledge” that you need bitmap indexes on the fact table to do a star transformation. Again this is wrong (but fortunately I’ve known about that for years).

So if we enable star transformations (alter session set star_transformation_enabled=true) we can get a plan that looks like this:

----------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                       | Rows  | Bytes | Cost  |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                            |     1 |   136 |   102 |

|   1 |  TEMP TABLE TRANSFORMATION            |                            |       |       |       |
|   2 |   LOAD AS SELECT                      | SYS_TEMP_0FD9D6601_17047D9 |       |       |       |
|*  3 |    TABLE ACCESS FULL                  | CUSTOMERS                  |   100 |   900 |    14 |
|   4 |   LOAD AS SELECT                      | SYS_TEMP_0FD9D6601_17047D9 |       |       |       |
|*  5 |    TABLE ACCESS FULL                  | PRODUCTS                   |   100 |   900 |    14 |

|*  6 |   HASH JOIN                           |                            |     1 |   136 |    74 |
|*  7 |    HASH JOIN                          |                            |     1 |   131 |    71 |

|   8 |     TABLE ACCESS BY INDEX ROWID       | ORDERS                     |    11 |  1400 |    68 |
|   9 |      BITMAP CONVERSION TO ROWIDS      |                            |       |       |       |
|  10 |       BITMAP AND                      |                            |       |       |       |

|  11 |        BITMAP MERGE                   |                            |       |       |       |
|  12 |         BITMAP KEY ITERATION          |                            |       |       |       |
|  13 |          TABLE ACCESS FULL            | SYS_TEMP_0FD9D6600_17047D9 |     1 |    13 |     2 |
|  14 |          BITMAP CONVERSION FROM ROWIDS|                            |       |       |       |
|* 15 |           INDEX RANGE SCAN            | ORD_CST_FK                 |       |       |     3 |

|  16 |        BITMAP MERGE                   |                            |       |       |       |
|  17 |         BITMAP KEY ITERATION          |                            |       |       |       |
|  18 |          TABLE ACCESS FULL            | SYS_TEMP_0FD9D6601_17047D9 |     1 |    13 |     2 |
|  19 |          BITMAP CONVERSION FROM ROWIDS|                            |       |       |       |
|* 20 |           INDEX RANGE SCAN            | ORD_PRD_FK                 |       |       |     3 |

|  21 |     TABLE ACCESS FULL                 | SYS_TEMP_0FD9D6600_17047D9 |   100 |   500 |     2 |
|  22 |    TABLE ACCESS FULL                  | SYS_TEMP_0FD9D6601_17047D9 |   100 |   500 |     2 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("CST"."GRP"=50)
   5 - filter("PRD"."GRP"=50)
   6 - access("ORD"."ID_PRD"="C0")
   7 - access("ORD"."ID_CST"="C0")
  15 - access("ORD"."ID_CST"="C0")
  20 - access("ORD"."ID_PRD"="C0")

In this example you can see that Oracle has decided to use a hash join to do a “join-back” to the dimension tables and has decided to extract the subset of rows (grp = 50) from each of those tables and save them in a global temporary. This join-back is a little odd since the query isn’t selecting any date from those two tables so we shouldn’t need to do it – but I think that may be a side effect of not having the primary key declared.

I’ve broken the plan up a little bit to make it a little easier to see the critical steps:

  • Lines 11 – 15 we select the index ranges we want from the customer index on orders, convert to bitmaps and merge.
  • Lines 16 – 20 we do the same for the products index on orders
  • Lines 8 – 10 we do the “bitmap and” between the two constructed bit strings, convert to rowids, and pick up exactly the rows we want
  • Lines 6,7,21,22 show two hash joins to the temporary tables which represent the subset of rows we have from customers and products
  • Lines 1 – 5 are where we started by extracting the two subsets into a form of internalised global temporary table.

If this doesn’t work well enough for you – and, allowing for a few little tweaks here and there, it really should – or if there is something about your code that makes it impossible for Oracle to do a star transformation, here’s an example of taking total control by restructuring the SQL to do a similar operation:

select
	ord.*
from
	(
	select
		/*+
			leading(prd ord)
			use_nl(ord)
			no_merge
		*/
		ord.rowid 	prid
	from
		products	prd,
		orders		ord
		where
		prd.grp = 50
	and	ord.id_prd = prd.id
		)	prid,
	(
	select
		/*+
			leading(cst ord)
			use_nl(ord)
			no_merge
		*/
		ord.rowid 	crid
	from
		customers	cst,
		orders		ord
	where
		cst.grp = 50
	and	ord.id_cst = cst.id
	)	crid,
	orders	ord
where
	prid.prid = crid.crid
and	ord.rowid = prid.prid

With the no_merge hints we produce two result sets: prid - the rowids from the orders table for all rows which match the products we are interested in; and crid – the rowids from the orders table for all rows which match the customers we are interested in.

We then do a join between crid and prid – which has to leave us with just those rowids for the rows which represent the customers we’re interested in who have placed orders for products we’re interested in. So we can then do a join, by rowid, to the orders table to collect the data for those orders. Here’s the plan:

--------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost  |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |    11 |  1650 |   570 |
|   1 |  NESTED LOOPS               |            |    11 |  1650 |   570 |
|*  2 |   HASH JOIN                 |            |    11 |   264 |   559 |
|   3 |    VIEW                     |            |  3361 | 40332 |   277 |
|   4 |     NESTED LOOPS            |            |  3361 | 87386 |   277 |
|*  5 |      TABLE ACCESS FULL      | CUSTOMERS  |    98 |   882 |    81 |
|*  6 |      INDEX RANGE SCAN       | ORD_CST_FK |    34 |   578 |     2 |
|   7 |    VIEW                     |            |  3390 | 40680 |   281 |
|   8 |     NESTED LOOPS            |            |  3390 | 88140 |   281 |
|*  9 |      TABLE ACCESS FULL      | PRODUCTS   |   100 |   900 |    81 |
|* 10 |      INDEX RANGE SCAN       | ORD_PRD_FK |    34 |   578 |     2 |
|  11 |   TABLE ACCESS BY USER ROWID| ORDERS     |     1 |   126 |     1 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("PRID"."PRID"="CRID"."CRID")
   5 - filter("CST"."GRP"=50)
   6 - access("ORD"."ID_CST"="CST"."ID")
   9 - filter("PRD"."GRP"=50)
  10 - access("ORD"."ID_PRD"="PRD"."ID")

The SQL has obviously become more complex – and complexity is generally something to be wary of as it’s an easy step from “complex” to “wrong”. So don’t try something like this unless you’re sure it’s necessary, and then document what you’re doing very carefully.

Footnote:

If you want to play around with this example, here’s the code to generate the tables. I was using my standard LMT, 1MB uniform extents, 8KB block size, freelist management model. The database was version 11.1.0.6 and CPU costing was disabled.

create table products
as
select
	rownum			id,
	mod(rownum,300)		grp,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	all_objects
where
	rownum <= 30000;

create index prd_i1 on products(id);
-- alter table products add constraint prd_pk primary key(id);

create table customers
as
select
	rownum			id,
	mod(rownum,300)		grp,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	all_objects
where
	rownum <= 30000;

create index cst_i1 on customers(id);
-- alter table customers add constraint cst_pk primary key(id);

create table orders
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 1000)
select
	rownum			id,
	trunc(dbms_random.value(1,30000))	id_prd,
	trunc(dbms_random.value(1,30000))	id_cst,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum  <= 1000000;

create index ord_prd_fk on orders(id_prd);
create index ord_cst_fk on orders(id_cst);

Reminder and Public Appearances 2011

First, a reminder – my Advanced Oracle Troubleshooting v2.0 online seminar starts next week already. Last chance to sign up, I can accept registrations until Sunday :-)

I won’t do another AOT seminar before Oct (or Nov) this year. More details and sign-up here:

I have rescheduled my Advanced SQL Tuning and Partitioning & Parallel Execution for Performance seminars too. I will do them in September/October. Unfortunately I’m too busy right now to do them before the summer.

Public Appearances:

  • I will be speaking at the UKOUG Exadata Special Event in London on 18th April
  • I have submitted a few papers for Oracle OpenWorld in San Francisco as well (end of Sep/beginning of Oct), all about Exadata. Let’s see how it goes, but I’ll be there anyway, which means that I’ll probably show up at the Oracle Closed World event too!

And that’s all the travel I will do this year…

Virtual Conferences:

I’ll soon announce the 2nd EsSN virtual conference too ;-)

Free online stuff:

Perhaps in a month or so I will do another hacking session (I’ll plan 2 hours this time, 1 hour isn’t nearly enough for going deep). The topic will probably be about low-level details of SQL plan execution internals… stay tuned!

Latch contention troubleshooting case study and Flashback Database performance issues with LOBs

Steve Bamber has written up a case study of library cache latch contention troubleshooting of an Apex application with LatchProf. I’m happy that others also see the value and have had success with my new LatchProf based latch contention troubleshooting approach which takes into account both sides of the contention story (latch waiters and latch holders/blockers) as opposed to the guesswork used previously (hey if it’s shared pool latch contention – is must be about bad SQL not using bind variables …. NOT always…)

Anyway, I’m happy. If you have success stories with LatchProf, please let me know!

As a second topic of interest, Laimutis Nedzinskas has written some good notes about the effect and overhead of Flashback Database option when you are using and modifying (nocache) LOBs. We’ve exchanged some mails on this topic and yeah, my clients have sure seen some problems with this combination as well. You basically want to keep your LOBs cached when using FB database…

MOATS: The Mother of All Tuning Scripts!

People talk about the Oracle SQL Developer 3 being out, which is cool, but I have something even cooler for you today ;-)

I finally figured out how to convert my screen-recordings to uploadable videos, so that the text wouldn’t get unreadable and blurry.

So, here’s the first video, about a tool called MOATS, which we have built together with fellow OakTable Network member and a PL/SQL wizard Adrian Billington (of oracle-developer.net).

Here’s the video, it’s under 3 minutes long. Play the video in full screen for best results (and if it’s too slow loading, change it to lower resolution from HD mode):

Check it out and if you like MOATS, you can download it from Adrian’s website site (current version 1.05) and make sure you read the README.txt file in the zip!

Also thanks to Randolf Geist for finding and fixing some bugs in our alpha code… Note that MOATS is still kind of beta right now…

P.S. I will post my ORA-4031 and shared pool hacking video real soon now, too! :-)

P.P.S. Have you already figured out how it works?! ;-)

Update: Now you can suggest new features and improvement requests here:

An index of my TPT scripts

A lot of people have asked me whether there’s some sort of index or “table of contents” of my TPT scripts (there’s over 500 scripts in the tpt_public.zip file – http://tech.e2sn.com/oracle-scripts-and-tools )

I have planned to create such index for years, but never got to it. I probably never will :) So a good way to extract the descriptions of some scripts is this (run the command in the directory where you extracted my scripts to):

$ grep -i Purpose: *.sql | awk -F: ‘{ printf(“%20s %-50s\n”, $1, $3) }’
            bhla.sql      Report which blocks are in buffer cache, protected by a cache
         bufprof.sql      Display buffer gets done by a session and their reason
            calc.sql      Basic calculator and dec/hex converter       
        channels.sql      Report KSR channel message counts by channel endpoints
        curheaps.sql      Show main cursor data block heap sizes and their contents
             dba.sql      Convert Data Block Address (a 6 byte hex number) to file#, block#
             ddl.sql      Extracts DDL statements for specified objects
              df.sql  Show Oracle tablespace free space in Unix df style
        diag_sid.sql      Display current Session Wait info            
        diag_sid.sql      An easy to use Oracle session-level performance snapshot utility
           disco.sql      Generates commands for disconnecting selected sessions
     getplusparm.sql      get sqlplus parameter value (such linesize, pagesize, sqlcode,
            grpn.sql      Quick group by query for aggregating Numeric columns
            hash.sql      Show the hash value, SQL_ID and child number of previously
             i2h.sql      Advanced Oracle Troubleshooting Seminar demo script
              im.sql      Display In-Memory Undo (IMU) buffer usage    
            init.sql  Initializes sqlplus variables for 156 character terminal
       kglbroken.sql                                                   
       kglbroken.sql      Report broken kgl locks for an object this can be used for 
            kill.sql      Generates commands for killing selected sessions
              la.sql      Show which latch occupies a given memory address and its stats
     lastchanged.sql      Detect when a datablock in table was last changed
       latchprof.sql      Perform high-frequency sampling on V$LATCHHOLDER
   latchprof_old.sql      Perform high-frequency sampling on V$LATCHHOLDER
      latchprofx.sql      Perform high-frequency sampling on V$LATCHHOLDER
              lh.sql      Show latch holding SIDs and latch details from V$LATCHHOLDER
             lhp.sql      Perform high-frequency sampling on V$LATCHHOLDER
            lhpx.sql      Perform high-frequency sampling on V$LATCHHOLDER
     lotshparses.sql      Generate Lots of hard parses and shared pool activity 
    lotshparses2.sql      Generate Lots of hard parses and shared pool activity 
        lotslios.sql      Generate Lots of Logical IOs for testing purposes
        lotspios.sql      Generate Lots of Physical IOs for testing purposes
     lotssparses.sql      Generate Lots of soft parses and library cache/mutex activity 
    lotssparses2.sql      Generate Lots of soft parses and library cache/mutex activity 
       mutexprof.sql      Display KGX mutex sleep history from v$mutex_sleep_history
       nonshared.sql      Print reasons for non-shared child cursors from v$sql_shared_cursor
      nonshared2.sql      Show the reasons why more child cursors were created instead of
      ostackprof.sql      Take target process stack samples and show an execution profile
            pmem.sql      Show process memory usage breakdown – lookup by process SPID
     pmem_detail.sql      Show process memory usage breakdown details – lookup by process SPID
        prefetch.sql      Show KCB layer prefetch                      
          pvalid.sql      Show valid parameter values from V$PARAMETER_VALID_VALUES
        rowcache.sql      Show parent rowcache entries mathcing an object name
              rs.sql      Display available Redo Strands               
               s.sql      Display current Session Wait and SQL_ID info (10g+)
          sample.sql      Sample any V$ view or X$ table and display aggregated results
      sampleaddr.sql      High-frequency sampling of contents of a SGA memory address
             ses.sql      Display Session statistics for given sessions, filter by
            ses2.sql      Display Session statistics for given sessions, filter by
        sgastatx.sql      Show shared pool stats by sub-pool from X$KSMSS
            smem.sql      Show process memory usage breakdown – lookup by session ID
     smem_detail.sql      Show process memory usage breakdown details – lookup by session ID
         snapper.sql      An easy to use Oracle session-level performance snapshot utility
     snapper3.15.sql      An easy to use Oracle session-level performance snapshot utility
      snapper_v1.sql      An easy to use Oracle session-level performance snapshot utility
      snapper_v2.sql      An easy to use Oracle session-level performance snapshot utility
            stat.sql      Execute SQL statement in script argument and report basic
              sw.sql      Display current Session Wait info            
             sw2.sql      Display current Session Wait info            
             swg.sql      Display given Session Wait info grouped by state and event
             swo.sql      Display current Session Wait info            
          topsql.sql      Show TOP SQL ordered by user-provided criteria
            usql.sql      Show another session’s SQL directly from library cache
           usqlx.sql      Show another session’s SQL directly from library cache
        waitprof.sql      Sample V$SESSION_WAIT at high frequency and show resulting 
              xb.sql      Explain a SQL statements execution plan with execution 
            xde2.sql      Describe X$ tables, column offsets and report indexed fixed table
              xm.sql      Explain a SQL statements execution plan directly from library cache
             xma.sql      Explain a SQL statements execution plan directly from library cache
            xmai.sql      Explain a SQL statements execution plan with execution 
             xms.sql      Explain your last SQL statements execution plan with execution 
            xmsh.sql      Explain a SQL statements execution plan with execution 
            xmsi.sql      Explain a SQL statements execution plan with execution

Share

LOBREAD SQL Trace entry in Oracle 11.2 (and tracing OPI calls with event 10051)

A few days ago I looked into a SQL Tracefile of some LOB access code and saw a LOBREAD entry there. This is a really welcome improvement (or should I say, bugfix of a lacking feature) for understanding resource consumption by LOB access OPI calls. Check the bottom of the output below:

*** 2011-03-17 14:34:37.242
WAIT #47112801352808: nam='SQL*Net message from client' ela= 189021 driver id=1413697536 #bytes=1 p3=0 obj#=99584 tim=1300390477242725
WAIT #0: nam='gc cr multi block request' ela= 309 file#=10 block#=20447903 class#=1 obj#=99585 tim=1300390477243368
WAIT #0: nam='cell multiblock physical read' ela= 283 cellhash#=379339958 diskhash#=787888372 bytes=32768 obj#=99585 tim=1300390477243790
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390477243865
[...snipped...]
WAIT #0: nam='SQL*Net more data to client' ela= 2 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477244205
WAIT #0: nam='SQL*Net more data to client' ela= 4 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477244221
WAIT #0: nam='gc cr multi block request' ela= 232 file#=10 block#=20447911 class#=1 obj#=99585 tim=1300390477244560
WAIT #0: nam='cell multiblock physical read' ela= 882 cellhash#=379339958 diskhash#=787888372 bytes=32768 obj#=99585 tim=1300390477245579
WAIT #0: nam='SQL*Net more data to client' ela= 16 driver id=1413697536 #bytes=2020 p3=0 obj#=99585 tim=1300390477245685
WAIT #0: nam='SQL*Net more data to client' ela= 6 driver id=1413697536 #bytes=2048 p3=0 obj#=99585 tim=1300390477245706
WAIT #0: nam='SQL*Net more data to client' ela= 5 driver id=1413697536 #bytes=1792 p3=0 obj#=99585 tim=1300390477245720
#ff0000;">LOBREAD: c=1000,e=2915,p=8,cr=5,cu=0,tim=1300390477245735

In past versions of Oracle the CPU (c=) usage figures and other stats like number of physical/logical reads of the LOB chunk read OPI call were just lost – they were never reported in the tracefile. In past only the most common OPI calls, like PARSE, EXEC, BIND, FETCH (and recently CLOSE cursor) were instrumented with SQL Tracing. But since 11.2(.0.2?) the LOBREAD’s are printed out too. This is good, as it reduces the amount of guesswork needed to figure out what are those WAITs for cursor #0 – which is really a pseudocursor.

Why cursor#0? It’s because normally, with PARSE/EXEC/BIND/FETCH, you always had to specify a cursor slot number you operated on (if you fetch from cursor #5, it means that Oracle process went to slot #5 in the open cursor array in your session’s UGA and followed the pointers to shared cursor’s executable parts in library cache from there). But LOB interface works differently – if you select a LOB column using your query (cursor), then all your application gets is a LOB LOCATOR (sort of a pointer with LOB item ID and consistent read/version SCN). Then it’s your application which must issue another OPI call (LOBREAD) to read the chunks of that LOB out from the database. And the LOB locator is independent from any cursors, it doesn’t follow the same cursor API as regular SQL statements (as it requires way different functionality compared to a regular select or update statement).

So, whenever a wait happened in your session due to an access using a LOB locator, then there’s no specific cursor responsible for it (as far as Oracle sees internally) and that’s why a fake, pseudocursor #0 is used.

Note that on versions earlier than 11.2(.0.2?) when the LOBREAD wasn’t printed out to trace – you can use OPI call tracing (OPI stands for Oracle Program Interface and is the server-side counterpart to OCI API in the client side) using event 10051. First enable SQL Trace and then the event 10051 (or the other way around if you like):

SQL> @oerr 10051

ORA-10051: trace OPI calls

SQL> alter session set events '10051 trace name context forever, level 1';

Session altered.

Now run some LOB access code and check the tracefile:

*** 2011-03-17 14:37:07.178
WAIT #47112806168696: nam='SQL*Net message from client' ela= 6491763 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627178602
OPI CALL: type=105 argc= 2 cursor=  0 name=Cursor close all
CLOSE #47112806168696:c=0,e=45,dep=0,type=1,tim=1300390627178731
OPI CALL: type=94 argc=28 cursor=  0 name=V8 Bundled Exec
=====================
PARSING IN CURSOR #47112802701552 len=19 dep=0 uid=93 oct=3 lid=93 tim=1300390627179807 hv=1918872834 ad='271cc1480' sqlid='3wg0udjt5zb82'
select * from t_lob
END OF STMT
PARSE #47112802701552:c=1000,e=1027,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=3547887701,tim=1300390627179805
EXEC #47112802701552:c=0,e=29,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3547887701,tim=1300390627179884
WAIT #47112802701552: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627179939
WAIT #47112802701552: nam='SQL*Net message from client' ela= 238812 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627418785
OPI CALL: type= 5 argc= 2 cursor= 26 name=FETCH
WAIT #47112802701552: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627418945
FETCH #47112802701552:c=0,e=93,p=0,cr=5,cu=0,mis=0,r=1,dep=0,og=1,plh=3547887701,tim=1300390627418963
WAIT #47112802701552: nam='SQL*Net message from client' ela= 257633 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627676629
#ff0000;">OPI CALL: type=96 argc=21 cursor=  0 name=#ff0000;">LOB/FILE operations
WAIT #0: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=99585 tim=1300390627676788
[...snip...]
WAIT #0: nam='SQL*Net more data to client' ela= 2 driver id=1413697536 #bytes=1792 p3=0 obj#=99585 tim=1300390627677054
LOBREAD: c=0,e=321,p=0,cr=5,cu=0,tim=1300390627677064

Check the bold and especially the red string above.  Tracing OPI calls gives you some extra details of what kind of tasks are executed in the session. The “LOB/FILE operations” call indicates that whatever lines come after it (unlike SQL trace call lines where all the activity happens before a call line is printed (with some exceptions of course)) are done for this OPI call (until a next OPI call is printed out). OPI call tracing should work even on ancient database versions…

By the way, if you are wondering, what’s the cursor number 47112801352808 in the “WAIT #47112801352808″ above? Shouldn’t the cursor numbers be small numbers?

Well, in 11.2.0.2 this was also changed. Before that, the X in CURSOR #X (and PARSE #X, BIND #X, EXEC #X, FETCH #X) represented the slot number in your open cursor array (controlled by open_cursors) in your session’s UGA. Now, the tracefile dumps out the actual address of that cursor. 47112801352808 in HEX is 2AD94DC9FC68 and it happens to reside in the UGA of my session.

Naturally I asked Cary Millsap about whether he had spotted this LOBREAD already and yes, Cary’s way ahead of me – he said that Method-R’s mrskew tool v2.0, which will be out soon, will support it too.

It’s hard to not end up talking about Cary’s work when talking about performance profiling and especially Oracle SQL trace, so here are a few very useful bits which you should know about:

If you want to understand the SQL trace & profiling stuff more, then the absolute must document is Cary’s paper on the subject – Mastering Performance with Extended SQL Trace:

Also, if you like to optimize your work like me (in other words: you’re proactively lazy ;-) and you want to avoid some boring “where-the-heck-is-this-tracefile-now” and “scp-copy-it-over-to-my-pc-for-analysis” work then check out Cary’s MrTrace plugin (costs ~50 bucks and has a 30-day trial) for SQL Developer. I’ve ended up using it myself regularly although I still tend to avoid GUIs:

Share

ORA-4031 errors, contention, cursor management issues and shared pool fragmentation – free secret seminar!

Free stuff! Free stuff! Free stuff! :-)

The awesome dudes at E2SN have done it again! (and yes, Tom, this time the “we at E2SN Ltd” doesn’t mean only me alone ;-)

On Tuesday 22nd March I’ll hold two (yes two) Secret Oracle Hacking Sessions – about ORA-04031: unable to allocate x bytes of shared memory errors, cursor management issues and other shared pool related problems (like fragmentation). This event is free for all! You’ll just need to be fast enough to register, both events have 100 attendee limit (due to my GotoWebinar accont limitations).

I am going to run this online event twice, so total 200 people can attend (don’t register for both events, please). One event is in the morning (my time) to cater for APAC/EMEA region and the other session is for EMEA/US/Americas audience.

The content will be the same in both sessions. There will be no slides (you cant fix your shared pool problems with slides!) but there will be demos, scripts, live examples and fun (for the geeks among us anyway – others go and read some slides instead ;-)!

Share