Oakies Blog Aggregator

Oracle XML Training with Marco Gralike in Ljubljana

In the last two years I have been involved in many cases where we had to troubleshoot the performance of XML processing. For this reason I decided to organize specialized training with the best expert for processing XML in the Oracle database - Marco Gralike, my fellow OakTable member and Oracle ACE Director.
The training is organized by my company DbProf d.o.o in cooperation with the Slovenian Oracle User Group - SIOUG. The training will take place in Ljubljana, March 27th and 28th 2012. More about the training agenda and the pricing is available here.

SQL Quiz – How To Multiply across Rows

A colleague came to me a couple of days ago with a SQL problem. He had something like this:

@get_source

NAME          INPUT
------------- -----
GROUP_1       5
GROUP_2       3
GROUP_3       4
GROUP_4       7
GROUP_5       3

What he wanted to do was multiply all the inputs across the groups, to get the total number of possible permutations, i.e. 5*3*4*7*3: the product of all the INPUTS. This has to be in straight SQL. Easy! You just need to… Ahhh… Hmmm…

No, I just could not think of a way to do it that my colleague could use.

- There is no group-by function that gives a product of a column {that I know of}
- We could not use PL/SQL {for reasons I won’t go into}, so no cursor looping or passing in an array, which would make it simple
- Neither of us could think of an analytical function that could take the result of itself in the prior row (though I suspect there may be a way to do it).
- The number of groups could and would vary from a few to possibly a hundred, so the old tricks of converting rows to columns or nesting many SQL statements would not help.

So, I asked my friend – the queen of SQL Query, {Boneist} {Oh she of the trombone playing pastime}.

She came straight back with an answer. In case you want to try and work out an answer yourself before seeing the solution, below is a cat picture. The answer she came up with is below that:

The key to the solution is natural logs {ln}. I don’t know about you, but I learnt about using logs at school and have pretty much not used them since. In summary:

If x=3*5*9
then ln(x) = ln(3)+ln(5)+ln(9)
= 1.09861+1.60944+2.19722
= 4.90527

i.e. using logs converts multiplication into addition. You then use EXP, the inverse of LN, to convert your added-up log value back into your result.

exp(4.90527) = 135

{NB if you use my figures, exp(4.90527) really equals 134.999355, as I have truncated the log values shown. Oracle does this far more accurately internally, but be aware that you might get some slight rounding errors.}
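
You can check the identity directly in SQL if you like:

select exp(ln(3) + ln(5) + ln(9)) from dual;
-- returns 135 (give or take rounding in the far decimal places)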

So, what we can do is simply use the SQL GROUP function SUM to add together the natural logs of all the rows:

sum(ln(input))
{grouped by the whole statement, so no group by is needed in this case}

As an example:

-- show the expected result first
select 3*7*4*5*1 from dual;

 3*7*4*5*1
----------
       420


select min(name),max(name),count(name)
,EXP (SUM (LN (gr_sum))) gr_prod
from
(select 'group_1' name, 3 gr_sum from dual
 union
 select 'group_2' name, 7 gr_sum from dual
 union
 select 'group_3' name, 4 gr_sum from dual
 union
 select 'group_4' name, 5 gr_sum from dual
 union
 select 'group_5' name, 1 gr_sum from dual
)
/

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD
------- ------- ----------- ----------
group_1 group_5           5        420

As you can see, it works – even if when you first look at the formula your brains {if you are not a mathematician} try to leak out of your head. Just try and remember what your mathematics teacher said about log books and how, before calculators, they were used to speed up manual long multiplication tasks by converting the task into log addition.

If you want more information on logs, see this discussion about how they are actually about growth, or Wikipedia if you must :-) .

Boneist actually pointed me to this very nice post about using logs in Oracle by Anju Parashar, which I have borrowed from.

One issue to be aware of (which is highlighted in Anju Parashar’s article) is that you can’t take the log of a negative value; as a consequence Oracle will give you an ORA-01428 error:

select ln(-3) from dual;
select ln(-3) from dual
*
ERROR at line 1:
ORA-01428: argument '-3' is out of range

Anju gives a version of the code that works if all the values are negative; below I have one that copes with any number of negatives. Basically, you convert all the values to be multiplied into positive values and then make the result negative if the count of negative values is odd. Mathematically, the result of a multiplication can ONLY be negative if there is an odd number of negative values.

,EXP (SUM (LN (abs(gr_sum))))
*decode (mod(sum(decode(sign(gr_sum),0,0,1,0, 1)),2)
,0,1,-1) correct_gr_prod

I’m sure that the above expression could be simplified, but I have to go and do the day job.
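
One possible simplification, just a sketch against the demo data rather than anything tested in anger, is to replace the DECODE arithmetic with POWER(-1, n) applied to a count of the negative values:

select exp(sum(ln(abs(gr_sum))))
       * power(-1, count(case when gr_sum < 0 then 1 end)) correct_gr_prod
from  (select 'group_1' name,  3 gr_sum from dual union
       select 'group_2' name, -7 gr_sum from dual union
       select 'group_3' name,  4 gr_sum from dual union
       select 'group_4' name,  5 gr_sum from dual union
       select 'group_5' name,  1 gr_sum from dual
      );
-- like the original expression, this still raises ORA-01428 if any gr_sum is zero,
-- since ln(0) is out of range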

Finally, here is a little set of test cases covering the above, so you can play with this.

mdw1123> select 3*7*4*5*1 from dual;

 3*7*4*5*1
----------
       420

1 row selected.

mdw1123> --
mdw1123> select 'group_1' name, 3 gr_sum from dual
  2  union
  3  select 'group_2' name, 7 gr_sum from dual
  4  union
  5  select 'group_3' name, 4 gr_sum from dual
  6  union
  7  select 'group_4' name, 5 gr_sum from dual
  8  union
  9  select 'group_5' name, 1 gr_sum from dual
 10  /

NAME        GR_SUM
------- ----------
group_1          3
group_2          7
group_3          4
group_4          5
group_5          1

5 rows selected.

mdw1123>
mdw1123> select min(name),max(name),count(name)
  2  ,EXP (SUM (LN (gr_sum))) gr_prod
  3  from
  4  (select 'group_1' name, 3 gr_sum from dual
  5   union
  6   select 'group_2' name, 7 gr_sum from dual
  7   union
  8   select 'group_3' name, 4 gr_sum from dual
  9   union
 10   select 'group_4' name, 5 gr_sum from dual
 11   union
 12   select 'group_5' name, 1 gr_sum from dual
 13  )
 14  /

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD
------- ------- ----------- ----------
group_1 group_5           5        420

1 row selected.

mdw1123> --
mdw1123> -- now with a negative
mdw1123> select 'group_1' name, 3 gr_sum from dual
  2   union
  3   select 'group_2' name, -7 gr_sum from dual
  4   union
  5   select 'group_3' name, 4 gr_sum from dual
  6   union
  7   select 'group_4' name, 5 gr_sum from dual
  8   union
  9   select 'group_5' name, 1 gr_sum from dual
 10  /

NAME        GR_SUM
------- ----------
group_1          3
group_2         -7
group_3          4
group_4          5
group_5          1

5 rows selected.

mdw1123> -- and if the values contain negatives
mdw1123> select min(name),max(name),count(name)
  2  ,EXP (SUM (LN (abs(gr_sum)))) gr_prod
  3  ,mod(sum(decode(sign(gr_sum),0,0
  4                          ,1,0
  5                          ,  1)
  6           ),2) -- 0 if even number of negatives, else 1
  7           modifier
  8  ,EXP (SUM (LN (abs(gr_sum))))
  9   *decode (mod(sum(decode(sign(gr_sum),0,0,1,0,     1)),2)
 10         ,0,1,-1) correct_gr_prod
 11  from
 12  (select 'group_1' name, 3 gr_sum from dual
 13   union
 14   select 'group_2' name, -7 gr_sum from dual
 15   union
 16   select 'group_3' name, 4 gr_sum from dual
 17   union
 18   select 'group_4' name, 5 gr_sum from dual
 19   union
 20   select 'group_5' name, 1 gr_sum from dual
 21  )
 22  /

MIN(NAM MAX(NAM COUNT(NAME)    GR_PROD   MODIFIER CORRECT_GR_PROD
------- ------- ----------- ---------- ---------- ---------------
group_1 group_5           5        420          1            -420

1 row selected.

Xen problems with OpenSuSE 12.1 and Intel core-i7

I set up my new lab server yesterday, which in essence is a rack-mounted server with a Core i7 2600 processor, 32GB RAM and 3 TB of (slow) disk. When I moved some of my VMs across from an identical system (except that it was a Core i7 920) and tried to start the domU, it repeatedly crashed. The message from the console was a simple question: is xend running?

I couldn’t believe my eyes: using identical software now produced segmentation faults? How is that possible? I am using xen 4.2, kernel 3.1.9-1.4-xen and libvirt-0.9.6-3.3.1.x86_64.

I started the troubleshooting with the xen logs. There was no output in the debug log, however the xend.log showed these lines:

...
[2012-02-21 22:36:43 1212] INFO (image:187) buildDomain os=linux dom=1 vcpus=2
[2012-02-21 22:36:43 1212] DEBUG (image:819) domid          = 1
[2012-02-21 22:36:43 1212] DEBUG (image:820) memsize        = 1024
[2012-02-21 22:36:43 1212] DEBUG (image:821) image          = /m/xen/kernels/ol62/vmlinuz
[2012-02-21 22:36:43 1212] DEBUG (image:822) store_evtchn   = 1
[2012-02-21 22:36:43 1212] DEBUG (image:823) console_evtchn = 2
[2012-02-21 22:36:43 1212] DEBUG (image:824) cmdline        =  vnc
[2012-02-21 22:36:43 1212] DEBUG (image:825) ramdisk        = /m/xen/kernels/ol62/initrd.img
[2012-02-21 22:36:43 1212] DEBUG (image:826) vcpus          = 2
[2012-02-21 22:36:43 1212] DEBUG (image:827) features       =
[2012-02-21 22:36:43 1212] DEBUG (image:828) flags          = 0
[2012-02-21 22:36:43 1212] DEBUG (image:829) superpages     = 0
[2012-02-21 22:36:44 1210] CRITICAL (SrvDaemon:232) Xend died due to signal 11!  Restarting it.
[2012-02-21 22:36:44 3288] INFO (SrvDaemon:332) Xend Daemon started
...

The xm list output showed an unknown domain.

The clue which finally led me to Novell’s bugzilla database was in /var/log/messages:

kernel: [  230.384375] xend[3287]: segfault at 2408d7f6ea8 ip 00007fd098a9d779 sp 00007fd08d7f6c48 error 4 in libxenguest.so.4.2.0[7fd098a82000+25000]

Now to save you from a long session with your favourite search engine, I would like to point this thread out to you:

https://bugzilla.novell.com/show_bug.cgi?id=727081#c31

In a nutshell, all processors with “xsave” in the processor flags output are affected by this bug. Follow the steps in comment number 31 to add the fixes. It worked just beautifully after that.

The machine I am using is a rented root server and costs me a measly EUR 59/mth. The nice aspect is that I just switched from my “old” core i7-920 to the new machine and don’t have old hardware lying around. And as it’s in a hosted data centre all I need is an Internet connection to access it from anywhere. IMO a great alternative to the big bulky laptop approach.

ACE Director Program Updates: My thoughts…

As you probably already heard, the ACE Director program has recently altered its travel assistance policy. Some of the bigger, well established events will no longer be considered for travel assistance, and even those that are will have a limit to the number of ACEs that will get funding. The immediate fallout of this is that some ACE Directors (including myself) have had to pull out of some events we had planned to speak at.

My first reaction was to post the following message to Victoria, Lillian and Justin:

Thanks for the update. I hope you guys don’t get too much hassle from people about this change. If you do, just remember, people will always find something to moan about. :)

As always, I’m very grateful for all the time and effort you guys have put into the program over the years.

Of course I am disappointed about having to pull out of events like KScope12, but I’m not about to throw my toys out of the pram. :)

I still think the ACE Director program is a really cool thing and I will continue to support it for as long as they will have me. I’ve met some truly brilliant people through the program and I fully expect to continue doing so. In my opinion, the travel assistance is only a small aspect of what I’ve gained from being a member of the group, so I hope people don’t focus on it too much.

Of course, if I stop getting free t-shirts, it’s all over. :)

Cheers

Tim… (Still proud to be an ACED)




Not In – 2

My note on “NOT IN” subqueries is one of the most popular on my blog, staying in the top 5 hits for the last five years – but it’s getting a bit old, so it’s about time I said something new about “NOT IN” – especially since the Null Aware Anti Join has been around for such a long time. The example I want to talk about is, as so often, something that came up as a problem on a customer site. Here’s a bit of SQL to model the situation, which is currently running under Oracle 11.1.0.7:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 10000
)
select
	rownum			id1,
	trunc((rownum-1)/10000)	id2,
	trunc((rownum-1)/10000)	n1,
	lpad(rownum,10,'0')	small_vc,
	rpad('x',100)		padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000;

-- collect stats, compute, no histograms

alter table t1 add constraint t1_pk primary key(id1, id2);
alter index t1_pk invisible;
create index t1_i2 on t1(id1, id2, n1);

create type jpl_scalar as object(x1 number);
/
create type jpl_table as table of jpl_scalar;
/

create type jpl_scalar3 as object(x1 number, x2 number, x3 number);
/
create type jpl_table3 as table of jpl_scalar3;
/

There are a couple of oddities in the model. One is that the second index starts with the same columns as the primary key index; this is to emulate the functionality of the client system. The other is that I’ve made the primary key index invisible: the nature of the client data and code was such that the primary key index was not used in the query I’m about to emulate, and making the index invisible means I don’t have to mess around with statistics or complicated data routines to make the appropriate behaviour appear in the model.

Now that I’ve got the model in place, let’s take a look at the query:

delete
from	t1
where
	(id1) in (
		select x1 from table(cast(:b1 as jpl_table)) v1
	)
and	(id1, id2, n1) not in (
		select x1, x2, x3 from table(cast(:b2 as jpl_table3)) v2
	)
;

This query is actually trying to delete a load of data from the table. The surrounding PL/SQL package populates two collections with up to 1,000 items each. The first collection identifies rows that may need to be deleted, and the second collection then identifies rows from the first set that should not be deleted. You’ll notice that the first collection identifies rows only by the first column of the primary key, while the second collection uses both parts of the key and a non-key column to countermand the deletion. Typically both collections would hold close to the 1,000 item limit set by the developer, and typically only one or two rows would end up being deleted each time the statement ran (the id1 column tends to be “nearly unique” across the first collection).
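
As a sketch of the shape of the calling code (the names and values here are my own illustrative choices, not the client’s actual package, which builds the collections from application data):

declare
    l_candidates jpl_table  := jpl_table();   -- id1 values that may need deleting
    l_keep       jpl_table3 := jpl_table3();  -- (id1, id2, n1) rows that must survive
begin
    l_candidates.extend;
    l_candidates(1) := jpl_scalar(1);

    l_keep.extend;
    l_keep(1) := jpl_scalar3(1, 0, 0);

    delete from t1
    where   (id1) in (
                select x1 from table(cast(l_candidates as jpl_table))
            )
    and     (id1, id2, n1) not in (
                select x1, x2, x3 from table(cast(l_keep as jpl_table3))
            );
end;
/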

With that overview in mind, thinking particularly of the number of rows initially identified and the number of rows that survive, look at the execution plan:

------------------------------------------------------
| Id  | Operation                            | Name  |
------------------------------------------------------
|   0 | DELETE STATEMENT                     |       |
|   1 |  DELETE                              | T1    |
|   2 |   NESTED LOOPS                       |       |
|   3 |    SORT UNIQUE                       |       |
|   4 |     COLLECTION ITERATOR PICKLER FETCH|       |
|*  5 |    INDEX RANGE SCAN                  | T1_I2 |
|*  6 |     COLLECTION ITERATOR PICKLER FETCH|       |
------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("ID1"=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))
       filter( NOT EXISTS (SELECT 0 FROM TABLE() "KOKBF$" WHERE
              LNNVL(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)<>:B1) AND
              LNNVL(SYS_OP_ATG(VALUE(KOKBF$),2,3,2)<>:B2) AND
              LNNVL(SYS_OP_ATG(VALUE(KOKBF$),3,4,2)<>:B3)))
   6 - filter(LNNVL(SYS_OP_ATG(VALUE(KOKBF$),1,2,2)<>:B1) AND
              LNNVL(SYS_OP_ATG(VALUE(KOKBF$),2,3,2)<>:B2) AND
              LNNVL(SYS_OP_ATG(VALUE(KOKBF$),3,4,2)<>:B3))

The optimizer has unnested the first collection (the IN list), sorted the set for uniqueness, and used it to drive a nested loop through the index we need (avoiding the table) to pick up the rowid and all the columns we need to check against the second collection. However, as it does the index range scan for each unique item in the first collection it runs the NOT IN subquery, checking whether the row it has acquired has a match in the second collection. This means that around 1,000 times we fetch a row and scan the second collection for a match. We almost always find a match so, on average, we will have to scan 500 items from the second collection to find that match. The statement was CPU intensive; on the production system it did about 3,000 buffer gets but took about 1 CPU second to find and delete (on average) one row.

The problem is the repeated scanning on the second collection – and it’s a problem that shouldn’t exist. I want Oracle to unnest the second query and do a hash anti join with it. If we did that then we would only scan the second collection once, scatter it into memory, and only do one probe and comparison for each row brought back by the first collection. This is (a mockup of) the plan I want to see:

----------------------------------------------------------
| Id  | Operation                              | Name    |
----------------------------------------------------------
|   0 | DELETE STATEMENT                       |         |
|   1 |  DELETE                                | T1      |
|   2 |   HASH JOIN ANTI                       |         |
|   3 |    NESTED LOOPS                        |         |
|   5 |     SORT UNIQUE                        |         |
|   6 |      COLLECTION ITERATOR PICKLER FETCH |         |
|   7 |     INDEX RANGE SCAN                   | T1_I2   |
|   8 |    COLLECTION ITERATOR PICKLER FETCH   |         |
----------------------------------------------------------

With the SQL supplied, I couldn’t make this plan appear in 11.1. I had hoped to force the path I wanted and then create an SQL Baseline for it, but I actually had to rewrite the query, converting the “NOT IN” to “NOT EXISTS” – now this isn’t always legal, of course, but in my case I knew that all the relevant columns would be non-null (even though the n1 column in the table had not been declared as such) and the data accumulated in the collections would also be non-null so the transformation was safe. So here’s the rewrite and the new plan:

delete
from	t1
where
	exists (
		select
			null
		from	table(cast(:b1 as jpl_table)) v1
		where	v1.x1 = t1.id1
	)
and	not exists (
		select
			null
		from
			table(cast(:b2 as jpl_table3)) v2
		where	v2.x1 = t1.id1
		and	v2.x2 = t1.id2
		and	v2.x3 = t1.n1
	)
;

-------------------------------------------------------
| Id  | Operation                             | Name  |
-------------------------------------------------------
|   0 | DELETE STATEMENT                      |       |
|   1 |  DELETE                               | T1    |
|*  2 |   HASH JOIN ANTI                      |       |
|   3 |    NESTED LOOPS                       |       |
|   4 |     SORT UNIQUE                       |       |
|   5 |      COLLECTION ITERATOR PICKLER FETCH|       |
|*  6 |     INDEX RANGE SCAN                  | T1_I2 |
|   7 |    COLLECTION ITERATOR PICKLER FETCH  |       |
-------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."ID1"=SYS_OP_ATG(VALUE(KOKBF$),1,2,2) AND
              "T1"."ID2"=SYS_OP_ATG(VALUE(KOKBF$),2,3,2) AND
              "T1"."N1"=SYS_OP_ATG(VALUE(KOKBF$),3,4,2))
   6 - access("T1"."ID1"=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))

It’s exactly what I wanted – but the code has to be modified, which means the full testing cycle and delay, and because of the complexities of the collection objects and the need to use realistic data it’s something that I can’t actually do on the sand-pit that the client lets me play with. So, as it stands, I think it ought to be quite a bit more efficient – but I’ll have to wait a couple of weeks to find out.

Here’s the irritating bit. The client will be upgrading to 11.2.0.3 in the not too distant future, and here’s the plan you get from the original query on that version of Oracle:

------------------------------------------------------------
| Id  | Operation                               | Name     |
------------------------------------------------------------
|   0 | DELETE STATEMENT                        |          |
|   1 |  DELETE                                 | T1       |
|   2 |   MERGE JOIN ANTI NA                    |          |
|   3 |    SORT JOIN                            |          |
|   4 |     NESTED LOOPS                        |          |
|   5 |      VIEW                               | VW_NSO_1 |
|   6 |       SORT UNIQUE                       |          |
|   7 |        COLLECTION ITERATOR PICKLER FETCH|          |
|*  8 |      INDEX RANGE SCAN                   | T1_I2    |
|*  9 |    SORT UNIQUE                          |          |
|  10 |     COLLECTION ITERATOR PICKLER FETCH   |          |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("ID1"="X1")
   9 - access(INTERNAL_FUNCTION("ID1")=SYS_OP_ATG(VALUE(KOKBF$),1,2,2)
              AND INTERNAL_FUNCTION("ID2")=SYS_OP_ATG(VALUE(KOKBF$),2,3,2) AND
              INTERNAL_FUNCTION("N1")=SYS_OP_ATG(VALUE(KOKBF$),3,4,2))
       filter(INTERNAL_FUNCTION("N1")=SYS_OP_ATG(VALUE(KOKBF$),3,4,2)
              AND INTERNAL_FUNCTION("ID2")=SYS_OP_ATG(VALUE(KOKBF$),2,3,2) AND
              INTERNAL_FUNCTION("ID1")=SYS_OP_ATG(VALUE(KOKBF$),1,2,2))

With my data set it uses a merge join rather than a hash join, but it manages to unnest both collections – note, also, the appearance of the null-aware anti join; the appearance of the non-mergeable view (vw_nso_1) is also an interesting detail – 11.1 didn’t have this operator in its plan.

In principle 11.1 ought to be able to produce the same plan, all the units of functionality seem to be there – including the null-aware anti-join – but I just can’t make the plan appear (although I did manage to get an ORA-00600 error with one of my more bizarre attempts at hinting).

Despite the automatic appearance of what seems to be a suitable (though slightly sub-optimal) path with the upgrade, I think we’ll still be doing the rewrite – interestingly 11.2 does produce a slightly different plan when you go for existence subqueries – it’s a (differently named) non-mergeable view again:

----------------------------------------------------------
| Id  | Operation                              | Name    |
----------------------------------------------------------
|   0 | DELETE STATEMENT                       |         |
|   1 |  DELETE                                | T1      |
|*  2 |   HASH JOIN ANTI                       |         |
|   3 |    NESTED LOOPS                        |         |
|   4 |     VIEW                               | VW_SQ_1 |
|   5 |      SORT UNIQUE                       |         |
|   6 |       COLLECTION ITERATOR PICKLER FETCH|         |
|*  7 |     INDEX RANGE SCAN                   | T1_I2   |
|   8 |    COLLECTION ITERATOR PICKLER FETCH   |         |
----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("T1"."ID1"=SYS_OP_ATG(VALUE(KOKBF$),1,2,2) AND
              "T1"."ID2"=SYS_OP_ATG(VALUE(KOKBF$),2,3,2) AND
              "T1"."N1"=SYS_OP_ATG(VALUE(KOKBF$),3,4,2))
   7 - access("ITEM_2"="T1"."ID1")

One last thought – it looks as if the optimizer has some new ways (including dynamic sampling) of handling collections and subquery manipulation of collections in 11.2: and this client loves doing cunning things with collections – so we’re probably going to get a number of better execution plans from the upgrade – but they’re going to have to check every single example they’ve got of code using collections, because you can bet that somewhere they’ll hit an edge case where the “new improved” mechanisms manage to be the wrong choice.

Footnote: I put a couple of /*+ cardinality (XXX 10) */ hints into my code while I was creating the examples above, but took them out to present the code and results. My data set was small compared to the client’s data set, so I needed the hints but didn’t want to give the impression that they were a necessary part of the solution.

Load profile

I like the Load Profile section of Statspack or AWR reports (who doesn’t?). It’s short and gives a brief understanding of what kind of work a database does. But what if you don’t have access to Statspack or AWR but still want to see something similar? It’s possible to use V$SYSMETRIC to get these numbers for the last 60 or 15 seconds. I’ve wanted to write a script to do this for a long time. Here it is.

col short_name  format a20              heading 'Load Profile'
col per_sec     format 999,999,999.9    heading 'Per Second'
col per_tx      format 999,999,999.9    heading 'Per Transaction'
set colsep '   '

select lpad(short_name, 20, ' ') short_name
     , per_sec
     , per_tx from
    (select short_name
          , max(decode(typ, 1, value)) per_sec
          , max(decode(typ, 2, value)) per_tx
          , max(m_rank) m_rank 
       from
        (select /*+ use_hash(s) */ 
                m.short_name
              , s.value * coeff value
              , typ
              , m_rank
           from v$sysmetric s,
               (select 'Database Time Per Sec'                      metric_name, 'DB Time' short_name, .01 coeff, 1 typ, 1 m_rank from dual union all
                select 'CPU Usage Per Sec'                          metric_name, 'DB CPU' short_name, .01 coeff, 1 typ, 2 m_rank from dual union all
                select 'Redo Generated Per Sec'                     metric_name, 'Redo size' short_name, 1 coeff, 1 typ, 3 m_rank from dual union all
                select 'Logical Reads Per Sec'                      metric_name, 'Logical reads' short_name, 1 coeff, 1 typ, 4 m_rank from dual union all
                select 'DB Block Changes Per Sec'                   metric_name, 'Block changes' short_name, 1 coeff, 1 typ, 5 m_rank from dual union all
                select 'Physical Reads Per Sec'                     metric_name, 'Physical reads' short_name, 1 coeff, 1 typ, 6 m_rank from dual union all
                select 'Physical Writes Per Sec'                    metric_name, 'Physical writes' short_name, 1 coeff, 1 typ, 7 m_rank from dual union all
                select 'User Calls Per Sec'                         metric_name, 'User calls' short_name, 1 coeff, 1 typ, 8 m_rank from dual union all
                select 'Total Parse Count Per Sec'                  metric_name, 'Parses' short_name, 1 coeff, 1 typ, 9 m_rank from dual union all
                select 'Hard Parse Count Per Sec'                   metric_name, 'Hard Parses' short_name, 1 coeff, 1 typ, 10 m_rank from dual union all
                select 'Logons Per Sec'                             metric_name, 'Logons' short_name, 1 coeff, 1 typ, 11 m_rank from dual union all
                select 'Executions Per Sec'                         metric_name, 'Executes' short_name, 1 coeff, 1 typ, 12 m_rank from dual union all
                select 'User Rollbacks Per Sec'                     metric_name, 'Rollbacks' short_name, 1 coeff, 1 typ, 13 m_rank from dual union all
                select 'User Transaction Per Sec'                   metric_name, 'Transactions' short_name, 1 coeff, 1 typ, 14 m_rank from dual union all
                select 'User Rollback UndoRec Applied Per Sec'      metric_name, 'Applied urec' short_name, 1 coeff, 1 typ, 15 m_rank from dual union all
                select 'Redo Generated Per Txn'                     metric_name, 'Redo size' short_name, 1 coeff, 2 typ, 3 m_rank from dual union all
                select 'Logical Reads Per Txn'                      metric_name, 'Logical reads' short_name, 1 coeff, 2 typ, 4 m_rank from dual union all
                select 'DB Block Changes Per Txn'                   metric_name, 'Block changes' short_name, 1 coeff, 2 typ, 5 m_rank from dual union all
                select 'Physical Reads Per Txn'                     metric_name, 'Physical reads' short_name, 1 coeff, 2 typ, 6 m_rank from dual union all
                select 'Physical Writes Per Txn'                    metric_name, 'Physical writes' short_name, 1 coeff, 2 typ, 7 m_rank from dual union all
                select 'User Calls Per Txn'                         metric_name, 'User calls' short_name, 1 coeff, 2 typ, 8 m_rank from dual union all
                select 'Total Parse Count Per Txn'                  metric_name, 'Parses' short_name, 1 coeff, 2 typ, 9 m_rank from dual union all
                select 'Hard Parse Count Per Txn'                   metric_name, 'Hard Parses' short_name, 1 coeff, 2 typ, 10 m_rank from dual union all
                select 'Logons Per Txn'                             metric_name, 'Logons' short_name, 1 coeff, 2 typ, 11 m_rank from dual union all
                select 'Executions Per Txn'                         metric_name, 'Executes' short_name, 1 coeff, 2 typ, 12 m_rank from dual union all
                select 'User Rollbacks Per Txn'                     metric_name, 'Rollbacks' short_name, 1 coeff, 2 typ, 13 m_rank from dual union all
                select 'User Transaction Per Txn'                   metric_name, 'Transactions' short_name, 1 coeff, 2 typ, 14 m_rank from dual union all
                select 'User Rollback Undo Records Applied Per Txn' metric_name, 'Applied urec' short_name, 1 coeff, 2 typ, 15 m_rank from dual) m
          where m.metric_name = s.metric_name
            and s.intsize_csec > 5000
            and s.intsize_csec < 7000)
      group by short_name)
 order by m_rank;

A couple of notes:

  • It’s a simple piece of code; most of it is copy-paste to select only the data that is required and to sort it the same way as AWR does
  • I’ve deliberately supplied a USE_HASH hint to the query because on one DB without fixed objects statistics (or with wrong ones – I’m not sure which) Oracle decided to do a NESTED LOOP with a FULL SCAN of a huge in-memory structure, so the query took around a second to execute. I don’t want that to happen
  • It’s possible to utilize PIVOT in the query (see the sketch after this list), but it’ll limit applicability to 11g only – and I want this SQL to be able to run on 10g too
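
For the curious, the PIVOT variant could look roughly like this (a simplified sketch covering a single metric pair, not a drop-in replacement for the script above):

select short_name, per_sec, per_tx
  from (select 'Logical reads' short_name
             , case when metric_name like '%Per Sec' then 'PER_SEC' else 'PER_TX' end typ
             , value
          from v$sysmetric
         where metric_name in ('Logical Reads Per Sec', 'Logical Reads Per Txn')
           and intsize_csec between 5000 and 7000)
 pivot (max(value) for typ in ('PER_SEC' as per_sec, 'PER_TX' as per_tx));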

And here is a sample output. Looks familiar :-)

Load Profile               Per Second   Per Transaction
--------------------   --------------   ---------------
             DB Time               .7
              DB CPU               .7
           Redo size        217,570.3       4,283,958.7
       Logical reads          4,177.1          82,248.0
       Block changes          1,475.5          29,053.0
      Physical reads               .1               2.0
     Physical writes               .4               8.3
          User calls               .3               6.0
              Parses              8.2             161.3
         Hard Parses              7.3             144.0
              Logons               .1               1.3
            Executes            722.3          14,222.3
           Rollbacks               .0
        Transactions               .1
        Applied urec               .1               1.0

Filed under: Oracle, Performance Tagged: AWR, scripts, Statspack

RMOUG 12 — Rocky Mountain Training Days

Well, I’ve just returned from a fabulous trip to the Rocky Mountain Oracle Users Group (RMOUG) Training Days conference in Denver, Colorado.

It’s been a long time since I’ve attended this conference, but in the fall when I started to plan out which conferences I would begin re-attending in 2012 I gave this one a solid look. Often it’s hard to choose among all of the good conferences, but I thought I’d limit myself to the top 4-5 conferences based on my interests, and this one made the cut.

The main reason I wanted to attend is that I’m targeting conferences that try to bring many groups together instead of being a giant echo chamber for a homogeneous group of people. Some regional Oracle conferences start to seem like DBA-clubs at which the same topics are re-hashed again and again (the basic ones talk about space management and backups, while the advanced ones talk about block dumps and index leaf node management).

When I looked at the agendas for past RMOUG conferences, I was pleasantly surprised to see a wider range of topics that included DBA topics AND developer topics — and it’s this intersection that has me so keenly interested lately. Much of what I’m seeing in the data world is increasing conflict between DBAs and developers that is resulting in poor architectural and process decisions all-around. I’m hoping that conferences like RMOUG can be a venue for facilitating discussions between the camps.

The conference itself was run like a clock — but in an interesting way. It’s the most laid-back, well-organized event I’ve ever been to. You know how some conferences are scheduled to the hilt with barely enough time to sprint from session to session and event to event? RMOUG isn’t like that — there’s a quiet competence about how it’s run so that you don’t miss a thing, and you don’t feel rushed or stressed about it. If you know the people involved, you probably understand why — Tim Gorman, Kellyn Pot’vin, John and Peggy King all exemplify that quiet competence and energy that makes sure stuff gets done without a big deal being made of it.

Anyway, I went to some great sessions by Kris Rice (on the Oracle Database Cloud), Graham Wood (on under-utilized Database features), George Thrower (on Ruby!), and Jean-Pierre Dijcks (on the Oracle Big Data Appliance). I also caught up with lots of my OakTable friends.

In Kris’ session I learned a lot about the differences between the Oracle Database Cloud and the Amazon Relational Database Service (RDS) — basically Amazon RDS presents you with an instance and the ability to connect just like any other instance through a machine and port, while the Oracle Database Cloud is more like a schema with more limited access and extreme security (the Oracle Cloud lacks the Amazon security controls around IP address access, so it settles for more constraints around access). Also, file and object deployment is heavily virus scanned on the Oracle Database Cloud, leading to multi-step deployments — Kris showed us using the Deployment Cart functionality of SQL Developer. Also, a lot of emphasis on using RESTful interfaces to the Cloud (with a quick side note on the ability of the APEX listener to expose database capabilities as RESTful interfaces WITHOUT requiring a full APEX installation!).

In Jean-Pierre’s presentation I learned a lot about the connectors from the Big Data Appliance to an Oracle database, including specialized Hadoop finalization jobs which take reduce output and format it specifically for insertion into an Oracle database, along with an optimized hdfs client that will enable using hdfs content as external tables. JP described tests that were able to move 15TB/hr from hdfs on a BDA into an Exadata machine over Infiniband.

My own presentation on writing MapReduce in SQL and PL/SQL was rather sparsely attended, but that’s ok — I understand that presentations on the edge between DBAs and Developers aren’t always the most popular. I think it went well and my audience got some good stuff out of it. It was my first run of this presentation and I’ve taken away some good ideas on how to improve it.

In any event, from Cary Millsap’s keynote (in which I loved the difference between memorizing facts and understanding concepts as a way to “master” material) to Kerry Osborne’s presentation on DIY Exadata, I had an excellent time and will be sure to try and attend next year.

Between the location (Denver and the mountains), venue (Colorado Convention Center), and the excellent content I’d highly recommend putting this one on your list.


I wish I could go inside and see the RMOUG sessions!

Oracle database features wish list – 2 (V$SQL_UNSHARED_CURSOR)

It’s been too long since my previous wish for Oracle. While reading MOS updated articles (in the new HTML interface, which looks nice – much better than the Flash-based one) I saw this document, which made me write this blog post.
As you know, Oracle has the V$SQL_SHARED_CURSOR view, which helps you understand why a particular child cursor has been built due to the inability to share an existing child cursor. Since 11.2.0.2 there’s also a REASON column that “helps” by storing more data in XML. The thing is: in my opinion the usability of this view is far from perfect. Why? Because 1) an application can usually suffer from several different issues related to cursor sharing, and 2) child cursors come and go, aged out as garbage without leaving traces in the dynamic views. So, if you are interested in just one particular child cursor, then V$SQL_SHARED_CURSOR will work just fine for you (don’t forget to use nonshared.sql from TPT). But it’s not very useful for a statistical analysis at the SQL or instance level. I want to know the Top 10 causes of why a given SQL had unshared cursors during its lifetime, or the same information at the instance level. These questions are impossible to answer easily and correctly right now. That’s why it would be nice to have V$SQL_UNSHARED_CURSOR with columns SQL_ID, REASON, VALUE. For each SQL_ID it would report how many times a particular REASON for a non-shared cursor was encountered. Right now there are 64 reasons in V$SQL_SHARED_CURSOR, so to support V$SQL_UNSHARED_CURSOR it would be necessary to add at least 64 bytes to each parent cursor plus some infrastructure overhead. I don’t know too many details, but compared to a normal child cursor size of 15K, 100-ish bytes doesn’t look like much to me. And the profit would be huge.
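
The closest workaround I can think of today is to aggregate the Y/N flag columns of V$SQL_SHARED_CURSOR over the child cursors that are still cached, something like this sketch (only two of the 64 reason columns shown, and of course it misses the children that have already been aged out):

select reason, count(*) cnt
  from (select sql_id, 'BIND_MISMATCH' reason
          from v$sql_shared_cursor where bind_mismatch = 'Y'
        union all
        select sql_id, 'OPTIMIZER_MISMATCH' reason
          from v$sql_shared_cursor where optimizer_mismatch = 'Y'
        -- ... one UNION ALL branch per reason column of interest
       )
 where sql_id = :sql_id
 group by reason
 order by cnt desc;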
To support similar counters on the system level it’s logical to add them as statistics to V$SYSSTAT. It’s easy and fits perfectly well into AWR reports. Let’s say you see substantial “hard parse (sharing criteria) elapsed time” or “hard parse (bind mismatch) elapsed time” reported in an AWR report. Then you just scroll down to the statistics section of the report and see the top reasons for these issues.
What do you think?

Filed under: Oracle Tagged: AWR, dynamic views, shared cursors, tpt, wish list

Dynamic Sampling And Indexes

There is a lot more to say about Dynamic Sampling and indexes, and I'll try to cover these basics in my Dynamic Sampling series on AllThingsOracle.com, but two recent discussions on the OTN forums and on Charles Hooper's blog prompted me to publish this blog post.

These discussions revolved around the following issues with Dynamic Sampling and indexes:

1. CREATE INDEX On Empty Table

There is an interesting change in behaviour that has been introduced with release 11.2. Whereas pre-11.2 versions do not automatically gather index statistics on empty tables, 11.2 does so. So a simple CREATE TABLE followed by a CREATE INDEX command (or an ALTER TABLE add constraint that implicitly/explicitly creates indexes) will now lead to a table without statistics, but with index statistics where every value is set to 0 (rows, distinct keys, clustering factor, leaf blocks etc.)

Since Oracle 10g a CREATE INDEX command automatically computes statistics for an index (since Oracle 9i you could optionally add a COMPUTE STATISTICS clause to the CREATE INDEX command). The interesting point is that there is no "NOCOMPUTE STATISTICS" clause, so you cannot prevent this from happening via the syntax.

There is a way to prevent it, but only indirectly by locking the table statistics. Funny enough, if the table statistics are locked, there is again no way to overwrite the lock by using a FORCE keyword or similar as part of the CREATE INDEX command, whereas DBMS_STATS offers such a FORCE option.

Note that creating constraints requiring indexes as part of the CREATE TABLE command shows a different behaviour: The indexes do not have statistics in this case.

This shouldn't be too relevant for real-life cases, but is probably more important to setups of test cases / demonstrations. Of course it can become relevant to real-life deployment tasks that add new objects to a database. If statistics are not handled properly afterwards you now end up with tables missing statistics but indexes with 0 statistics.

Of course, when deliberately leaving a table without statistics you need to do something about it anyway, because otherwise the default statistics job (since 10g) will turn this table into a potential threat by gathering statistics during the maintenance window, when such tables might be in some state that is not representative of the actual workload during processing.

Either the default statistics job has to be disabled or re-configured via the AUTOSTATS_TARGET option of the SET_GLOBAL_PREFS / SET_PARAM procedures of DBMS_STATS, or the table statistics need to be locked individually via LOCK_TABLE_STATS.

So a reasonable order of commands for such a table at deployment time could look like this:

CREATE TABLE...

EXEC DBMS_STATS.LOCK_TABLE_STATS(...)

CREATE INDEX...

This way the statistics will be handled consistently for both table and indexes.
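
The job re-configuration alternative mentioned above could look like this (a sketch; the 'ORACLE' setting restricts the automatic job to dictionary objects, so it is a global decision rather than a per-table one):

-- 11g
EXEC DBMS_STATS.SET_GLOBAL_PREFS('AUTOSTATS_TARGET', 'ORACLE')

-- 10g equivalent
EXEC DBMS_STATS.SET_PARAM('AUTOSTATS_TARGET', 'ORACLE')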

Note that all this doesn't apply to Global Temporary Tables (GTTs) as they are not analyzed by the default statistics job and creating an index on them doesn't automatically gather statistics for that index either, even in 11.2.

The change introduced in 11.2 can be seen by simply trying the following in 11.2 and pre-11.2 versions to see the difference:

drop table t;

purge table t;

create table t (id number(*, 0) not null, id2 number(*, 0), filler varchar2(100));

create index t_idx on t (id);

select num_rows, blocks from user_tab_statistics where table_name = 'T';

select blevel, num_rows, leaf_blocks, distinct_keys, clustering_factor from user_ind_statistics where table_name = 'T';

2. Inconsistent Costing

The reason why this change in behaviour is particularly relevant is the fact that the optimizer behaves inconsistently if a table doesn't have statistics but a suitable index does.

If Dynamic Sampling gets used (very likely from 10g on because the table is missing statistics and the default level of 2 will trigger Dynamic Sampling for such a table in this case) the optimizer will not only obtain basic statistics information about the table but also overwrite the existing index statistics, but only partially.

The problem is that the existing index leaf blocks statistics get overwritten by the Dynamic Sampling code, but the clustering factor is not.

You can see this from the following optimizer trace snippet:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
Table: T6 Alias: T6 (NOT ANALYZED)
#Rows: 82 #Blks: 1 AvgRowLen: 100.00
Index Stats::
Index: IND_T6_C2 Col#: 2
LVLS: 0 #LB: 0 #DK: 0 LB/K: 0.00 DB/K: 0.00 CLUF: 0.00

Access path analysis for T6
***************************************
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for T6[T6]

*** 2012-02-17 16:28:49.182
** Performing dynamic sampling initial checks. **
** Dynamic sampling initial checks returning TRUE (level = 2).
** Dynamic sampling updated index stats.: IND_T6_C2, blocks=227
** Dynamic sampling index access candidate : IND_T6_C2
** Dynamic sampling updated table stats.: blocks=4529

So although the index has statistics gathered (there is no (NOT ANALYZED) mentioned for the index) the Dynamic Sampling updates the index stats "blocks".

This is not consistent with the expected behaviour of Dynamic Sampling, because by default it is not supposed to overwrite existing statistics. If you want to force Dynamic Sampling to do so you need to specify the (not really) documented table level hint DYNAMIC_SAMPLING_EST_CDN.
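
As a sketch (using the table from the test case below), the hint usage would look something like this:

select /*+ dynamic_sampling(t 4) dynamic_sampling_est_cdn(t) */ *
  from t
 where c2 = 1;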

However, another pretty important number, the clustering factor of the index, doesn't get updated. Since the clustering factor is important for accessing the table via an index any execution plan that needs to visit the table in such a way will now be costed inconsistently: The index access part will be based on the index block statistics updated by Dynamic Sampling whereas the table access part will be costed using the clustering factor (and potentially other existing index statistics left unmodified by Dynamic Sampling).

This will potentially lead to rather unreasonable plans, as the following test case demonstrates:

set echo on linesize 200

drop table t;

purge table t;

create table t (
c1 number not null,
c2 number not null,
c3 varchar2(300) not null);

create index t_idx on t(c2);

create index t_pk on t(c1);

insert into
t
select
rownum as c1,
1 as c2,
rpad('A',300,'A') as c3
from
dual
connect by
level<=100000;

commit;

alter session set optimizer_dynamic_sampling = 8;

explain plan for
select * from t where c2 = 1;

select * from table(dbms_xplan.display);

select index_name, blevel, num_rows, leaf_blocks, distinct_keys, clustering_factor from user_ind_statistics where table_name = 'T';

exec dbms_stats.delete_index_stats(null, 't_idx', no_invalidate=>false)

explain plan for
select * from t where c2 = 1;

select * from table(dbms_xplan.display);

exec dbms_stats.gather_index_stats(null, 't_idx', no_invalidate=>false)

explain plan for
select /*+ index(t(c2)) */ * from t where c2 = 1;

select * from table(dbms_xplan.display);

I get the following execution plans from 11.2.0.1:

SQL>
SQL> alter session set optimizer_dynamic_sampling = 8;

Session altered.

SQL>
SQL> explain plan for
2 select * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   100K|    16M|   177   (0)| 00:00:03 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |   100K|    16M|   177   (0)| 00:00:03 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |   100K|       |   177   (0)| 00:00:03 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

SQL>
SQL> select index_name, blevel, num_rows, leaf_blocks, distinct_keys, clustering_factor from user_ind_statistics where table_name = 'T';

INDEX_NAME                         BLEVEL   NUM_ROWS LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ---------- ---------- ----------- ------------- -----------------
T_IDX                                   0          0           0             0                 0
T_PK                                    0          0           0             0                 0

SQL>
SQL> exec dbms_stats.delete_index_stats(null, 't_idx', no_invalidate=>false)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   100K|    16M|   978   (1)| 00:00:12 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |   100K|    16M|   978   (1)| 00:00:12 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |   100K|       |   177   (0)| 00:00:03 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

SQL>
SQL> exec dbms_stats.gather_index_stats(null, 't_idx', no_invalidate=>false)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select /*+ index(t(c2)) */ * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |   100K|    16M|  4533   (1)| 00:00:55 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |   100K|    16M|  4533   (1)| 00:00:55 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |   100K|       |   179   (2)| 00:00:03 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

Notice how, for the first execution plan, the optimizer gets a reasonable cardinality estimate from the Dynamic Sampling (due to the increased level it is spot on in this case) but still thinks that reading 100,000 rows from a 100,000 row table via an index is a good idea.

While the index access is costed reasonably based on the updated index blocks statistics, the table access is "cost-free" due to the clustering factor left unmodified at 0.

When deleting the index statistics we can see that a default clustering factor of 800 gets assumed (you can see this in the optimizer trace file), which still makes the index access more favourable.

Only with index statistics gathered does a more reasonable plan get selected.

There is a bug filed against 11.2 (bug no 12942119 "Suboptimal plan if index has statistics but table has no statistics and dynamic sampling occurs") that seems to address this issue, but it is not clear from the bug description what exactly the bug fix does. It hasn't made it into the 11.2.0.3 patch set, but it is available as a one-off patch and as part of some Windows platform intermediate patch sets.

3. 11.2.0.3 Change In Behaviour

The 11.2.0.3 patch set introduces another interesting change in behaviour: as I've just demonstrated, even with index statistics missing (and hence consistent optimizer behaviour) the default clustering factor assumed might still favour unreasonable execution plans.

There is a bug fix 12399886: "OPTIMIZER STILL USES DEFAULT INDEX STATISTICS AFTER DYNAMIC SAMPLING WAS USED", enabled by default in 11.2.0.3, that addresses this issue and uses a different clustering factor derived from the Dynamic Sampling results (it turns out to be based on the blocks of the table, not the rows – see below for more details).

This is a good thing in principle because the cost estimates now might be closer to reality, but as always this might cause a lot of headaches after applying the patch set because execution plans might change. This applies of course only to those scenarios that rely on Dynamic Sampling and can make use of indexes.
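
To check whether the fix is exposed as a fix control on a particular system (it may or may not be listed, depending on version and patch level), you can query the fix-control view:

select bugno, value, description
  from v$system_fix_control
 where bugno = 12399886;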

Note that the inconsistent behaviour described under 2. is still there in 11.2.0.3, so tables without statistics having index statistics gathered still don't mix very well.

Here are the execution plans that I got from 11.2.0.3 for above test case:

SQL> alter session set optimizer_dynamic_sampling = 8;

Session altered.

SQL>
SQL> explain plan for
2 select * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 98120 |    16M|   370   (0)| 00:00:05 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 98120 |    16M|   370   (0)| 00:00:05 |
|*  2 |   INDEX RANGE SCAN          | T_IDX | 98120 |       |   370   (0)| 00:00:05 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

SQL>
SQL> select index_name, blevel, num_rows, leaf_blocks, distinct_keys, clustering_factor from user_ind_statistics where table_name = 'T';

INDEX_NAME                         BLEVEL   NUM_ROWS LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ---------- ---------- ----------- ------------- -----------------
T_IDX                                   0          0           0             0                 0
T_PK                                    0          0           0             0                 0

SQL>
SQL> exec dbms_stats.delete_index_stats(null, 't_idx', no_invalidate=>false)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 98120 |    16M|  1230   (1)| 00:00:15 |
|*  1 |  TABLE ACCESS FULL| T    | 98120 |    16M|  1230   (1)| 00:00:15 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

17 rows selected.

SQL>
SQL> explain plan for
2 select /*+ index(t(c2)) */ * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 98120 |    16M| 36602   (1)| 00:07:20 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 98120 |    16M| 36602   (1)| 00:07:20 |
|*  2 |   INDEX RANGE SCAN          | T_IDX | 98120 |       |   370   (0)| 00:00:05 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

SQL>
SQL> exec dbms_stats.gather_index_stats(null, 't_idx', no_invalidate=>false)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select /*+ index(t(c2)) */ * from t where c2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------

Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       | 98120 |    16M|  4898   (1)| 00:00:59 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     | 98120 |    16M|  4898   (1)| 00:00:59 |
|*  2 |   INDEX RANGE SCAN          | T_IDX | 98120 |       |   371   (1)| 00:00:05 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("C2"=1)

Note
-----
- dynamic sampling used for this statement (level=8)

18 rows selected.

Notice that 11.2.0.3 suffers from the same cost-free table access with the zero index statistics in place, but after deleting the index statistics a full table scan will be chosen. It looks like the new clustering factor is based on table blocks * 8 rather than a hard coded value of 800. Both constants (8, 800) might depend on the default block size - these tests were performed on 8KB.

Weather

I thought it was pretty cold in Sarajevo a couple of weeks ago, and therefore fairly mild in Minneapolis and Denver when the temperature was only just around freezing point – but after reading this report I don’t think I’m ever going to say anything more about cold weather.

(Note: with my scientific/skeptic hat on, I have yet to be convinced that the story is completely true – there is one obvious weak point.)