Top 60 Oracle Blogs

Recent comments

June 2011

AWR mining – I/O wait histograms

UPDATE: July 12, 2011  I just noticed (!) that the DBA_HIST_EVENT_HISTOGRAM is only on 11g, so for 10g, it requires collecting the histogram deltas manually off of V$EVENT_HISTOGRAM, something like

set pagesize 150
col event format a25
col tm format a14
select event,
       to_char(sysdate,'J')||to_char(round(sysdate-trunc(sysdate),5)) tm,
from v$event_histogram
where event in ('log file parallel write' ,
                'db file scattered read' ,
                'db file sequential read' )
order by event;

and collecting this information every N seconds and taking the deltas. Also if you want the buckets in columns instead of rows:

    to_char(sysdate,'J')||to_char(round(sysdate-trunc(sysdate),5)) tm,
    sum (case when WAIT_TIME_MILLI=1 then WAIT_COUNT else 0 end) b1,
    sum (case when WAIT_TIME_MILLI=2 then WAIT_COUNT else 0 end) b2,
    sum (case when WAIT_TIME_MILLI=4 then WAIT_COUNT else 0 end) b3,
    sum (case when WAIT_TIME_MILLI=8 then WAIT_COUNT else 0 end) b4,
    sum (case when WAIT_TIME_MILLI=16 then WAIT_COUNT else 0 end) b5,
    sum (case when WAIT_TIME_MILLI=32 then WAIT_COUNT else 0 end) b6,
    sum (case when WAIT_TIME_MILLI=64 then WAIT_COUNT else 0 end) b7,
    sum (case when WAIT_TIME_MILLI=128 then WAIT_COUNT else 0 end) b8,
    sum (case when WAIT_TIME_MILLI=256 then WAIT_COUNT else 0 end) b9,
    sum (case when WAIT_TIME_MILLI=512 then WAIT_COUNT else 0 end) b10,
    sum (case when WAIT_TIME_MILLI=1024 then WAIT_COUNT else 0 end) b11,
    sum (case when WAIT_TIME_MILLI=2048 then WAIT_COUNT else 0 end) b12,
    sum (case when WAIT_TIME_MILLI=4096 then WAIT_COUNT else 0 end) b13,
    sum (case when WAIT_TIME_MILLI=8192 then WAIT_COUNT else 0 end) b14,
    sum (case when WAIT_TIME_MILLI=16384 then WAIT_COUNT else 0 end) b15,
    sum (case when WAIT_TIME_MILLI=32768 then WAIT_COUNT else 0 end) b16,
    sum (case when WAIT_TIME_MILLI=65536 then WAIT_COUNT else 0 end) b17,
    sum (case when WAIT_TIME_MILLI=131072 then WAIT_COUNT else 0 end) b18,
    sum (case when WAIT_TIME_MILLI=262144 then WAIT_COUNT else 0 end) b19,
    sum (case when WAIT_TIME_MILLI=524288 then WAIT_COUNT else 0 end) b20,
    sum (case when WAIT_TIME_MILLI=1048576 then WAIT_COUNT else 0 end) b21,
    sum (case when WAIT_TIME_MILLI=2097152 then WAIT_COUNT else 0 end) b22,
    sum (case when WAIT_TIME_MILLI=4194304 then WAIT_COUNT else 0 end) b23,
    sum (case when WAIT_TIME_MILLI > 4194304 then WAIT_COUNT else 0 end) b24
from v$event_histogram
where event='log file parallel write'
group by event,to_char(sysdate,'J')||to_char(round(sysdate-trunc(sysdate),5)) ;

Exploring I/O histogram values in AWR. In the AWR repository I have several databases, all reading off the same LUNs, so below I try to  take the deltas between each snapshot and sum up all the waits for each bucket in the histogram

set pagesize  50000
set feedback off
set echo off
SET markup HTML on
spool hist.html
with hist as  (
       to_char(trunc(cast(begin_interval_time as date ))+
         (ROUND ((cast(begin_interval_time as  date) -
         TRUNC (cast(begin_interval_time as date))) * 24) / 24),
        'YYYY/MM/DD HH24:MI') btime,
  from dba_hist_event_histogram  h,
       dba_hist_snapshot sn
         h.instance_number = 1
     and sn.instance_number = 1
     and h.event_name like 'db file seq%'
     and sn.snap_id=h.snap_id
     and sn.dbid=h.dbid
select  a.btime,
        sum(b.wait_count - a.wait_count)
from hist a,
     hist b
where a.dbid=b.dbid
  and a.snap_id=b.snap_id-1
  and a.wait_time_milli = b.wait_time_milli
group by a.btime, a.wait_time_milli
having  sum(b.wait_count - a.wait_count) > 0;
spool off
SET markup HTML off

With the HTML output it’s easy to read into excel.
The data can be graphed in Excel using pivot tables or pivot charts, but if I simply save the data in an excel worksheet, then I can open it up in Tableau, which in some circumstances, can be easier to use than excel

The Y axis is in log scale. I don’t find that the data speaks to me immediately (other than there are some seriously slow I/Os) but it is easy to see the major outliers.

One advantage of Tableau is the easy color coordination. Using the “cyclic” pallet, the colors come out pretty good for the I/O histograms (this would be time consuming on Excel)

Security: It’s always the silly things that get you…

I had to laugh when I read this story about Amazon Web Services. It’s posted with an attention grabbing title that implies this is an Amazon problem, but it is squarely down to user error/oversight.

Luckily I’ve not fallen into this trap yet, but I have done equally silly things in the past. That reminds me, I must go to a Pete Finnigan session next time we are at the same conference… :)



HTML with Embedded Images from PL/SQL…

Someone asked me a question about generating HTML with embedded images from PL/SQL and my answer started to spiral out of control, so I figured I’d just write it as an article.



AWR mining

Update: nice query from Jonathan Lewis on AWR:

I found this while  looking for how to eliminate reporting stats over database restarts. Apparently if using snap_id then there will be no snap_id-1 over restarts because the ids will jump more than one across restarts – have to check into this

Exploring ways to mine data in AWR.
One situation I’m looking at now is a number of database that all use the same LUNs on the back end storage, so two metrics that would be interesting to look at are IOs from these databases and the IO latencies to answer the question, “does read and/or write I/O activity correlate to increased latency on the backend storage?”

I have multiple databases in one AWR repository.

I want to create the output in a format that I can read with something like Excel, so I make the values comma delimited,
AWRs values for sysmetric are hourly, so I want to round the times to the nearest hour to make correlation easier and I want to translate the DBID into the database name to make reading the data easier. I could do something like:

                   'Physical Write Total Bytes Per Sec','write','read')||','||
                ( ROUND ((begin_time - TRUNC (begin_time)) * 24) / 24),
                       'YYYY/MM/DD HH24:MI')||','||
        from dba_hist_sysmetric_summary s,
               (select distinct dbid, db_name, host_name
                from dba_hist_database_instance) n
               ( s.metric_name= 'Physical Write Total Bytes Per Sec'
                 s.metric_name= 'Physical Read Total Bytes Per Sec'
               ) and n.dbid=s.dbid
               order by begin_time;

For the I/O latencies it take some more work as I have to compute the deltas between snapshots getting the total elasped time and the count of waits to compute the average wait over the snapshot period for each I/O wait event. I also have to watch out for database bounces, which I have inelligantly addressed by simply filtering only for waits less than or equal to 0. I could do something like

       nvl(round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),3),0) avg_ms
from (
          to_char(trunc(cast(begin_interval_time as date ))+
                  (ROUND ((cast(begin_interval_time as  date)- TRUNC (cast(begin_interval_time as date))) * 24) / 24),
                       'YYYY/MM/DD HH24:MI') btime,
          Lag (e.total_waits) OVER( PARTITION BY e.event_name, e.dbid ORDER BY s.snap_id)
          total_waits count_end,
          Lag (e.time_waited_micro/1000) OVER( PARTITION BY e.event_name,e.dbid ORDER BY s.snap_id)
          time_waited_micro/1000 time_ms_end
            s.snap_id=e.snap_id and
            s.dbid=e.dbid and
            (e.event_name= 'db file sequential read'
          or e.event_name='db file scattered read' )
   order by begin_interval_time
) s,
  (select distinct dbid, db_name, host_name from dba_hist_database_instance) n
       round((time_ms_end-time_ms_beg)/nullif(count_end-count_beg,0),3)  >= 0
order by btime;

Correlating the read/write throughput along with single and multiblock read latencies across multiple database over multiple days can pose challenges. One strategy is to boil the data down to a smaller number of metrics like total read, total write to average single block I/O. For average single block I/O, I’d want to normalize the I/O latency relative to read I/O for that database. Maybe I could get an average of latencies weighted by throughput of that database
Work in progress …


A question came up on Oracle-L recently about the difference in work done by the following two queries:



Before saying anything else, I should point out that these two queries are NOT logically equivalent unless you can guarantee that the table() operator returns a unique set of values – and Oracle doesn’t allow uniqueness to be enforced on collections.

However, the author of the question had ensured that the results from the table() operator contained no duplicates so the two versions of the query returned the same result set. The question is, why does one query do fewer buffer visits than the other – as evidenced by the results from sql_trace and tkprof.

Rule-based Join

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.86       0.86          0     200047          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.87       0.87          0     200047          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS  (cr=200047 pr=0 pw=0 time=6355 us)
  99704   COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=284 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=200047 pr=0 pw=0 time=3191 us)(object id 54309)

CBO with subquery

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.78       0.78          0     157986          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.80       0.80          0     157986          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS (cr=157986 pr=0 pw=0 time=4720 us cost=536 size=6141 card=267)
  99704   SORT UNIQUE (cr=0 pr=0 pw=0 time=417 us)
  99704    COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=259 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=157986 pr=0 pw=0 time=3353 us cost=2 size=21 card=1)(object id 54309)

Notice how the optimizer has unnested the subquery and introduced a “sort unique” – this demonstrates the optimizer’s recognition that the two queries are not logically identical. However, the number of rows from the collection is the same (99,704) and the number of rows after joining is the same (115,195) – the data sets have been rigged so that it is a fair comparison. So why does the explicit join take 200,047 buffer visits when the transformed  subquery approach take only 157,986 buffer visits.

The answer is that you can visit buffers without doing “logical I/Os” – and this benefit accrues most frequently to indexed access paths. Thanks to the way Oracle has sorted the collection before doing the join the nested loop follows the second index (dom_name_idx) in order – this increases the chance that index blocks that were used in the previous cycle of the loop will be re-used in the next cycle, which means that Oracle need not release the pins on the index blocks, but can revisit them incrementing the statitsic: “buffer is pinned count”.

I have to say that I haven’t proved that this is exactly what was happening, but when I suggested to the person who had asked the question that he re-run and check to see if the decrease in “session logical reads” was balanced by an increase in “buffer is pinned count” the answer seemed to confirm the hypothesis. [Correction (see comment #1 from Randolf): the results didn't confirm the hypothesis - but I think that' s because of the change in "pinning" that Randolf describes, so I'll have to find a way to confirm the hypothesis some other time.]


For Exadata geeks

Just letting you know that we are all (almost) done with the Expert Oracle Exadata book work and it will be published in less than a month!!!

Expect (many) new blog entries soon! :-)

ITL Waits – Changes in Recent Releases (script)

A reader of this blog, Paresh, asked me how I was able to find out the logic behind ITL waits without having access to Oracle code. My reply was: I wrote a test case that reproduce ITL waits and a piece of code that monitors them.

Since other readers might be interested, here is the shell script I wrote. Notice that it takes four parameters as input: user name, password, SID, and how long it has to wait in the monitoring phase.

Addressing TM Enqueue Contention – What is Wrong with this Quote?

June 19, 2011 Currently, the most viewed article in the past 90 days on this blog demonstrates a deadlock situation that is present in Oracle Database and above that is not present in Oracle Database through (someone else might be able to check the earlier releases).  This particular deadlock was the result of [...]

A Few Random Notes

Just a few random notes. Oracle Mix have the Oracle OpenWorld 2011 Suggest-A-Session again this year with lots of good presentations as always. Follow the link to vote for my Q & A session on Oracle Indexing. Jonathan Lewis has an interesting quiz on Oracle Indexes in answer to a question from the OTN forums: “If I delete [...]

Using Agile Practices to Create an Agile Presentation

What’s the best way to make a presentation on Agile practices? Practice Agile practices.

You could write a presentation “big bang” style, delivering version 1.0 in front of your big audience of 200+ people at Kscope 2011 before anybody has seen it. Of course, if you do it that way, you build a lot of risk into your product. But what else can you do?

You can execute the Agile practices of releasing early and often, allowing the reception of your product to guide its design. Whenever you find an aspect of your product that doesn’t get the enthusiastic reception you had hoped for, you fix it for the next release.