Oakies Blog Aggregator

Oracle’s latest acquisition: Me

I’m definitely the type of person that gets excited by new opportunities and always loves a new challenge. Without challenge, I get bored quickly and boredom makes me a little crazy.

So, this new opportunity came along a little while ago and I thought it sounded just perfect for me. Many of you that know me will recall that I’ve had trouble finding the right company that fits with all aspects of my personality, goals, and philosophy which has led me to “try” a few of them in the past several years. I don’t regret the choices I’ve made and I’ve learned an awful lot from each of my employers. Most importantly, I’ve created new relationships at each of my past companies that I still maintain today.

In looking at all the past experiences, I’ve concluded that I am ready for a change in direction. Things I enjoy:
Oracle products
People (customers and Oracle employees) that work with Oracle software
RAC
High Availability
Large, complex environments
Servers, OS, and storage infrastructures
Working with lots of new people, especially creating, managing, and growing new relationships
Presenting my knowledge to others, both one-on-one and to groups (and conference events that surround such gatherings)
A little bit of travel (which helps support my love of…)
Scuba diving

So, when an opportunity came along to get paid to do something that combines almost all of the things I enjoy (except scuba), I couldn’t pass it up. Starting on May 18, 2009, I will be the newest member of the new group at Oracle that’s known as the “X-Team”. This team is responsible for working with customers and prospective customers of the Oracle Exadata and HP Oracle Database Machine products to help them adopt these new technologies. The group is a part of the Maximum Availability Architecture group at Oracle that authors many of the best practices whitepapers and presentations you have likely seen online. For those at Oracle that know what this means, the group is a part of the Server Technology development organization under Juan Loaiza.

For those of you that have been my past consulting customers, first of all, Thank You. I’m no longer consulting and while I won’t be able to provide an “Oracle-sanctioned” recommendation to other consultants that may be able to help, I do have a large network of friends and one of them can likely help you. Please never hesitate to keep in touch!

On a logistical note, I’m not moving and will hopefully continue to be involved with local events in Chicagoland. However, I will be traveling part of the time to visit customers and other Oracle facilities, so keep an eye on my Twitter feed, Brightkite location, and TripIt plans and let me know if there’s a chance to have a meeting IRL.

Finally, this decision to join Oracle means that I’ll be sacrificing several things. First and probably most near and dear to me is the RAC SIG. In September 2008, I took over as the RAC SIG President. The RAC SIG is as strong as ever and there is a good group of volunteers involved in leading the group as it continues to grow and evolve. I’ll always be a member of the RAC SIG and will continue to watch it closely and volunteer when and where I can. The RAC SIG is associated with the IOUG, the Independent Oracle User Group, and Oracle employees shouldn’t be too involved in “independent” groups. So, this year, the RAC SIG will once again elect a new president. I will remain president until Oracle Open World in October 2009 in order to provide continuity to the group’s leadership and ensure a smooth transition. You can nominate yourself for a RAC SIG office via our website nomination form (nominations will open soon and stay open until July 31, 2009).

I’m also going to relinquish my appointment as an Oracle ACE Director. While I think I’ll still be considered an Oracle Employee ACE, I’ll remember fondly the fame that Oracle Technology Network affords the Oracle ACE program and the individuals that are given the honor. Thanks to Justin, Vikki, Lillian, Todd, and the others at Oracle for allowing me to be a part of that program. I’ll certainly miss the perks!

That’s about it for now. I’m off to the new job and will once again begin learning. Luckily, I’m apparently the only person named Dan Norris at Oracle (last someone checked for me), so you can contact me at dan.norris@oracle.com in a couple of weeks.

growing pains ...

My role at work has been shifting over the past six months and I'm still not sure how I feel about this. In theory, I'm supposed to be 'architecting' but most of the time, I feel like I'm somewhere between a technical writer and a hostage negotiator, with the hostage alternating at turns between the integrity of the database and my sanity. Tensions have been high for everyone, the project is

Understanding the different modes of System Statistics aka. CPU Costing and the effects of multiple blocksizes - part 2

Back to part 1 Forward to part 3

Before heading on to the remaining modes of system statistics, let's summarize what has been observed in part 1 regarding the default NOWORKLOAD system statistics in 10g and later. The following table shows what the test case from the previous post demonstrated:

Table 1: 8KB MSSM locally managed tablespace 10,000 blocks table segment
default NOWORKLOAD system statistics:

MBRC | SREADTIM | MREADTIM | MREADTIM/SREADTIM | NOCPU cost | adjusted MBRC | CPU cost | CPU/NOCPU cost
-----|----------|----------|-------------------|------------|---------------|----------|---------------
   8 |       12 |       26 |              2.16 |      1,518 |          6.59 |    2,709 |           1.78
  16 |       12 |       42 |              3.5  |        962 |         10.39 |    2,188 |           2.27
  32 |       12 |       74 |              6.16 |        610 |         16.39 |    1,928 |           3.16
  64 |       12 |      138 |             11.5  |        387 |         25.84 |    1,798 |           4.64
 128 |       12 |      266 |             22.16 |        245 |         40.82 |    1,732 |           7.07

If you happen to have a 16KB default block size, the results look like the following. Note that the table is now only 5,000 blocks in size and the SREADTIM is a bit longer (10 + 16384/4096 = 14ms instead of 10 + 8192/4096 = 12ms). As a result, the 16KB block size calculation makes the full table scan look a bit cheaper to the optimizer when using the default NOWORKLOAD system statistics.

Table 2: 16KB MSSM locally managed tablespace 5,000 blocks table segment
default NOWORKLOAD system statistics:

MBRC | SREADTIM | MREADTIM | MREADTIM/SREADTIM | NOCPU cost | adjusted MBRC | CPU cost | CPU/NOCPU cost
-----|----------|----------|-------------------|------------|---------------|----------|---------------
   4 |       14 |       26 |              1.86 |      1,119 |          4.17 |    2,322 |           2.08
   8 |       14 |       42 |              3.0  |        759 |          6.59 |    1,875 |           2.47
  16 |       14 |       74 |              5.3  |        481 |         10.39 |    1,652 |           3.43
  32 |       14 |      138 |              9.86 |        305 |         16.39 |    1,540 |           5.05
  64 |       14 |      266 |             19.0  |        194 |         25.84 |    1,485 |           7.65
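
The SREADTIM, MREADTIM, ratio and "CPU cost" columns of both tables follow directly from the default values of 10 ms seek time and 4096 bytes per millisecond transfer speed. The following is a minimal Python sketch of that arithmetic, not Oracle code: the function names are made up for illustration, and the cost is left unrounded because the optimizer's exact rounding is not something this sketch tries to reproduce. The "NOCPU cost" and "adjusted MBRC" columns come from the traditional I/O costing and are not recomputed here.

```python
# Default NOWORKLOAD values as described in the text.
IOSEEKTIM = 10.0      # seek time in milliseconds
IOTFRSPEED = 4096.0   # transfer speed in bytes per millisecond


def sreadtim(block_size):
    """Synthesized single-block read time in milliseconds."""
    return IOSEEKTIM + block_size / IOTFRSPEED


def mreadtim(mbrc, block_size):
    """Synthesized multi-block read time in milliseconds."""
    return IOSEEKTIM + mbrc * block_size / IOTFRSPEED


def fts_io_cost(blocks, mbrc, block_size):
    """Unrounded full table scan I/O cost in units of single-block reads."""
    return blocks / mbrc * mreadtim(mbrc, block_size) / sreadtim(block_size)


def table_rows(blocks, block_size, mbrcs):
    """Return (mbrc, sreadtim, mreadtim, ratio, cost) tuples for one table."""
    rows = []
    for mbrc in mbrcs:
        s = sreadtim(block_size)
        m = mreadtim(mbrc, block_size)
        rows.append((mbrc, s, m, m / s, fts_io_cost(blocks, mbrc, block_size)))
    return rows


if __name__ == "__main__":
    cases = (
        ("8KB block size, 10,000 blocks", 10000, 8192, (8, 16, 32, 64, 128)),
        ("16KB block size, 5,000 blocks", 5000, 16384, (4, 8, 16, 32, 64)),
    )
    for title, blocks, block_size, mbrcs in cases:
        print(title)
        for mbrc, s, m, ratio, cost in table_rows(blocks, block_size, mbrcs):
            print(f"{mbrc:4d} {s:5.0f} {m:5.0f} {ratio:6.2f} {cost:9.2f}")
```

The unrounded costs land slightly below the whole-number values shown in the tables, which is consistent with the optimizer rounding the figures it reports.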

Gathered NOWORKLOAD system statistics

If you gather NOWORKLOAD system statistics using DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD'), the IOSEEKTIM and IOTFRSPEED values are actually measured and used instead of the defaults.

The SREADTIM and MREADTIM values are then derived from them in the same way as outlined above.

Gathering NOWORKLOAD statistics:

exec DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD')

This measures the IOTFRSPEED and IOSEEKTIM values in addition to CPUSPEEDNW, rather than using the default values of 4096 and 10.

In 10g and later this may take from a couple of seconds to a couple of minutes depending on the size of your database. Note that this puts additional load onto your system while gathering the NOWORKLOAD system statistics since it submits random reads against all data files.

The following test case shows the different I/O cost calculations when using default NOWORKLOAD system statistics and custom gathered NOWORKLOAD system statistics. It again creates the 10,000-block table in an 8KB default block size locally managed tablespace using manual segment space management:

SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.03
SQL>
SQL> create table t1
2 pctfree 99
3 pctused 1
4 -- tablespace test_2k
5 -- tablespace test_4k
6 tablespace test_8k
7 -- tablespace test_16k
8 as
9 with generator as (
10 select --+ materialize
11 rownum id
12 from all_objects
13 where rownum <= 3000
14 )
15 select
16 /*+ ordered use_nl(v2) */
17 rownum id,
18 trunc(100 * dbms_random.normal) val,
19 rpad('x',100) padding
20 from
21 generator v1,
22 generator v2
23 where
24 rownum <= 10000
25 ;

Table created.

Elapsed: 00:00:02.22
SQL>
SQL>
SQL> begin
2 dbms_stats.gather_table_stats(
3 user,
4 't1',
5 cascade => true,
6 estimate_percent => null,
7 method_opt => 'for all columns size 1'
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.29
SQL> -- default NOWORKLOAD system statistics
SQL> -- ignore CPU costs for the moment
SQL> begin
2 dbms_stats.delete_system_stats;
3 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
4 end;
5 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 04-26-2009 14:21
SYSSTATS_INFO DSTOP 04-26-2009 14:21
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 10
SYSSTATS_MAIN IOTFRSPEED 4096
SYSSTATS_MAIN SREADTIM
SYSSTATS_MAIN MREADTIM
SYSSTATS_MAIN CPUSPEED
SYSSTATS_MAIN MBRC
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> alter session set "_table_scan_cost_plus_one" = false;

Session altered.

Elapsed: 00:00:00.02
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.01
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2709 (0)| 00:00:33 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2709 (0)| 00:00:33 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.08
SQL> -- gather NOWORKLOAD system statistics
SQL> exec dbms_stats.gather_system_stats('NOWORKLOAD')

PL/SQL procedure successfully completed.

Elapsed: 00:00:14.43
SQL> -- ignore CPU costs for the moment
SQL> begin
2 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
3 end;
4 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 04-26-2009 14:21
SYSSTATS_INFO DSTOP 04-26-2009 14:21
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 14.226
SYSSTATS_MAIN IOTFRSPEED 32517.754
SYSSTATS_MAIN SREADTIM
SYSSTATS_MAIN MREADTIM
SYSSTATS_MAIN CPUSPEED
SYSSTATS_MAIN MBRC
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.02
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1403 (0)| 00:00:21 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1403 (0)| 00:00:21 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.04
SQL>
SQL> spool off

Based on the gathered IOSEEKTIM and IOTFRSPEED values the I/O cost calculated is significantly different.

Applying the known formulas we can reproduce the calculated figures:

SREADTIM = IOSEEKTIM + DB_BLOCK_SIZE / IOTFRSPEED

SREADTIM = 14.226 + 8192 / 32,517.754 = 14.478

MREADTIM = IOSEEKTIM + MBRC * DB_BLOCK_SIZE / IOTFRSPEED

MREADTIM = 14.226 + 8 * 8192 / 32,517.754 = 16.241

FTS cost = Blocks below HWM / MBRC * MREADTIM / SREADTIM

FTS cost = 10,000 / 8 * 16.241 / 14.478 = 1,403
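
The same arithmetic can be checked with a few lines of Python. This is only a sketch of the formulas quoted above, using the IOSEEKTIM and IOTFRSPEED values from the test output; the function name is hypothetical and the cost is returned unrounded.

```python
def noworkload_fts_cost(blocks, mbrc, block_size, ioseektim, iotfrspeed):
    """Apply the NOWORKLOAD formulas and return (sreadtim, mreadtim, cost).

    Times are in milliseconds; the cost is the unrounded full table scan
    I/O cost expressed in single-block read units.
    """
    sreadtim = ioseektim + block_size / iotfrspeed
    mreadtim = ioseektim + mbrc * block_size / iotfrspeed
    cost = blocks / mbrc * mreadtim / sreadtim
    return sreadtim, mreadtim, cost


if __name__ == "__main__":
    # Gathered values taken from the sys.aux_stats$ output of the test case.
    s, m, cost = noworkload_fts_cost(10000, 8, 8192, 14.226, 32517.754)
    print(f"SREADTIM = {s:.3f}")
    print(f"MREADTIM = {m:.3f}")
    print(f"FTS cost = {cost:.2f}")
```

The unrounded cost comes out between 1,402 and 1,403, in line with the 1,403 reported in the plan.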

Gathered WORKLOAD system statistics

Gathering WORKLOAD system statistics measures a different set of values, including the actual MBRC, SREADTIM and MREADTIM values. The cost calculation therefore no longer uses the synthesized SREADTIM and MREADTIM values, nor (in 10g and later) the "_db_file_optimizer_read_count" parameter; it simply uses the measured values.

The I/O costs calculated with WORKLOAD system statistics therefore do not depend on the "db_file_multiblock_read_count" value used for costing. The important point to keep in mind, however, is that the WORKLOAD system statistics are gathered while the "db_file_multiblock_read_count" value (in 10g and later the internal parameter "_db_file_exec_read_count") is in effect at runtime, so the measured values are obviously influenced by this setting. "_db_file_exec_read_count" equals "db_file_multiblock_read_count" if the latter has been set and the underscore parameter hasn't been modified.

As already mentioned in part 1, starting with Oracle 10.2 different values are used for cost calculation and at execution time if "db_file_multiblock_read_count" is left unset (8 for cost calculation and the largest possible I/O size at runtime, usually 1MB on most platforms). In general this points in the right direction, since it allows the calibration code to work out the largest MBRC that can be achieved at runtime. Note that Christian Antognini disagrees with this approach in his book "Troubleshooting Oracle Performance", where he advises working out the "optimal" MBRC setting manually by running suitable I/O tests.

Note that in 10g and later the runtime engine still uses the "_db_file_exec_read_count", regardless of the MBRC used to calculate the cost.

If you run the following code snippet in 10g and later and check the resulting trace files, you'll see this confirmed:

alter session set tracefile_identifier = 'exec_count_16';

alter session set "_db_file_exec_read_count" = 16;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 8';

select max(val)
from t1;

alter session set events '10046 trace name context off';

alter session set tracefile_identifier = 'exec_count_128';

alter session set "_db_file_exec_read_count" = 128;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 8';

select max(val)
from t1;

alter session set events '10046 trace name context off';

The resulting trace files look like the following:

The 16 blocks setting:

.
.
.
WAIT #2: nam='db file scattered read' ela= 1732 file#=8 block#=12058 blocks=16 obj#=62088 tim=69006657688
WAIT #2: nam='db file scattered read' ela= 1725 file#=8 block#=12074 blocks=16 obj#=62088 tim=69006659628
WAIT #2: nam='db file scattered read' ela= 1726 file#=8 block#=12090 blocks=16 obj#=62088 tim=69006661566
.
.
.

The 128 blocks setting:

.
.
.
WAIT #2: nam='db file scattered read' ela= 13842 file#=8 block#=12169 blocks=128 obj#=62088 tim=69008775308
WAIT #2: nam='db file scattered read' ela= 15513 file#=8 block#=12297 blocks=128 obj#=62088 tim=69008793460
WAIT #2: nam='db file scattered read' ela= 26437 file#=8 block#=12425 blocks=128 obj#=62088 tim=69008822434
.
.
.
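
To compare the two runs without reading the raw trace, the wait lines can be summarized per request size. The following Python sketch is an illustration only: the regular expression and function name are my own, it handles just the WAIT line layout shown above, and it assumes that "ela" is reported in microseconds.

```python
import re
from collections import defaultdict

# Matches the WAIT lines shown above; other trace line layouts are ignored.
WAIT_RE = re.compile(
    r"WAIT #\d+: nam='(?P<event>[^']+)' ela= *(?P<ela>\d+) .*?\bblocks=(?P<blocks>\d+)"
)

SAMPLE_TRACE = """\
WAIT #2: nam='db file scattered read' ela= 1732 file#=8 block#=12058 blocks=16 obj#=62088 tim=69006657688
WAIT #2: nam='db file scattered read' ela= 1725 file#=8 block#=12074 blocks=16 obj#=62088 tim=69006659628
WAIT #2: nam='db file scattered read' ela= 1726 file#=8 block#=12090 blocks=16 obj#=62088 tim=69006661566
WAIT #2: nam='db file scattered read' ela= 13842 file#=8 block#=12169 blocks=128 obj#=62088 tim=69008775308
WAIT #2: nam='db file scattered read' ela= 15513 file#=8 block#=12297 blocks=128 obj#=62088 tim=69008793460
WAIT #2: nam='db file scattered read' ela= 26437 file#=8 block#=12425 blocks=128 obj#=62088 tim=69008822434
""".splitlines()


def summarize_scattered_reads(lines):
    """Group 'db file scattered read' waits by request size in blocks.

    Returns a dict mapping blocks-per-request to a tuple of
    (number of waits, average elapsed time as reported in 'ela').
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = WAIT_RE.search(line)
        if not match or match.group("event") != "db file scattered read":
            continue
        entry = totals[int(match.group("blocks"))]
        entry[0] += 1
        entry[1] += int(match.group("ela"))
    return {blocks: (count, total / count)
            for blocks, (count, total) in totals.items()}


if __name__ == "__main__":
    for blocks, (count, avg_ela) in sorted(summarize_scattered_reads(SAMPLE_TRACE).items()):
        print(f"{blocks:4d} blocks per read: {count} waits, average ela {avg_ela:.0f}")
```

For a real trace file you would pass the open file object instead of SAMPLE_TRACE, since the function only iterates over lines.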

Gathering WORKLOAD system statistics:

exec DBMS_STATS.GATHER_SYSTEM_STATS('START')
-- some significant (ideally "representative") workload needs to be performed
-- otherwise some or all of the measured values will be missing
exec DBMS_STATS.GATHER_SYSTEM_STATS('STOP')

or

exec DBMS_STATS.GATHER_SYSTEM_STATS('INTERVAL', <interval_in_minutes>)

Note that gathering workload system statistics doesn't put additional load onto your system, since the values are derived from the delta in statistics already maintained by Oracle during database activity.

Furthermore, if your workload doesn't use "db file scattered read", i.e. multi-block reads that work with the buffer cache, you might end up with WORKLOAD system statistics that are missing the MBRC and MREADTIM components (null values). This might happen if, for example, you use only index access paths with random table row lookups by ROWID, if all your table scans run in parallel, or if in 11g you use serial direct reads that bypass the buffer cache (which can be activated in pre-11g releases using the hidden parameter "_serial_direct_read").

The same applies to "db file sequential read", i.e. single-block reads: if you only perform multi-block reads in your workload, then the SREADTIM information might be missing from the gathered statistics.

For that case the official 10.2 documentation says the following:
"During the gathering process of workload statistics, it is possible that mbrc and mreadtim will not be gathered if no table scans are performed during serial workloads, as is often the case with OLTP systems. On the other hand, FTS occur frequently on DSS systems but may run parallel and bypass the buffer cache. In such cases, sreadtim will still be gathered since index lookup are performed using the buffer cache. If Oracle cannot gather or validate gathered mbrc or mreadtim, but has gathered sreadtim and cpuspeed, then only sreadtim and cpuspeed will be used for costing. FTS cost will be computed using analytical algorithm implemented in previous releases. Another alternative to computing mbrc and mreadtim is to force FTS in serial mode to allow the optimizer to gather the data."

And the 11g documentation says this:
"If Oracle Database cannot gather or validate gathered mbrc or mreadtim values, but has gathered sreadtim and cpuspeed values, then only the sreadtim and cpuspeed values are used for costing. In this case, the optimizer uses the value of the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT to cost a full table scan. However, if DB_FILE_MULTIBLOCK_READ_COUNT is not set or is set to 0 (zero), then the optimizer uses a value of 8 for costing."

But when testing this, it looked as if the optimizer simply reverted to the available NOWORKLOAD system statistics whenever either MBRC or MREADTIM was missing. (Note that this applies to 10g and later; I'll show in the next part of the series what happens in 9i, since things are different there.)

Note that in order for the optimizer to accept the WORKLOAD system statistics, the MREADTIM needs to be greater than the SREADTIM. If your multi-block read requests are served from a cache or your storage system performs aggressive read-aheads, the measured MREADTIM can be less than the SREADTIM. In this case you might need to adjust the MREADTIM manually using the GET_SYSTEM_STATS/SET_SYSTEM_STATS API, which will be covered below.

One interesting oddity showed up when MBRC was available but MREADTIM was missing or not greater than SREADTIM: in that case it looks as if the NOWORKLOAD statistics use the gathered MBRC in their calculations for synthesizing the MREADTIM and calculating the full table scan cost. This makes sense, but it is an interesting mixture of NOWORKLOAD and WORKLOAD system statistics.
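
The fallback behaviour described in the last few paragraphs can be written down as a small decision model. The Python sketch below models only what the tests described here appeared to show in 10g and later. It is not Oracle's actual implementation, and the function name, dictionary layout and default MBRC of 8 are assumptions made for illustration. Treating a missing SREADTIM as a reason to fall back is also an assumption, not something the tests above cover.

```python
def effective_fts_cost(blocks, block_size, stats, default_mbrc=8):
    """Model which system statistics drive the full table scan I/O cost.

    'stats' is a dict that may contain SREADTIM, MREADTIM, MBRC, IOSEEKTIM
    and IOTFRSPEED; missing or None entries mean "not gathered".
    Returns (mode, unrounded cost).
    """
    sreadtim = stats.get("SREADTIM")
    mreadtim = stats.get("MREADTIM")
    mbrc = stats.get("MBRC")

    # WORKLOAD values are only used if complete and MREADTIM > SREADTIM.
    # (Falling back on a missing SREADTIM is an assumption of this sketch.)
    if None not in (sreadtim, mreadtim, mbrc) and mreadtim > sreadtim:
        return "WORKLOAD", blocks / mbrc * mreadtim / sreadtim

    # Otherwise revert to NOWORKLOAD, using defaults where nothing was gathered.
    seek = stats.get("IOSEEKTIM")
    speed = stats.get("IOTFRSPEED")
    seek = 10.0 if seek is None else seek
    speed = 4096.0 if speed is None else speed

    # The observed oddity: a gathered MBRC is still used for the synthesis.
    mbrc = default_mbrc if mbrc is None else mbrc
    synth_sreadtim = seek + block_size / speed
    synth_mreadtim = seek + mbrc * block_size / speed
    return "NOWORKLOAD", blocks / mbrc * synth_mreadtim / synth_sreadtim


if __name__ == "__main__":
    too_low = {"SREADTIM": 8.021, "MREADTIM": 1.293, "MBRC": 8}
    corrected = {"SREADTIM": 8.021, "MREADTIM": 12.93, "MBRC": 8}
    print(effective_fts_cost(10000, 8192, too_low))
    print(effective_fts_cost(10000, 8192, corrected))
```

With an MREADTIM below the SREADTIM the model falls back to the synthesized NOWORKLOAD values; once the MREADTIM is raised above the SREADTIM the measured values are used.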

The following test case shows how to gather WORKLOAD system statistics, and how to manually correct an MREADTIM value that was gathered too low.

SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:02.44
SQL>
SQL> create table t1
2 pctfree 99
3 pctused 1
4 -- tablespace test_2k
5 -- tablespace test_4k
6 tablespace test_8k
7 -- tablespace test_16k
8 as
9 with generator as (
10 select --+ materialize
11 rownum id
12 from all_objects
13 where rownum <= 3000
14 )
15 select
16 /*+ ordered use_nl(v2) */
17 rownum id,
18 trunc(100 * dbms_random.normal) val,
19 rpad('x',100) padding
20 from
21 generator v1,
22 generator v2
23 where
24 rownum <= 10000
25 ;

Table created.

Elapsed: 00:00:02.27
SQL>
SQL> begin
2 dbms_stats.gather_table_stats(
3 user,
4 't1',
5 cascade => true,
6 estimate_percent => null,
7 method_opt => 'for all columns size 1'
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.60
SQL>
SQL> begin
2 dbms_stats.delete_system_stats;
3 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
4 end;
5 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
SQL>
SQL> alter session set "_table_scan_cost_plus_one" = false;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> exec dbms_stats.gather_system_stats('START')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.12
SQL>
SQL> begin
2 for i in 1..10 loop
3 for rec in (
4 select
5 max(val)
6 from
7 t1
8 ) loop
9 execute immediate 'alter system flush buffer_cache';
10 end loop;
11 end loop;
12 for rec in (
13 select /*+ use_nl(a t1) */ max(val) from t1,
14 (
15 select /*+ no_merge no_eliminate_oby */
16 rowid as row_id
17 from
18 t1
19 order by
20 dbms_random.value
21 ) a
22 where a.row_id = t1.rowid
23 ) loop
24 null;
25 end loop;
26 end;
27 /

PL/SQL procedure successfully completed.

Elapsed: 00:01:17.73
SQL>
SQL> exec dbms_stats.gather_system_stats('STOP')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.74
SQL>
SQL> begin
2 dbms_stats.set_system_stats('CPUSPEED',1000000);
3 end;
4 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> declare
2 s_status varchar2(200);
3 dt_dstart date;
4 dt_dstop date;
5 n_pvalue number;
6 begin
7 dbms_stats.get_system_stats (
8 s_status,
9 dt_dstart,
10 dt_dstop,
11 'MREADTIM',
12 n_pvalue);
13 dbms_output.put_line('Status: ' || s_status);
14 dbms_output.put_line('Dstart: ' || to_char(dt_dstart, 'DD.MM.YYYY HH24:MI:SS'));
15 dbms_output.put_line('Dstop : ' || to_char(dt_dstop , 'DD.MM.YYYY HH24:MI:SS'));
16 dbms_output.put_line('Value : ' || to_char(n_pvalue, 'TM'));
17 dbms_stats.set_system_stats('MREADTIM', 10 * n_pvalue);
18 end;
19 /
Status: COMPLETED
Dstart: 03.05.2009 13:24:00
Dstop : 03.05.2009 13:24:00
Value : 1.293

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.11
SQL>
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 05-03-2009 13:24
SYSSTATS_INFO DSTOP 05-03-2009 13:24
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 10
SYSSTATS_MAIN IOTFRSPEED 4096
SYSSTATS_MAIN SREADTIM 8.021
SYSSTATS_MAIN MREADTIM 12.93
SYSSTATS_MAIN CPUSPEED 1000000
SYSSTATS_MAIN MBRC 8
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.02
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2016 (0)| 00:00:17 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2016 (0)| 00:00:17 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.17
SQL>
SQL> spool off

Using the known formula we can confirm the cost calculation above:

Blocks below HWM / MBRC: Number of multi-block read requests required to scan the segment

Number of multi-block read requests * MREADTIM = time it takes to perform this number of read requests in milliseconds.

Finally this is divided by SREADTIM to arrive at the known unit used for cost representation which is number of single read requests.

10,000 / 8 = 1,250 multi-block read requests

1,250 * 12.93 = 16,162.5 ms execution time (which is shown as 17 seconds in the plan by the way)

16,162.5 / 8.021 = 2,015.02 (which is shown as 2,016 in the plan)
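
The three steps can also be expressed as a short Python sketch. The function name is made up for illustration. Rounding up reproduces the 2,016 and the 17 seconds shown in this particular plan, but the sketch does not establish that the optimizer always rounds this way.

```python
import math


def workload_fts_cost(blocks, mbrc, mreadtim, sreadtim):
    """Return (read requests, elapsed milliseconds, unrounded cost).

    blocks   : blocks below the high water mark
    mbrc     : measured multi-block read count
    mreadtim : measured multi-block read time in milliseconds
    sreadtim : measured single-block read time in milliseconds
    """
    requests = blocks / mbrc
    elapsed_ms = requests * mreadtim
    cost = elapsed_ms / sreadtim
    return requests, elapsed_ms, cost


if __name__ == "__main__":
    requests, elapsed_ms, cost = workload_fts_cost(10000, 8, 12.93, 8.021)
    print(f"multi-block read requests: {requests:.0f}")
    print(f"elapsed time: {elapsed_ms:.1f} ms, "
          f"rounded up to {math.ceil(elapsed_ms / 1000)} seconds")
    print(f"cost: {cost:.2f}, rounded up to {math.ceil(cost)}")
```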

Manually writing and maintaining system statistics

Using the DBMS_STATS.GET_SYSTEM_STATS / SET_SYSTEM_STATS API you can write your own set of system statistics for both NOWORKLOAD and WORKLOAD values.

You can use DBMS_STATS.DELETE_SYSTEM_STATS to remove the system statistics, which will activate the default NOWORKLOAD system statistics in 10g and disable CPU costing in 9i.

You can use DBMS_STATS.EXPORT_SYSTEM_STATS / IMPORT_SYSTEM_STATS to export and import system statistics to a user statistics table created with DBMS_STATS.CREATE_STAT_TABLE.

Note that DBMS_STATS.GATHER_SYSTEM_STATS, when used with a user stats table (created with DBMS_STATS.CREATE_STAT_TABLE), behaves differently than e.g. DBMS_STATS.GATHER_TABLE_STATS. Object related statistics always go to the data dictionary, and you only have the option to save the current statistics to the user stats table before they are replaced with the new values. GATHER_SYSTEM_STATS, in contrast, writes the system statistics into the user stats table and doesn't change the actual system statistics if you supply a user stats table name.

For further discussion of how you could use system statistics, see Jonathan Lewis' thoughts on this topic:

Part 1
Part 2
Part 3

The next part of the series will cover the usage of system statistics in 9i, highlight some quirks and oddities observed and finally show what happens if you attempt to use multiple block sizes for "tuning" purposes.

Dynamic logging with global application context

Controlling logging output across sessions using global application context. September 2007 (updated April 2009)

Collaborate 09: Don’t miss these sessions

Collaborate 09 starts on Sunday, May 3 (a few days from now!) in Orlando. I’ve been offline for several weeks (more on that later), but will be returning to the world of computers and technology in full force in Orlando. I’ve had a few inquiries about whether or not I’ll be at Collaborate, so I thought I’d resurrect my blog with a post about where I’ll be and some of the highlights I see at Collaborate 09.

First, where I’ll be presenting:

  • Monday, 10:45-11:45am, #301, “Avoiding Common RAC Problems”
  • Tuesday, 9:45am-12pm, #332, “Installing RAC From The Ground Up”
  • Wednesday, 9:45-10:45am, #121, “Troubleshooting Oracle Clusterware”

I’m also currently the President of the Oracle RAC Special Interest Group (RAC SIG). The RAC SIG is hosting several great sessions (I’m moderating a couple of these panels) at Collaborate 09 as well:

  • Sunday, 6-7:30pm, IOUG/SIG Welcome Reception (each SIG will have representatives there–this is open to all IOUG attendees)
  • Monday, 8-9am, RAC SIG Orientation
  • Tuesday, 12:15-1:15pm, RAC SIG Birds of a Feather
  • Tuesday, 4:30-5:30pm, RAC SIG Expert Panel
  • Wednesday, 4:30-5:30pm, RAC SIG Customer Panel (not in online scheduler at the moment, check again later)
  • Thursday, 8:30am-12pm, RAC Attack (University Session – Additional fee required)

The RAC SIG has also assembled this list of RAC-related sessions at Collaborate 09 to help you plan your conference agenda.

Be sure to set up your personal agenda using the agenda builder and add these sessions to your agenda. I think that if you have these in your agenda and details (like date or room assignments) change, you’ll be notified via email (not sure, but I think that’s how it works).

Also, you can follow @IOUG on Twitter (follow me too if you’d like) and that will help you find where the action is during the event next week. It’s going to be a great event and I look forward to seeing you there!

The Most Common Performance Problem I See

At the Percona Performance Conference in Santa Clara this week, the first question an audience member asked our panel was, "What is the most common performance problem you see in the field?"

I figured, being an Oracle guy at a MySQL conference, this might be my only chance to answer something, so I went for the mic. Here is my answer.

The most common performance problem I see is people who think there's a most-common performance problem that they should be looking for, instead of measuring to find out what their actual performance problem actually is.

It's a meta answer, but it's a meta problem. The biggest performance problems I see, and the ones I see most often, are not problems with machines or software. They're problems with people who don't have a reliable process of identifying the right thing to work on in the first place.

That's why the definition of Method R doesn't mention Oracle, or databases, or even computers. It's why Optimizing Oracle Performance spends the first 69 pages talking about red rocks and informed consent and Eli Goldratt instead of Oracle, or databases, or even computers.

The most common performance problem I see is that people guess instead of knowing. The worst cases are when people think they know because they're looking at data, but they really don't know, because they're looking at the wrong data. Unfortunately, every case of guessing that I ever see is this worst case, because nobody in our business goes very far without consulting some kind of data to justify his opinions. Tim Cook from Sun Microsystems pointed me yesterday to a blog post that gives a great example of that illusion of knowing when you really don't.

Understanding the different modes of System Statistics aka. CPU Costing and the effects of multiple blocksizes - part 1

Forward to part 2

This is the first part of a series of posts that cover one of the fundamentals of the cost based optimizer in 9i and later. Understanding how the different system statistics modes work is crucial to making the most of the cost based optimizer, so I'll attempt to provide detailed explanations and samples of the formulas and arithmetic used. Finally I'll show (again) that using multiple block sizes for "tuning" purposes is a bad idea in general, along with detailed examples of why I think this is so.

One of the deficiencies of the traditional I/O based costing was that it simply counted the number of I/O requests, making no differentiation between single-block I/O and multi-block I/O.

System statistics were introduced in Oracle 9i to allow the cost based optimizer to take into account that single-block I/Os and multi-block I/Os should be treated differently in terms of costing and to include a CPU component in the cost calculation.

The system statistics tell the cost based optimizer (CBO), among other things, the time it takes to perform a single-block read request and a multi-block read request. Given this information the optimizer ought to be able to arrive at estimates that better fit the particular environment the database is running in, and additionally to apply an appropriate costing to multi-block read requests, which usually take longer than single-block read requests. Because the time it takes to perform the read requests is known, the calculated cost can also be turned into a time estimate.

The cost calculated with system statistics is still expressed in the same units as with traditional I/O based costing, which is in units of single-block read requests.

The mode using system statistics is also known as "CPU costing", but despite the name the system statistics have their most significant impact on the I/O costs calculated for full table scans, due to the different measure (MREADTIM) used for multi-block read requests.

Starting with Oracle 10g you actually have the choice of three different modes of system statistics (also known as CPU costing):

1. Default NOWORKLOAD system statistics
2. Gathered NOWORKLOAD system statistics
3. Gathered WORKLOAD system statistics

The important point to understand here is that starting with Oracle 10g system statistics are enabled by default (using the default NOWORKLOAD system statistics), and you can only disable them by either downgrading your optimizer (using the OPTIMIZER_FEATURES_ENABLE parameter) or using undocumented parameters or hints (the "_optimizer_cost_model" parameter, or the CPU_COSTING and NOCPU_COSTING hints).

This initial part of the series will focus on the default NOWORKLOAD system statistics introduced with Oracle 10g.

Default NOWORKLOAD system statistics

The default NOWORKLOAD system statistics measure only the CPU speed (CPUSPEEDNW). The two remaining values used for NOWORKLOAD system statistics, IOSEEKTIM (seek time) and IOTFRSPEED (transfer speed), use default values (10 milliseconds seek time and 4096 bytes per millisecond transfer speed).

Using these default values for the I/O part, the SREADTIM (single-block I/O read time) and MREADTIM (multi-block I/O read time) values are synthesized for cost calculation by applying the following formulas:

SREADTIM = IOSEEKTIM + db_block_size / IOTFRSPEED

MREADTIM = IOSEEKTIM + mbrc * db_block_size / IOTFRSPEED

where "db_block_size" represents your database standard block size in bytes and "mbrc" is either the value of "db_file_multiblock_read_count" if it has been set explicitly, or a default of 8 if it is left unset. From 10.2 on, the value used for costing is controlled internally by the undocumented parameter "_db_file_optimizer_read_count". This means that in 10.2 and later the "mbrc" used by the optimizer to calculate the cost can differ from the "mbrc" actually used at runtime when performing multi-block read requests. If you leave "db_file_multiblock_read_count" unset in 10.2 and later, Oracle uses a default of 8 for cost calculation but uses the largest possible I/O request size at runtime, which depends on the platform and is usually 1MB (e.g. 128 blocks when using a block size of 8KB). The runtime value is controlled internally by the undocumented parameter "_db_file_exec_read_count".

Assuming a default block size of 8KB (8192 bytes) and "db_file_multiblock_read_count" left unset, this results in the following calculation:

SREADTIM = 10 + 8192 / 4096 = 10 + 2 = 12ms

MREADTIM = 10 + 8 * 8192 / 4096 = 10 + 16 = 26ms
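The arithmetic above can be sketched as a small helper. This is only an illustration in Python, not anything Oracle ships: the function name `synthesized_read_times` and its parameter names are made up for this sketch, and the defaults are the values quoted in this post (10 ms seek time, 4096 bytes per millisecond transfer speed, 8KB block size).

```python
def synthesized_read_times(mbrc, db_block_size=8192, ioseektim=10, iotfrspeed=4096):
    """Return (SREADTIM, MREADTIM) in milliseconds per the NOWORKLOAD formulas."""
    # Single-block read: one seek plus the transfer of one block
    sreadtim = ioseektim + db_block_size / iotfrspeed
    # Multi-block read: one seek plus the transfer of mbrc blocks
    mreadtim = ioseektim + mbrc * db_block_size / iotfrspeed
    return sreadtim, mreadtim


# Hypothetical usage: the MBRC settings exercised in this post
for mbrc in (8, 16, 32, 64, 128):
    sreadtim, mreadtim = synthesized_read_times(mbrc)
    print(f"mbrc={mbrc:3d}  SREADTIM={sreadtim:g}ms  MREADTIM={mreadtim:g}ms")
```

With these defaults SREADTIM stays at 12ms regardless of the MBRC, while MREADTIM grows with every additional block per request.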

These values are then used to calculate the I/O cost of single-block and multi-block read requests according to the execution plan (number of single-block reads + number of multi-block reads * MREADTIM / SREADTIM), which means that the I/O cost with system statistics, a.k.a. CPU costing, is expressed in units of single-block reads.
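As a rough illustration of this formula, the following Python sketch estimates the I/O cost of a full table scan that consists only of multi-block reads. The helper name `fts_io_cost` is hypothetical, the number of multi-block reads is simply assumed to be blocks / mbrc, and no rounding is applied because the exact rounding the optimizer performs is not covered here. The time estimate assumes that time is simply cost multiplied by SREADTIM.

```python
def fts_io_cost(blocks, mbrc, sreadtim, mreadtim):
    """Approximate full table scan I/O cost in units of single-block reads."""
    multiblock_reads = blocks / mbrc  # assumption: no rounding of partial reads
    return multiblock_reads * mreadtim / sreadtim


# 10,000 blocks with the synthesized default NOWORKLOAD values (SREADTIM = 12ms)
for mbrc, mreadtim in ((8, 26), (16, 42), (32, 74), (64, 138), (128, 266)):
    cost = fts_io_cost(10_000, mbrc, 12, mreadtim)
    # Assumption: estimated time = cost * SREADTIM, converted from ms to seconds
    print(f"mbrc={mbrc:3d}  cost ~ {cost:.2f}  time ~ {cost * 12 / 1000:.1f}s")
```

For a 10,000 block table these unrounded estimates land within a unit or two of the full table scan costs reported with CPU costing in the test case below, and the derived times are close to the values in the Time column.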

You can derive from the above formula that with system statistics the cost of a full table scan is higher by approximately the factor MREADTIM / SREADTIM compared to the traditional I/O based costing used by default before 10g; system statistics therefore usually tend to favor index access a bit more.

Note that the factor MREADTIM / SREADTIM is not entirely correct, since traditional I/O costing applies an efficiency reduction factor at higher MBRC settings, presumably to reflect that the larger the number of blocks per I/O request, the more likely it is that a request of that size cannot be performed because blocks are already in the buffer cache or an extent boundary is hit.

So with an MBRC setting of 8 the adjusted MBRC used for calculation is actually 6.59. Using e.g. a very high setting of 128 for the MBRC will actually use 40.82 for calculation. So the higher the setting, the more the MBRC used for calculation is reduced.
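The adjusted MBRC values quoted here can be back-calculated from the traditional I/O cost of a full table scan, as the comments in the test case below do: adjusted MBRC = blocks / cost. The following Python sketch does this for the 10,000 block table, taking the traditional costs reported in the plans further down as input. The function names are hypothetical, and the adjustment formula the optimizer uses internally is not derived here.

```python
def adjusted_mbrc(blocks, traditional_cost):
    """Effective MBRC implied by a traditional I/O based full table scan cost."""
    return blocks / traditional_cost


def costing_factor(mbrc, sreadtim, mreadtim, adj_mbrc):
    """Expected ratio of CPU costing to traditional costing for a full table scan."""
    return (mreadtim / sreadtim) / (mbrc / adj_mbrc)


# (mbrc, synthesized MREADTIM, traditional cost from the nocpu_costing plans)
observations = (
    (8, 26, 1518),
    (16, 42, 962),
    (32, 74, 610),
    (64, 138, 387),
    (128, 266, 245),
)
for mbrc, mreadtim, trad_cost in observations:
    adj = adjusted_mbrc(10_000, trad_cost)
    factor = costing_factor(mbrc, 12, mreadtim, adj)
    print(f"mbrc={mbrc:3d}  adjusted MBRC={adj:.2f}  factor={factor:.2f}")
```

The gap between the nominal and the adjusted MBRC widens as the setting increases, which is why the effective factor stays well below MREADTIM / SREADTIM at high settings.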

The following test case demonstrates the difference between traditional I/O costing, CPU costing and the factor MREADTIM / SREADTIM when using different "db_file_multiblock_read_count" settings. The test case was run against 10.2.0.4 Win32.

Note that the test case removes your current system statistics, so be cautious if your database currently has non-default system statistics.

Furthermore the test case assumes an 8KB database default block size and a locally managed tablespace with 1MB uniform extent size using manual segment space management (no ASSM).

drop table t1;

-- Create a table consisting of 10,000 blocks / 1 row per block
-- in an 8KB tablespace with manual segment space management (no ASSM)
create table t1
pctfree 99
pctused 1
-- tablespace test_2k
-- tablespace test_4k
tablespace test_8k
-- tablespace test_16k
as
with generator as (
  select --+ materialize
    rownum id
  from all_objects
  where rownum <= 3000
)
select
  /*+ ordered use_nl(v2) */
  rownum id,
  trunc(100 * dbms_random.normal) val,
  rpad('x',100) padding
from
  generator v1,
  generator v2
where
  rownum <= 10000
;

begin
  dbms_stats.gather_table_stats(
    user,
    't1',
    cascade => true,
    estimate_percent => null,
    method_opt => 'for all columns size 1'
  );
end;
/

-- Use default NOWORKLOAD system statistics
-- for test but ignore CPU cost component
-- by using an artificially high CPU speed
begin
  dbms_stats.delete_system_stats;
  dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
end;
/

-- In order to verify the formula against the
-- optimizer calculations
-- don't increase the table scan cost by one
-- which is done by default from 9i on
alter session set "_table_scan_cost_plus_one" = false;

alter session set db_file_multiblock_read_count = 8;

-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 26
-- MREADTIM/SREADTIM = 26/12 = 2.16
-- Factor CPU Costing / traditional I/O costing
-- 2,709/1,518 = 1.78
-- MBRC = 8, adjusted MBRC = 10,000 / 1,518 = 6.59
-- 8/6.59 = 1.21
-- 2.16 / 1.21 = 1.78

select /*+ nocpu_costing */ max(val)
from t1;

-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1518 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1518 |
-----------------------------------------------------------

select /*+ cpu_costing */ max(val)
from t1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2709 (0)| 00:00:33 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2709 (0)| 00:00:33 |
---------------------------------------------------------------------------

alter session set db_file_multiblock_read_count = 16;

-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 42
-- MREADTIM/SREADTIM = 42/12 = 3.5
-- Factor CPU Costing / traditional I/O costing
-- 2,188/962 = 2.27
-- MBRC = 16, adjusted MBRC = 10,000 / 962 = 10.39
-- 16/10.39 = 1.54
-- 3.5 / 1.54 = 2.27

select /*+ nocpu_costing */ max(val)
from t1;

-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 962 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 962 |
-----------------------------------------------------------

select /*+ cpu_costing */ max(val)
from t1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2188 (0)| 00:00:27 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2188 (0)| 00:00:27 |
---------------------------------------------------------------------------

alter session set db_file_multiblock_read_count = 32;

-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 74
-- MREADTIM/SREADTIM = 74/12 = 6.16
-- Factor CPU Costing / traditional I/O costing
-- 1,928/610 = 3.16
-- MBRC = 32, adjusted MBRC = 10,000 / 610 = 16.39
-- 32/16.39 = 1.95
-- 6.16 / 1.95 = 3.16

select /*+ nocpu_costing */ max(val)
from t1;

-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 610 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 610 |
-----------------------------------------------------------

select /*+ cpu_costing */ max(val)
from t1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1928 (0)| 00:00:24 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1928 (0)| 00:00:24 |
---------------------------------------------------------------------------

alter session set db_file_multiblock_read_count = 64;

-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 138
-- MREADTIM/SREADTIM = 138/12 = 11.5
-- Factor CPU Costing / traditional I/O costing
-- 1,798/387 = 4.64
-- MBRC = 64, adjusted MBRC = 10,000 / 387 = 25.84
-- 64/25.84 = 2.48
-- 11.5 / 2.48 = 4.64

select /*+ nocpu_costing */ max(val)
from t1;

-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 387 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 387 |
-----------------------------------------------------------

select /*+ cpu_costing */ max(val)
from t1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1798 (0)| 00:00:22 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1798 (0)| 00:00:22 |
---------------------------------------------------------------------------

alter session set db_file_multiblock_read_count = 128;

-- Assumption due to formula is that CPU costing
-- increases FTS cost by MREADTIM/SREADTIM, but
-- traditional I/O based costing introduces a
-- efficiency penalty the higher the MBRC is
-- therefore the factor is not MREADTIM/SREADTIM
-- but MREADTIM/SREADTIM/(MBRC/adjusted MBRC)
--
-- NOWORKLOAD synthesized SREADTIM = 12, MREADTIM = 266
-- MREADTIM/SREADTIM = 266/12 = 22.16
-- Factor CPU Costing / traditional I/O costing
-- 1,732/245 = 7.07
-- MBRC = 128, adjusted MBRC = 10,000 / 245 = 40.82
-- 128/40.82 = 3.13
-- 22.16 / 3.13 = 7.07

select /*+ nocpu_costing */ max(val)
from t1;

-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 245 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 245 |
-----------------------------------------------------------

select /*+ cpu_costing */ max(val)
from t1;

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1732 (0)| 00:00:21 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1732 (0)| 00:00:21 |
---------------------------------------------------------------------------

So as you can see, the I/O costs for a full table scan are significantly different when using default NOWORKLOAD system statistics. You can also see that the synthesized MREADTIM value changes considerably with the "db_file_multiblock_read_count" setting, while the synthesized SREADTIM stays at 12ms. Furthermore the difference between traditional I/O based costing and CPU costing is not the factor MREADTIM / SREADTIM as suggested by the formula, but is reduced by the adjustment applied to the MBRC when using traditional I/O costing.

The next part of the series will cover the remaining available system statistics modes.
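To make that last point concrete, the following Python sketch puts the naive factor MREADTIM / SREADTIM next to the ratio actually observed between the two costs in the plans above. The helper names are made up for this illustration, and the cost pairs are copied from the plans of this test case (10.2.0.4, 8KB block size), so they are not general constants.

```python
def naive_factor(mbrc, db_block_size=8192, ioseektim=10, iotfrspeed=4096):
    """MREADTIM / SREADTIM as synthesized from default NOWORKLOAD statistics."""
    sreadtim = ioseektim + db_block_size / iotfrspeed
    mreadtim = ioseektim + mbrc * db_block_size / iotfrspeed
    return mreadtim / sreadtim


def observed_factor(traditional_cost, cpu_cost):
    """Ratio of the CPU costing plan cost to the traditional I/O plan cost."""
    return cpu_cost / traditional_cost


# (mbrc, traditional cost, CPU costing cost) as reported in the plans above
plans = (
    (8, 1518, 2709),
    (16, 962, 2188),
    (32, 610, 1928),
    (64, 387, 1798),
    (128, 245, 1732),
)
for mbrc, trad, cpu in plans:
    print(
        f"mbrc={mbrc:3d}  MREADTIM/SREADTIM={naive_factor(mbrc):6.2f}  "
        f"observed={observed_factor(trad, cpu):5.2f}"
    )
```

In every case the observed ratio is smaller than the naive factor, and the difference grows with the MBRC setting.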

Unloading data using external tables in 10g

External tables can write as well as read in 10g. May 2005

Helsinki code layers in the DBMS

Ok, let's continue with the second part of "The Helsinki Declaration". That would be the part where I zoom in on the DBMS and show you how best to do this database centric thing. We have seen that the DBMS is the most stable component in everybody's technology landscape. We have also concluded that the DBMS has been designed to handle WoD application BL-code and DL-code. And current DBMS's are

Advanced Oracle Troubleshooting by Tanel Poder in Singapore

When I first saw that Tanel would conduct his seminar in Singapore, I told myself that I would even spend my own money just to be at that training! I’ve already read performance books like Optimizing Oracle Performance, Oracle 8i Internal Services, Forecasting Oracle Performance… And after that I still want more, and I still have questions that need to be answered. Well, if you’re on a tight budget you just opt to download some more docs/books to do multiple reads, coupled with research/test cases and reading through others' blogs…
But thanks to my boss for the funding, I was there!

    	  	<div class=