Oakies Blog Aggregator

An interview question...

A couple of days ago I read an article "One of the toughest job-interview questions ever".  I was reminded of it by this posting...

I found the original post interesting - mostly because I liked the answer the technical writer gave.  A bit of background: someone interviewing for a technical writing position is asked what is clearly a "hard core, heads down, write code" programmer question.  The question seemed entirely inappropriate for the position - but the answer given was great (I thought).

The answer consisted of lots of questions - in effect - a lot of push back.  Define this, specify that, clarify this - need more information.

I can relate. 

What surprised me was that a lot of the feedback was negative.  A lot of people said "would never hire you", "you missed the point".

All the time I was reading, though, I was nodding my head saying "yeah, what about that".  I would have hired him on the spot.  Critical thinking, push back, give me the details, tell me what you are really trying to do. 

The programmer that rolls over and just answers the question - without enough information to actually answer the question - should send the interviewer running away.  But that is apparently what a lot of interviewers are looking for.

I've been known to have three or four very simple interview questions for "Oracle people".  They are designed to test everything from the simple to the sublime.  They are:

I have a table:

create table t ( .....,  month number, ..... );

Month is always a number between 1 and 12.

I ask three questions about this table:

1) how many rows are in the table

2) how many rows BY MONTH are in the table (I want to know how many rows for month one, month two, and so on)

3) what MONTH has the most rows (and for a special bonus, tell me why this question is ambiguous)

The fourth question is more of a "do something for me" - and that is "go to the white board, draw a picture of Oracle and tell me how it works". 

As the link says - a surprising number of people *struggle* (seriously) with the first question.  The second - gets *most* (seriously) of the rest.  The third question freaks them out mostly.  Especially the parenthetical part.  The fourth question - sends people running out of the room.
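For the record, the three questions have short SQL answers - a COUNT, a GROUP BY, and a GROUP BY with a filter for the top count (which is where the ambiguity lives: several months can tie for "most rows", and the question doesn't say what to do then). A minimal sketch, run against SQLite with made-up data purely to illustrate the shape of the answers:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, month INTEGER)")  # month is 1..12
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2), (4, 3), (5, 3)])

# 1) how many rows are in the table
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(total)                        # 5

# 2) how many rows by month
by_month = conn.execute(
    "SELECT month, COUNT(*) FROM t GROUP BY month ORDER BY month").fetchall()
print(by_month)                     # [(1, 2), (2, 1), (3, 2)]

# 3) which month has the most rows?  Ambiguous when months tie for the
#    maximum -- this variant returns every month sharing the top count.
top = conn.execute("""
    SELECT month, COUNT(*) FROM t GROUP BY month
    HAVING COUNT(*) = (SELECT MAX(n) FROM
                         (SELECT COUNT(*) AS n FROM t GROUP BY month))
    ORDER BY month""").fetchall()
print(top)                          # [(1, 2), (3, 2)] -- a tie, hence the ambiguity
```

The third query is exactly where the "push back" answer earns its bonus: without being told how to break ties, any single-row answer is a guess.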

That is why I liked the article I originally read - the author was poking around, developing derived requirements, fleshing it out, figuring out what really needed to be done, not rolling over and saying "you got it, I'll be right on it, we'll do that straight away".  Developers (DBA's, whatever) that don't push back, that don't dig into the question, that don't try to convey "this is more complex than you think, we need to go a bit into this to figure out what you really need" - well, I don't have any patience for them.  They do not belong (in our profession).

Will that person (the interviewee) annoy you?  Sure, from time to time (I'm sure that every now and then someone is annoyed by me, probably).

Will you ultimately be really happy they were there? Absolutely.

Will the person that rolls over annoy you? Absolutely - every time - most of the time probably. Especially after they really mess you up the first time they are so "flexible".  Will you ultimately be even a little happy they were there?  I doubt it.

I've said many times - there are only TWO answers to all technical questions.  They are:

  1. WHY (why do you want to do that)
  2. IT DEPENDS (it really does, and it requires digging around, poking, probing to figure out what it depends on...)

poke, probe, ask, discuss, dive deep, play stupid (it works, really) - but get the information...

TM locking: Checking for Missing Indexes on Foreign Key Constraints

Recently, I was working on a packaged application purchased from a third-party vendor. It is one of those platform agnostic systems that started life on Microsoft SQL Server, and has been ported to Oracle. I spend a lot of my time working with PeopleSoft, so I had a certain sense of déjà vu. However, this application uses referential integrity.

The application was upgraded, and simultaneously Oracle was upgraded to 10g, and the system then exhibited TM contention. It had probably been suffering from TM contention while running on Oracle 9i, but we hadn't realised because Oracle 9i reports only a generic 'enqueue' wait event.

From 10g, there are no fewer than 208 different enqueue wait events, which show the type of lock that the process is waiting for and sometimes additional information. In my case it was event 175. The events can be listed from v$event_name.

SELECT event#, name FROM v$event_name
WHERE UPPER(name) LIKE 'ENQ: TM%'
/
EVENT# NAME
---------- --------------------
175 enq: TM - contention

With a little help from my friends I came to realise that the cause of this contention was that the system had foreign key constraints on columns that were not indexed. Having found one example of this, I realised that I needed a way to check the entire data model. The result was the following SQL and PL/SQL script.

REM fk_index_check.sql
REM 19.10.2007

Uncommenting the following section will produce a test case for which the script should build two indexes.

/*--------------------------------------------------------------
ALTER TABLE EMP_TAB DROP CONSTRAINT MGR_FKEY;
ALTER TABLE EMP_TAB DROP CONSTRAINT DEPT_FKEY;
DROP TABLE Emp_tab;
DROP TABLE DEPT_TAB;

CREATE TABLE Dept_tab (
setid NUMBER(3),
deptno NUMBER(3),
dname VARCHAR2(15),
loc VARCHAR2(15)
--CONSTRAINT dname_ukey UNIQUE (Dname, Loc),
--CONSTRAINT loc_check1
--CHECK (loc IN ('NEW YORK', 'BOSTON', 'CHICAGO'))
,CONSTRAINT Dept_pkey PRIMARY KEY (setid,deptno)
)
/
CREATE TABLE Emp_tab (
empno NUMBER(5) CONSTRAINT emp_pkey PRIMARY KEY,
ename VARCHAR2(15) NOT NULL,
job VARCHAR2(10),
mgr NUMBER(5) CONSTRAINT mgr_fkey REFERENCES emp_tab,
hiredate DATE,
sal NUMBER(7,2),
comm NUMBER(5,2),
setid NUMBER(3),
deptno NUMBER(3) NOT NULL,
CONSTRAINT dept_fkey FOREIGN KEY (setid,deptno)
REFERENCES dept_tab (setid,deptno) ON DELETE CASCADE
)
/
/*------------------------------------------------------------*/
set serveroutput on size 1000000

GFC_FK_INDEX_CHECK is a working storage table that holds the results of the tests on each foreign key.

DROP TABLE gfc_fk_index_check
/
CREATE TABLE gfc_fk_index_check
(owner VARCHAR2(30) NOT NULL
,table_name VARCHAR2(30) NOT NULL
,constraint_name VARCHAR2(30) NOT NULL
,r_owner VARCHAR2(30) NOT NULL
,r_table_name VARCHAR2(30) NOT NULL
,r_constraint_name VARCHAR2(30) NOT NULL
,i_index_owner VARCHAR2(30)
,i_index_name VARCHAR2(30)
,i_status VARCHAR2(30) DEFAULT 'UNKNOWN'
,i_column_list VARCHAR2(300)
,CONSTRAINT gfc_fk_index_check_pk
PRIMARY KEY(table_name, constraint_name)
)
/
TRUNCATE TABLE gfc_fk_index_check
/

First the script populates the working storage table with all the referential integrity constraints that reference a primary key constraint.

INSERT INTO gfc_fk_index_check
(owner, table_name, constraint_name
,r_owner, r_constraint_name, r_table_name)
SELECT c.owner, c.table_name, c.constraint_name
, c.r_owner, c.r_constraint_name
, r.table_name r_table_name
FROM all_constraints c
, all_constraints r
WHERE c.constraint_Type = 'R'
AND r.owner = c.r_owner
AND r.constraint_name = c.r_constraint_name
AND r.constraint_Type = 'P'
AND r.owner = user
/

This PL/SQL routine checks each foreign key constraint in the working storage table. For each constraint, it looks up the referencing columns in all_cons_columns and builds a dynamic query that selects the owner and name of an index with the same columns in the same positions. The name of that index and the column list are stored in the working storage table. Depending upon how many rows that query returns, a status string is written to the table: 'No Index', 'Index Found', or 'Multiple Indexes'.

DECLARE
l_counter NUMBER;
l_column_list VARCHAR2(200);
l_sql1 VARCHAR2(4000);
l_sql2 VARCHAR2(4000);
l_tmp1 VARCHAR2(20);
l_tmp2 VARCHAR2(20);
l_alias VARCHAR2(3);
l_oldalias VARCHAR2(3);
l_index_owner VARCHAR2(30);
l_index_name VARCHAR2(30);
l_status VARCHAR2(30);
BEGIN
FOR a IN (SELECT * FROM gfc_fk_index_check) LOOP
l_counter := 0;
l_column_list := '';
l_sql1 := 'SELECT i1.index_owner, i1.index_name';
l_sql2 := '';
FOR b IN (SELECT *
FROM all_cons_columns c
WHERE c.owner = a.owner
AND c.constraint_name = a.constraint_name
AND c.table_name = a.table_name
ORDER BY position) LOOP
l_counter := l_counter + 1;
l_oldalias := l_alias;
l_alias := ' i'||TO_CHAR(l_counter);
IF l_counter > 1 THEN
l_sql1 := l_sql1||', ';
l_sql2 := l_sql2
||' AND '||l_oldalias||'.index_owner='
||l_alias ||'.index_owner'
||' AND '||l_oldalias||'.index_name='
||l_alias ||'.index_name'
||' AND ';
l_column_list := l_column_list||',';
ELSE
l_sql1 := l_sql1||' FROM ';
l_sql2 := l_sql2||' WHERE';
END IF;
l_sql1 := l_sql1||'all_ind_columns'||l_alias;
l_sql2 := l_sql2
||l_alias||'.TABLE_OWNER='''||b.owner||''''
||' AND '||l_alias||'.TABLE_NAME='''||b.table_name||''''
||' AND '||l_alias||'.COLUMN_NAME='''||b.column_name||''''
||' AND '||l_alias||'.COLUMN_POSITION='''||b.position||'''';
l_column_list := l_column_list||b.column_name;
END LOOP;
-- dbms_output.put_line(l_sql1);
-- dbms_output.put_line(l_sql2);
-- dbms_output.put_line(l_column_list);
l_status := a.i_status;
l_index_owner := '';
l_index_name := '';
BEGIN
EXECUTE IMMEDIATE l_sql1||l_sql2
INTO l_index_owner, l_index_name;
l_status := 'Index Found';
EXCEPTION
WHEN NO_DATA_FOUND THEN l_status := 'No Index';
WHEN TOO_MANY_ROWS THEN l_status := 'Multiple Indexes';
END;
UPDATE gfc_fk_index_check
SET i_status = l_status
, i_index_owner = l_index_owner
, i_index_name = l_index_name
, i_column_list = l_column_list
WHERE owner = a.owner
AND table_name = a.table_name
AND constraint_name = a.constraint_name;
END LOOP;
COMMIT;
END;
/

This query produces a simple report on each foreign key constraint.

set lines 90 head on feedback on echo on
column owner format a20
column table_name format a30
column constraint_name format a30
column r_owner format a20
column r_constraint_name format a30
column r_table_name format a30
column i_index_owner format a20
column i_index_name format a30
column i_status format a30
column i_column_list format a80
spool fk_index_check
SELECT g.owner, g.table_name, g.constraint_name
, g.r_owner, g.r_table_name, g.r_constraint_name
, g.i_index_owner, g.i_index_name, g.i_status
, g.i_column_list
FROM gfc_fk_index_check g
/
spool off

This query is similar to the last, but it produces a report of just the largest tables that lack indexes on FK constraints. It shows tables with more than 10,000 rows (according to the CBO statistics), or at least the top 20. These are likely to be the most severe offenders.

spool fk_index_by_size
SELECT * from (
SELECT g.owner, g.table_name, g.constraint_name
, g.r_owner, g.r_table_name, g.r_constraint_name
, g.i_index_owner, g.i_index_name, g.i_status
, /*t.temporary, t.partitioned, */ t.num_rows
, g.i_column_list
FROM gfc_fk_index_check g, all_tables t
WHERE t.table_name = g.table_name
AND t.owner = g.owner
AND g.i_status = 'No Index'
ORDER BY num_rows desc
) WHERE rownum <= 20 or num_rows >= 10000
/
spool off

This query generates a script of CREATE INDEX DDL statements that will build the missing indexes. Each index will have the same name as the foreign key constraint to which it relates.

set head off trimout on trimspool on feedback off verify off timi off echo off lines 200
spool fk_index_build.sql
SELECT 'CREATE INDEX '||g.owner||'.'||g.constraint_name
||' ON '||g.owner||'.'||g.table_name
||' ('||g.i_column_list||');' build_indexes
FROM gfc_fk_index_check g, all_tables t
WHERE t.table_name = g.table_name
AND t.owner = g.owner
AND g.i_status = 'No Index'
ORDER BY t.num_rows
/
spool off
set lines 90 head on feedback on echo on

The test script correctly reports (in fk_index_check.LST) that there are two foreign keys that require supporting indexes:

OWNER           TABLE_NAME           CONSTRAINT_NAME
--------------- -------------------- --------------------
R_OWNER R_TABLE_NAME R_CONSTRAINT_NAME
--------------- -------------------- --------------------
I_INDEX_OWNER I_INDEX_NAME I_STATUS
--------------- -------------------- --------------------
I_COLUMN_LIST
---------------------------------------------------------
SYSADM EMP_TAB MGR_FKEY
SYSADM EMP_TAB EMP_PKEY
No Index
MGR

SYSADM EMP_TAB DEPT_FKEY
SYSADM DEPT_TAB DEPT_PKEY
No Index
SETID,DEPTNO

It produces another script fk_index_build.sql that will build the missing indexes.

CREATE INDEX SYSADM.MGR_FKEY ON SYSADM.EMP_TAB (MGR);
CREATE INDEX SYSADM.DEPT_FKEY ON SYSADM.EMP_TAB (SETID,DEPTNO);

When I ran this test script on my problem application, it identified over 200 missing indexes on 900 foreign key constraints, and since building the indexes on the tables where I had seen TM locking, I haven't seen any further TM contention.

The script can be downloaded from the Go-Faster website at http://www.go-faster.co.uk/scripts.htm#fk_index_check.sql
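The underlying rule the script enforces - the foreign key's columns must appear as a leading prefix of some index - is portable beyond Oracle's dictionary views. As an illustrative sketch only (not the downloadable script), the same check can be expressed in a few lines of Python against SQLite's catalog pragmas, reusing the test case's table names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept_tab (setid INTEGER, deptno INTEGER, dname TEXT,
                           PRIMARY KEY (setid, deptno));
    CREATE TABLE emp_tab  (empno INTEGER PRIMARY KEY, ename TEXT,
                           setid INTEGER, deptno INTEGER,
                           FOREIGN KEY (setid, deptno)
                               REFERENCES dept_tab (setid, deptno));
""")

def unindexed_fks(conn, table):
    """Return the column lists of foreign keys on `table` that are not
    covered by a leading prefix of any index on the table."""
    fks = {}
    # foreign_key_list rows: (id, seq, ref_table, from_col, to_col, ...)
    for fk_id, seq, _ref, from_col, *_ in conn.execute(
            f"PRAGMA foreign_key_list({table})"):
        fks.setdefault(fk_id, []).append((seq, from_col))
    fk_cols = [[c for _, c in sorted(cols)] for cols in fks.values()]
    indexes = []
    # index_list rows: (seq, name, unique, ...); index_info rows: (seqno, cid, name)
    for _seq, idx_name, *_ in conn.execute(f"PRAGMA index_list({table})"):
        indexes.append([name for _seqno, _cid, name in
                        conn.execute(f"PRAGMA index_info({idx_name})")])
    return [cols for cols in fk_cols
            if not any(ix[:len(cols)] == cols for ix in indexes)]

print(unindexed_fks(conn, "emp_tab"))   # the (setid, deptno) FK is unindexed
conn.execute("CREATE INDEX emp_dept_fk ON emp_tab (setid, deptno)")
print(unindexed_fks(conn, "emp_tab"))   # [] -- now covered
```

The Oracle script does the same prefix comparison, only via ALL_CONS_COLUMNS and ALL_IND_COLUMNS and with the dynamic SQL needed to handle an arbitrary number of key columns.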

Caveat: Just because you can index a foreign key, doesn't mean that you should. See
http://www.jlcomp.demon.co.uk/faq/fk_ind.html

Collaborate 09 Ends

I wrapped up a hectic three days at Collaborate this year. As if to echo the sentiments on the economy in general, and Oracle database technology in particular, attendance was way down this year. The vast expanse of the Orange County Convention Center didn't help either; it made the crowd appear even smaller! Well, it was Orlando; the weather and traffic cooperated, and I'm pretty sure some attendees picked up a few well-deserved resting points, either alone or with visiting family.

I presented three sessions:

(1) RAC Performance Tuning
(2) Real World Best Practices for DBAs (with Webcast)
(3) All About Deadlocks

I also participated in the RAC Experts' Panel as a Panelist.

As always, I used the opportunity to meet old friends and acquaintances. Collaborate is all about knowledge sharing, and I didn't lose out when it came to getting some myself. I attended some highly useful sessions:

(1) Database Capacity Planning by Ashish Rege
(2) Partitioning for DW by Vincent
(3) Hints on Hints by Jonathan Lewis
(4) 11g Performance Tuning by Rich Niemic
(5) HA Directions from Oracle by Ashish Ray
(6) Storage Performance Diagnosis by Gaja
(7) Two sessions - on Database Replay and Breaking Oracle by Jeremiah Wilton

and some more. It has been a worthy conference.

Yast on OEL

Lately I’ve been playing with Enterprise Manager Grid Control 10.2.0.4 on OEL 4.4 (I’ll upgrade this to 10.2.0.5 soon) and I’ve given a couple of product presentations and demos focusing on Database Enterprise Management. Among all the packs under this Grid Control “area”, the hottest are Configuration Management and Data Masking.

…While clicking around the huge monitoring tool I reached the page “Targets -> Hosts -> Administration”, which happens to be part of the Oracle Management Pack for Linux, and noticed it was not working: it said it needs the “Yast wrapper available at oss.oracle.com” (maybe not the exact message, but it led me to download this file http://oss.oracle.com/projects/yast/dist/files/el4/20070117/yast_el4_x86_32.tar which is a port of Yast to Oracle Enterprise Linux). COOL!

Understanding the SCN

For the DBAs who want a refresher on the SCN (system change number), this article by Sandeep Makol is very nice and clearly written. It starts with where you’ll find SCN information (controlfile and datafile headers), then moves on to backup and recovery scenarios where knowledge of this “magic number” is very useful.

Below are some useful scripts (with sample output) as well

-- get specific datafile
col name format a50
select 'controlfile' "SCN location",'SYSTEM checkpoint' name,checkpoint_change#
from v$database
union
select 'file in controlfile',name,checkpoint_change#
from v$datafile where lower(name) like '%&&datafile%'
union
select 'file header',name,checkpoint_change#
from v$datafile_header where lower(name) like '%&&datafile%';

SCN location        NAME                           CHECKPOINT_CHANGE#
------------------- -------------------------------------------------- ------------------
controlfile     SYSTEM checkpoint                         1151614
file header     +DATA_1/ivrs/datafile/sysaux.258.652821943            1151614
file header     +DATA_1/ivrs/datafile/system.267.652821909            1151614
file header     +DATA_1/ivrs/datafile/system_02.dbf               1151614
file header     +DATA_1/ivrs/datafile/undotbs1.257.652821933          1151614
file header     +DATA_1/ivrs/datafile/users.263.652821963             1151614
file in controlfile +DATA_1/ivrs/datafile/sysaux.258.652821943            1151614
file in controlfile +DATA_1/ivrs/datafile/system.267.652821909            1151614
file in controlfile +DATA_1/ivrs/datafile/system_02.dbf               1151614
file in controlfile +DATA_1/ivrs/datafile/undotbs1.257.652821933          1151614
file in controlfile +DATA_1/ivrs/datafile/users.263.652821963             1151614

-- get distinct checkpoint_change#
select checkpoint_change#, 'SYSTEM checkpoint in controlfile' "SCN location"
from v$database
union
select distinct checkpoint_change#, 'file in controlfile'
from v$datafile
union
select distinct checkpoint_change#, 'file header'
from v$datafile_header;

CHECKPOINT_CHANGE# SCN location
------------------ --------------------------------
       1151614 SYSTEM checkpoint in controlfile
       1151614 file header
       1151614 file in controlfile

-- get distinct datafile count
select 'controlfile' "SCN location",'SYSTEM checkpoint' name,checkpoint_change#
from v$database
union
select 'file in controlfile',to_char(count(*)),checkpoint_change#
from v$datafile
group by checkpoint_change#
union
select 'file header',to_char(count(*)),checkpoint_change#
from v$datafile_header
group by checkpoint_change#;

SCN location        NAME                           CHECKPOINT_CHANGE#
------------------- -------------------------------------------------- ------------------
controlfile     SYSTEM checkpoint                         1151614
file header     5                                 1151614
file in controlfile 5                                 1151614

-- info from x$kcvfh (All file headers)
SELECT hxfil file_num,substr(hxfnm,1,40) file_name,fhtyp type,hxerr validity, fhscn chk_ch#, fhtnm tablespace_name,fhsta status,fhrba_seq sequence
FROM x$kcvfh;

  FILE_NUM FILE_NAME                      TYPE   VALIDITY CHK_CH#      TABLESPACE_NAME            STATUS   SEQUENCE
---------- ---------------------------------------- ---------- ---------- ---------------- ------------------------------ ---------- ----------
     1 +DATA_1/ivrs/datafile/system.267.6528219      3      0 1151614      SYSTEM               8196        146
     2 +DATA_1/ivrs/datafile/undotbs1.257.65282      3      0 1151614      UNDOTBS1                4        146
     3 +DATA_1/ivrs/datafile/sysaux.258.6528219      3      0 1151614      SYSAUX                  4        146
     4 +DATA_1/ivrs/datafile/users.263.65282196      3      0 1151614      USERS                   4        146
     5 +DATA_1/ivrs/datafile/system_02.dbf           3      0 1151614      SYSTEM                  4        146


Oracle’s latest acquisition: Me

I’m definitely the type of person that gets excited by new opportunities and always loves a new challenge. Without challenge, I get bored quickly and boredom makes me a little crazy.

So, this new opportunity came along a little while ago and I thought it sounded just perfect for me. Many of you that know me will recall that I’ve had trouble finding the right company that fits with all aspects of my personality, goals, and philosophy which has led me to “try” a few of them in the past several years. I don’t regret the choices I’ve made and I’ve learned an awful lot from each of my employers. Most importantly, I’ve created new relationships at each of my past companies that I still maintain today.

In looking at all the past experiences, I’ve concluded that I am ready for a change in direction. Things I enjoy:
Oracle products
People (customers and Oracle employees) that work with Oracle software
RAC
High Availability
Large, complex environments
Servers, OS, and storage infrastructures
Working with lots of new people, especially creating, managing, and growing new relationships
Presenting my knowledge to others, both one-on-one and to groups (and conference events that surround such gatherings)
A little bit of travel (which helps support my love of…)
Scuba diving

So, when an opportunity came along to get paid to do something that combines almost all of the things I enjoy (except scuba), I couldn’t pass it up. Starting on May 18, 2009, I will be the newest member of the new group at Oracle that’s known as the “X-Team”. This team is responsible for working with customers and prospective customers of the Oracle Exadata and HP Oracle Database Machine products to help them adopt these new technologies. The group is a part of the Maximum Availability Architecture group at Oracle that authors many of the best practices whitepapers and presentations you have likely seen online. For those at Oracle that know what this means, the group is a part of the Server Technology development organization under Juan Loaiza.

For those of you that have been my past consulting customers, first of all, Thank You. I’m no longer consulting and while I won’t be able to provide an “Oracle-sanctioned” recommendation to other consultants that may be able to help, I do have a large network of friends and one of them can likely help you. Please never hesitate to keep in touch!

On a logistical note, I’m not moving and will hopefully continue to be involved with local events in Chicagoland. However, I will be traveling part of the time to visit customers and other Oracle facilities, so keep an eye on my twitter feed, Britekite location, and Tripit plans and let me know if there’s a chance to have a meeting IRL.

Finally, this decision to join Oracle means that I’ll be sacrificing several things. First and probably most near and dear to me is the RAC SIG. In September 2008, I took over as the RAC SIG President. The RAC SIG is as strong as ever and there are a good group of volunteers involved in leading the group as it continues to grow and evolve. I’ll always be a member of the RAC SIG and will continue to watch it closely and volunteer when and where I can. The RAC SIG is associated with the IOUG, the Independent Oracle User Group, and Oracle employees shouldn’t be too involved in “independent” groups. So, this year, the RAC SIG will once again elect a new president. I will remain president until Oracle Open World in October 2009 in order to provide continuity to the group’s leadership and ensure a smooth transition. You can nominate yourself for a RAC SIG office soon via our website nomination form (nominations will be open soon and stay open until July 31, 2009).

I’m also going to relinquish my appointment as an Oracle ACE Director. While I think I’ll still be considered an Oracle Employee ACE, I’ll remember fondly the fame that Oracle Technology Network affords the Oracle ACE program and the individuals that are given the honor. Thanks to Justin, Vikki, Lillian, Todd, and the others at Oracle for allowing me to be a part of that program. I’ll certainly miss the perks!

That’s about it for now, I’m off to the new job and will once again begin learning. Luckily, I’m apparently the only person named Dan Norris at Oracle (last someone checked for me), so you can contact me at dan.norris@oracle.com in a couple weeks.

growing pains ...

My role at work has been shifting over the past six months and I'm still not sure how I feel about this. In theory, I'm supposed to be 'architecting' but most of the time, I feel like I'm somewhere between a technical writer and a hostage negotiator, with the hostage alternating at turns between the integrity of the database and my sanity. Tensions have been high for everyone, the project is

Understanding the different modes of System Statistics aka. CPU Costing and the effects of multiple blocksizes - part 2

Back to part 1 Forward to part 3

Before heading on to the remaining modes of system statistics, let's summarize what has been observed in part 1 regarding the default NOWORKLOAD system statistics in 10g and later. The following table shows what the test case from the previous post demonstrated:

Table 1: 8KB MSSM locally managed tablespace 10,000 blocks table segment
default NOWORKLOAD system statistics:

MBRC | SREADTIM | MREADTIM | MREADTIM/SREADTIM | NOCPU cost | adjusted MBRC | CPU cost | CPU/NOCPU cost
-----|----------|----------|-------------------|------------|---------------|----------|---------------
   8 |       12 |       26 |              2.16 |      1,518 |          6.59 |    2,709 |           1.78
  16 |       12 |       42 |              3.5  |        962 |         10.39 |    2,188 |           2.27
  32 |       12 |       74 |              6.16 |        610 |         16.39 |    1,928 |           3.16
  64 |       12 |      138 |             11.5  |        387 |         25.84 |    1,798 |           4.64
 128 |       12 |      266 |             22.16 |        245 |         40.82 |    1,732 |           7.07

If you happen to have a 16KB default blocksize, the results would look like the following. Note that the table is now only 5,000 blocks in size, and the SREADTIM is now a bit longer (10 + 16384/4096 = 14ms instead of 10 + 8192/4096 = 12ms), so with the default NOWORKLOAD system statistics the 16KB blocksize calculation makes the full table scan look a bit cheaper to the optimizer.

Table 2: 16KB MSSM locally managed tablespace 5,000 blocks table segment
default NOWORKLOAD system statistics:

MBRC | SREADTIM | MREADTIM | MREADTIM/SREADTIM | NOCPU cost | adjusted MBRC | CPU cost | CPU/NOCPU cost
-----|----------|----------|-------------------|------------|---------------|----------|---------------
   4 |       14 |       26 |              1.86 |      1,119 |          4.17 |    2,322 |           2.08
   8 |       14 |       42 |              3.0  |        759 |          6.59 |    1,875 |           2.47
  16 |       14 |       74 |              5.3  |        481 |         10.39 |    1,652 |           3.43
  32 |       14 |      138 |              9.86 |        305 |         16.39 |    1,540 |           5.05
  64 |       14 |      266 |             19.0  |        194 |         25.84 |    1,485 |           7.65

Gathered NOWORKLOAD system statistics

If you gather NOWORKLOAD system statistics using DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD'), the IOSEEKTIM and IOTFRSPEED values will actually be measured and used accordingly.

The remaining calculations used to derive the SREADTIM and MREADTIM values correspond to what has been outlined above.

Gathering NOWORKLOAD statistics:

exec DBMS_STATS.GATHER_SYSTEM_STATS('NOWORKLOAD')

This actually gathers the IOTFRSPEED and IOSEEKTIM values in addition to CPUSPEEDNW rather than using the default values of 4096 and 10.

In 10g and later this may take from a couple of seconds to a couple of minutes depending on the size of your database. Note that this puts additional load onto your system while gathering the NOWORKLOAD system statistics since it submits random reads against all data files.

The following test case shows the different I/O cost calculations when using default NOWORKLOAD system statistics and custom gathered NOWORKLOAD system statistics. It again creates the 10,000-block table in an 8KB default block size locally managed tablespace using manual segment space management:

SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.03
SQL>
SQL> create table t1
2 pctfree 99
3 pctused 1
4 -- tablespace test_2k
5 -- tablespace test_4k
6 tablespace test_8k
7 -- tablespace test_16k
8 as
9 with generator as (
10 select --+ materialize
11 rownum id
12 from all_objects
13 where rownum <= 3000
14 )
15 select
16 /*+ ordered use_nl(v2) */
17 rownum id,
18 trunc(100 * dbms_random.normal) val,
19 rpad('x',100) padding
20 from
21 generator v1,
22 generator v2
23 where
24 rownum <= 10000
25 ;

Table created.

Elapsed: 00:00:02.22
SQL>
SQL>
SQL> begin
2 dbms_stats.gather_table_stats(
3 user,
4 't1',
5 cascade => true,
6 estimate_percent => null,
7 method_opt => 'for all columns size 1'
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.29
SQL> -- default NOWORKLOAD system statistics
SQL> -- ignore CPU costs for the moment
SQL> begin
2 dbms_stats.delete_system_stats;
3 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
4 end;
5 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 04-26-2009 14:21
SYSSTATS_INFO DSTOP 04-26-2009 14:21
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 10
SYSSTATS_MAIN IOTFRSPEED 4096
SYSSTATS_MAIN SREADTIM
SYSSTATS_MAIN MREADTIM
SYSSTATS_MAIN CPUSPEED
SYSSTATS_MAIN MBRC
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> alter session set "_table_scan_cost_plus_one" = false;

Session altered.

Elapsed: 00:00:00.02
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.01
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2709 (0)| 00:00:33 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2709 (0)| 00:00:33 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.08
SQL> -- gather NOWORKLOAD system statistics
SQL> exec dbms_stats.gather_system_stats('NOWORKLOAD')

PL/SQL procedure successfully completed.

Elapsed: 00:00:14.43
SQL> -- ignore CPU costs for the moment
SQL> begin
2 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
3 end;
4 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.02
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 04-26-2009 14:21
SYSSTATS_INFO DSTOP 04-26-2009 14:21
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 14.226
SYSSTATS_MAIN IOTFRSPEED 32517.754
SYSSTATS_MAIN SREADTIM
SYSSTATS_MAIN MREADTIM
SYSSTATS_MAIN CPUSPEED
SYSSTATS_MAIN MBRC
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.02
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 1403 (0)| 00:00:21 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 1403 (0)| 00:00:21 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.04
SQL>
SQL> spool off

Based on the gathered IOSEEKTIM and IOTFRSPEED values, the calculated I/O cost is significantly different.

Applying the known formulas we can reproduce the calculated figures:

SREADTIM = IOSEEKTIM + DB_BLOCK_SIZE / IOTFRSPEED

SREADTIM = 14.226 + 8192 / 32,517.754 = 14.478

MREADTIM = IOSEEKTIM + MBRC * DB_BLOCK_SIZE / IOTFRSPEED

MREADTIM = 14.226 + 8 * 8192 / 32,517.754 = 16.241

FTS cost = Blocks below HWM / MBRC * MREADTIM / SREADTIM

FTS cost = 10,000 / 8 * 16.241 / 14.478 = 1,403
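The arithmetic above can be double-checked with a short sketch (plain Python used here purely as a calculator; the inputs are the values gathered into SYS.AUX_STATS$ above):

```python
import math

# Gathered NOWORKLOAD values from SYS.AUX_STATS$ above
ioseektim = 14.226      # average I/O seek time in ms
iotfrspeed = 32517.754  # I/O transfer speed in bytes per ms
db_block_size = 8192    # block size in bytes
mbrc = 8                # MBRC assumed for costing
blocks = 10000          # blocks below the HWM of T1

# Synthesized single- and multi-block read times
sreadtim = ioseektim + db_block_size / iotfrspeed
mreadtim = ioseektim + mbrc * db_block_size / iotfrspeed

# Full table scan I/O cost, expressed in units of single-block reads
fts_cost = blocks / mbrc * mreadtim / sreadtim

print(round(sreadtim, 3))   # 14.478
print(round(mreadtim, 3))   # 16.241
print(math.ceil(fts_cost))  # 1403
```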

Gathered WORKLOAD system statistics

Gathering WORKLOAD system statistics measures a different set of values, including the actual MBRC, SREADTIM and MREADTIM. The cost calculation therefore no longer uses the synthesized SREADTIM and MREADTIM values, nor the "_db_file_optimizer_read_count" parameter in 10g and later, but simply the measured values.

Therefore the I/O costs calculated with WORKLOAD system statistics do not depend on the "db_file_multiblock_read_count" value in effect at parse time. The important point to keep in mind, however, is that the gathered WORKLOAD system statistics are based on the "db_file_multiblock_read_count" value (in 10g and later the internal parameter "_db_file_exec_read_count") used at runtime, so the measured values are obviously influenced by this setting ("_db_file_exec_read_count" equals "db_file_multiblock_read_count" if the latter has been set and the underscore parameter hasn't been modified).

As already mentioned in part 1, Oracle introduced in 10.2 the behaviour that if "db_file_multiblock_read_count" is left unset, different values are used for cost calculation and at execution time (8 for calculation and the largest possible I/O size at runtime, usually 1MB on most platforms). This in general points in the right direction, since it allows the calibration code to work out the largest MBRC achievable at runtime. Note that Christian Antognini doesn't agree with this approach in his book "Troubleshooting Oracle Performance", where he advises manually working out the "optimal" MBRC setting by running suitable I/O tests.
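To get a feel for how much this matters, here is a minimal sketch (Python again used purely as a calculator) of the NOWORKLOAD cost formula from above, comparing the fixed MBRC of 8 used for costing against a hypothetical 128-block read size. The seeded default NOWORKLOAD values (IOSEEKTIM = 10, IOTFRSPEED = 4096) and the 10,000-block table from the test case are assumed:

```python
import math

def noworkload_fts_cost(blocks, mbrc, block_size=8192,
                        ioseektim=10, iotfrspeed=4096):
    """NOWORKLOAD full table scan I/O cost for a given MBRC.

    ioseektim/iotfrspeed default to the seeded NOWORKLOAD values."""
    sreadtim = ioseektim + block_size / iotfrspeed
    mreadtim = ioseektim + mbrc * block_size / iotfrspeed
    return math.ceil(blocks / mbrc * mreadtim / sreadtim)

# Costing uses MBRC = 8 when db_file_multiblock_read_count is left
# unset, even though runtime reads may be up to 1MB (128 blocks here)
print(noworkload_fts_cost(10000, 8))    # 2709
print(noworkload_fts_cost(10000, 128))  # 1732
```

Note how the larger MBRC noticeably lowers the calculated cost, which is why the MBRC assumed for costing matters so much.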

Note that in 10g and later the runtime engine still uses the "_db_file_exec_read_count", regardless of the MBRC used to calculate the cost.

If you run the following code snippet in 10g and later and check the resulting trace files, you'll see this confirmed:

alter session set tracefile_identifier = 'exec_count_16';

alter session set "_db_file_exec_read_count" = 16;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 8';

select max(val)
from t1;

alter session set events '10046 trace name context off';

alter session set tracefile_identifier = 'exec_count_128';

alter session set "_db_file_exec_read_count" = 128;

alter system flush buffer_cache;

alter session set events '10046 trace name context forever, level 8';

select max(val)
from t1;

alter session set events '10046 trace name context off';

The resulting trace files look like the following:

The 16 blocks setting:

.
.
.
WAIT #2: nam='db file scattered read' ela= 1732 file#=8 block#=12058 blocks=16 obj#=62088 tim=69006657688
WAIT #2: nam='db file scattered read' ela= 1725 file#=8 block#=12074 blocks=16 obj#=62088 tim=69006659628
WAIT #2: nam='db file scattered read' ela= 1726 file#=8 block#=12090 blocks=16 obj#=62088 tim=69006661566
.
.
.

The 128 blocks setting:

.
.
.
WAIT #2: nam='db file scattered read' ela= 13842 file#=8 block#=12169 blocks=128 obj#=62088 tim=69008775308
WAIT #2: nam='db file scattered read' ela= 15513 file#=8 block#=12297 blocks=128 obj#=62088 tim=69008793460
WAIT #2: nam='db file scattered read' ela= 26437 file#=8 block#=12425 blocks=128 obj#=62088 tim=69008822434
.
.
.

Gathering WORKLOAD system statistics:

exec DBMS_STATS.GATHER_SYSTEM_STATS('START')
-- some significant (ideally "representative") workload needs to be performed
-- otherwise some or all of the measured values will be missing
exec DBMS_STATS.GATHER_SYSTEM_STATS('STOP')

or

exec DBMS_STATS.GATHER_SYSTEM_STATS('INTERVAL', interval => <minutes>)

Note that gathering workload system statistics doesn't put additional load onto your system, since the values are derived from the delta in statistics already maintained by Oracle during database activity.

Furthermore, if your workload doesn't perform "db file scattered read" operations, i.e. multi-block reads through the buffer cache, then you might end up with WORKLOAD system statistics that are missing the MBRC and MREADTIM components (null values). This might happen if you, for example, use only index access paths with random table row lookups by ROWID, if all your tablescans run in parallel, or if in 11g serial direct reads bypass the buffer cache (which can be activated in pre-11g using the hidden parameter "_serial_direct_read").

The same applies to "db file sequential read", i.e. single-block reads: if your workload performs only multi-block reads, the SREADTIM information might be missing from the gathered statistics.

Although the official 10.2 documentation says the following for that case:
"During the gathering process of workload statistics, it is possible that mbrc and mreadtim will not be gathered if no table scans are performed during serial workloads, as is often the case with OLTP systems. On the other hand, FTS occur frequently on DSS systems but may run parallel and bypass the buffer cache. In such cases, sreadtim will still be gathered since index lookup are performed using the buffer cache. If Oracle cannot gather or validate gathered mbrc or mreadtim, but has gathered sreadtim and cpuspeed, then only sreadtim and cpuspeed will be used for costing. FTS cost will be computed using analytical algorithm implemented in previous releases. Another alternative to computing mbrc and mreadtim is to force FTS in serial mode to allow the optimizer to gather the data."

And the 11g documentation says this:
"If Oracle Database cannot gather or validate gathered mbrc or mreadtim values, but has gathered sreadtim and cpuspeed values, then only the sreadtim and cpuspeed values are used for costing. In this case, the optimizer uses the value of the initialization parameter DB_FILE_MULTIBLOCK_READ_COUNT to cost a full table scan. However, if DB_FILE_MULTIBLOCK_READ_COUNT is not set or is set to 0 (zero), then the optimizer uses a value of 8 for costing."

But when testing this, it looked like if either MBRC or MREADTIM was missing, the optimizer simply reverted to the available NOWORKLOAD system statistics (note that this applies to 10g and later; I'll show in the next part of the series what happens in 9i, since things are different there).

Note that in order for the optimizer to accept the WORKLOAD system statistics, MREADTIM needs to be greater than SREADTIM. If your multi-block read requests are served from a cache or your storage system performs aggressive read-aheads, the measured MREADTIM can be less than the SREADTIM. In this case you might need to adjust the MREADTIM manually using the GET_SYSTEM_STATS/SET_SYSTEM_STATS API, which will be covered below.

One interesting oddity showed up when MBRC was available but MREADTIM was missing or not greater than SREADTIM: in that case it looks like the NOWORKLOAD calculations use the MBRC that has been set when synthesizing the MREADTIM and calculating the full table scan cost. This makes sense, but is an interesting mixture of NOWORKLOAD and WORKLOAD system statistics.

The following test case shows how to gather WORKLOAD system statistics, and how to manually correct an MREADTIM value that was gathered too low.

SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:02.44
SQL>
SQL> create table t1
2 pctfree 99
3 pctused 1
4 -- tablespace test_2k
5 -- tablespace test_4k
6 tablespace test_8k
7 -- tablespace test_16k
8 as
9 with generator as (
10 select --+ materialize
11 rownum id
12 from all_objects
13 where rownum <= 3000
14 )
15 select
16 /*+ ordered use_nl(v2) */
17 rownum id,
18 trunc(100 * dbms_random.normal) val,
19 rpad('x',100) padding
20 from
21 generator v1,
22 generator v2
23 where
24 rownum <= 10000
25 ;

Table created.

Elapsed: 00:00:02.27
SQL>
SQL> begin
2 dbms_stats.gather_table_stats(
3 user,
4 't1',
5 cascade => true,
6 estimate_percent => null,
7 method_opt => 'for all columns size 1'
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:02.60
SQL>
SQL> begin
2 dbms_stats.delete_system_stats;
3 dbms_stats.set_system_stats('CPUSPEEDNW',1000000);
4 end;
5 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.07
SQL>
SQL> alter session set "_table_scan_cost_plus_one" = false;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> exec dbms_stats.gather_system_stats('START')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.12
SQL>
SQL> begin
2 for i in 1..10 loop
3 for rec in (
4 select
5 max(val)
6 from
7 t1
8 ) loop
9 execute immediate 'alter system flush buffer_cache';
10 end loop;
11 end loop;
12 for rec in (
13 select /*+ use_nl(a t1) */ max(val) from t1,
14 (
15 select /*+ no_merge no_eliminate_oby */
16 rowid as row_id
17 from
18 t1
19 order by
20 dbms_random.value
21 ) a
22 where a.row_id = t1.rowid
23 ) loop
24 null;
25 end loop;
26 end;
27 /

PL/SQL procedure successfully completed.

Elapsed: 00:01:17.73
SQL>
SQL> exec dbms_stats.gather_system_stats('STOP')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.74
SQL>
SQL> begin
2 dbms_stats.set_system_stats('CPUSPEED',1000000);
3 end;
4 /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> declare
2 s_status varchar2(200);
3 dt_dstart date;
4 dt_dstop date;
5 n_pvalue number;
6 begin
7 dbms_stats.get_system_stats (
8 s_status,
9 dt_dstart,
10 dt_dstop,
11 'MREADTIM',
12 n_pvalue);
13 dbms_output.put_line('Status: ' || s_status);
14 dbms_output.put_line('Dstart: ' || to_char(dt_dstart, 'DD.MM.YYYY HH24:MI:SS'));
15 dbms_output.put_line('Dstop : ' || to_char(dt_dstop , 'DD.MM.YYYY HH24:MI:SS'));
16 dbms_output.put_line('Value : ' || to_char(n_pvalue, 'TM'));
17 dbms_stats.set_system_stats('MREADTIM', 10 * n_pvalue);
18 end;
19 /
Status: COMPLETED
Dstart: 03.05.2009 13:24:00
Dstop : 03.05.2009 13:24:00
Value : 1.293

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.11
SQL>
SQL> column sname format a20
SQL> column pname format a20
SQL> column pval2 format a20
SQL>
SQL> select
2 sname
3 , pname
4 , pval1
5 , pval2
6 from
7 sys.aux_stats$;

SNAME PNAME PVAL1 PVAL2
-------------------- -------------------- ---------- --------------------
SYSSTATS_INFO STATUS COMPLETED
SYSSTATS_INFO DSTART 05-03-2009 13:24
SYSSTATS_INFO DSTOP 05-03-2009 13:24
SYSSTATS_INFO FLAGS 1
SYSSTATS_MAIN CPUSPEEDNW 1000000
SYSSTATS_MAIN IOSEEKTIM 10
SYSSTATS_MAIN IOTFRSPEED 4096
SYSSTATS_MAIN SREADTIM 8.021
SYSSTATS_MAIN MREADTIM 12.93
SYSSTATS_MAIN CPUSPEED 1000000
SYSSTATS_MAIN MBRC 8
SYSSTATS_MAIN MAXTHR
SYSSTATS_MAIN SLAVETHR

13 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> explain plan for
2 select
3 max(val)
4 from
5 t1;

Explained.

Elapsed: 00:00:00.02
SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 2016 (0)| 00:00:17 |
| 1 | SORT AGGREGATE | | 1 | 4 | | |
| 2 | TABLE ACCESS FULL| T1 | 10000 | 40000 | 2016 (0)| 00:00:17 |
---------------------------------------------------------------------------

9 rows selected.

Elapsed: 00:00:00.17
SQL>
SQL> spool off

Using the known formula we can confirm the cost calculation above:

Blocks below HWM / MBRC: Number of multi-block read requests required to scan the segment

Number of multi-block read requests * MREADTIM = time it takes to perform this number of read requests in milliseconds.

Finally this is divided by SREADTIM to arrive at the known unit used for cost representation, which is the number of single-block read requests.

10,000 / 8 = 1,250 multi-block read requests

1,250 * 12.93 = 16,162.5 ms execution time (which is shown as 17 seconds in the plan by the way)

16,162.5 / 8.021 = 2,015.02 (which is shown as 2,016 in the plan)
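Again this arithmetic can be verified with a few lines (Python as a calculator; the inputs are the WORKLOAD values shown in SYS.AUX_STATS$ above):

```python
import math

# Measured WORKLOAD values from SYS.AUX_STATS$ above
sreadtim = 8.021   # ms per single-block read
mreadtim = 12.93   # ms per multi-block read (after the manual 10x adjustment)
mbrc = 8
blocks = 10000     # blocks below the HWM of T1

read_requests = blocks / mbrc            # 1250 multi-block read requests
exec_time_ms = read_requests * mreadtim  # execution time in ms
fts_cost = exec_time_ms / sreadtim       # cost in single-block read units

print(round(exec_time_ms, 1))  # 16162.5
print(round(fts_cost, 2))      # 2015.02
print(math.ceil(fts_cost))     # 2016, matching the plan
```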

Manually writing and maintaining system statistics

Using the DBMS_STATS.GET_SYSTEM_STATS / SET_SYSTEM_STATS API you can write your own set of system statistics for both NOWORKLOAD and WORKLOAD values.

You can use DBMS_STATS.DELETE_SYSTEM_STATS to remove the system statistics, which will activate the default NOWORKLOAD system statistics in 10g and disable CPU costing in 9i.

You can use DBMS_STATS.EXPORT_SYSTEM_STATS / IMPORT_SYSTEM_STATS to export and import system statistics to a user statistics table created with DBMS_STATS.CREATE_STAT_TABLE.

Note that DBMS_STATS.GATHER_SYSTEM_STATS behaves differently from e.g. DBMS_STATS.GATHER_TABLE_STATS when used with a user stats table (created with DBMS_STATS.CREATE_STAT_TABLE): whereas object-related statistics always go to the data dictionary (you only have the option to save the current statistics to the user stats table before replacing them with the new values), GATHER_SYSTEM_STATS actually writes the system statistics into the user stats table and doesn't change the actual system statistics if you supply a user stats table name.

For further discussion of how you could use system statistics, see Jonathan Lewis' thoughts on this topic:

Part 1
Part 2
Part 3

The next part of the series will cover the usage of system statistics in 9i, highlight some quirks and oddities observed and finally show what happens if you attempt to use multiple block sizes for "tuning" purposes.

Dynamic logging with global application context

Controlling logging output across sessions using global application context. September 2007 (updated April 2009)

Collaborate 09: Don’t miss these sessions

Collaborate 09 starts on Sunday, May 3 (a few days from now!) in Orlando. I’ve been offline for several weeks (more on that later), but will be returning to the world of computers and technology in full force in Orlando. I’ve had a few inquiries about whether or not I’ll be at Collaborate, so I thought I’d resurrect my blog with a post about where I’ll be and some of the highlights I see at Collaborate 09.

First, where I’ll be presenting:

  • Monday, 10:45-11:45am, #301, “Avoiding Common RAC Problems”
  • Tuesday, 9:45am-12pm, #332, “Installing RAC From The Ground Up”
  • Wednesday, 9:45-10:45am, #121, “Troubleshooting Oracle Clusterware”

I’m also currently the President of the Oracle RAC Special Interest Group (RAC SIG). The RAC SIG is hosting several great sessions (I’m moderating a couple of these panels) at Collaborate 09 as well:

  • Sunday, 6-7:30pm, IOUG/SIG Welcome Reception (each SIG will have representatives there–this is open to all IOUG attendees)
  • Monday, 8-9am, RAC SIG Orientation
  • Tuesday, 12:15-1:15pm, RAC SIG Birds of a Feather
  • Tuesday, 4:30-5:30pm, RAC SIG Expert Panel
  • Wednesday, 4:30-5:30pm, RAC SIG Customer Panel (not in online scheduler at the moment, check again later)
  • Thursday, 8:30am-12pm, RAC Attack (University Session – Additional fee required)

The RAC SIG has also assembled this list of RAC-related sessions at Collaborate 09 to help you plan your conference agenda.

Be sure to set up your personal agenda using the agenda builder and add these sessions to your agenda. I think that if you have these in your agenda and details (like date or room assignments) change, you’ll be notified via email (not sure, but I think that’s how it works).

Also, you can follow @IOUG on Twitter (follow me too if you’d like) and that will help you find where the action is during the event next week. It’s going to be a great event and I look forward to seeing you there!