Oakies Blog Aggregator

Enabling and Disabling Optimizer Bug Fixes

The following was recommended by Oracle support in connection with a specific bug. However, I think some Oracle users may find it useful when testing.

Oracle releases one-off bug fixes for the CBO quite frequently. You can find out which fixes are enabled in your own instance by inspecting the 10053 trace.
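If you are not familiar with event 10053, the trace can be generated for your own session with the standard event syntax (the statement of interest must be hard parsed while the event is active, and the trace file appears in user_dump_dest):

```sql
alter session set events '10053 trace name context forever, level 1';

-- run (and hard-parse) the statement of interest here

alter session set events '10053 trace name context off';
```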

For example:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = enabled
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

In the above example bug fix 1234567 is enabled. If you want to disable this fix dynamically you can try using:

ALTER SESSION SET "_fix_control" = '1234567:off';

The output of 10053 should change as follows:

fix 5385629 = enabled
fix 5705630 = disabled
fix 1234567 = disabled *
fix 6122894 = enabled
fix 5842686 = disabled
fix 6006300 = disabled

Note that the asterisk indicates a non-default setting for the fix.

To enable the fix again use:

ALTER SESSION SET "_fix_control" = '1234567:on';

Obviously this should only be used in a test environment, but it is a useful way to evaluate the impact of a bug fix without needing to repeatedly install and deinstall the patch - something that is under change control with a few of my larger customers.

As this is an unsupported parameter, you should consult Oracle Support before using it in a production environment.
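For what it's worth, on 10.2 and later you can also check the state of a given fix without generating a trace by querying V$SESSION_FIX_CONTROL. A sketch (restricting the result to your own session):

```sql
-- Check the current session's setting for one specific fix.
select bugno, value, description, is_default
from   v$session_fix_control
where  bugno = 1234567
and    session_id = sys_context('userenv', 'sid');
```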

Very good advice...

I've been reading Seth Godin's blog for many years.  He is a 'marketing' person with a lot of good old-fashioned common sense.  I agree with most of what he writes - and he just did a longish post (for Seth Godin, it was long).  It was on slack.

His two points in these unique times - if you find yourself unexpectedly with more free time than you had anticipated you should consider:

a) Continuing your education, learn something new.  As you go to interview and look around, people will ask you what you've been doing with your time.  If you can arrive at an interview with "I've been learning X in my free time" and be really excited about it - be able to converse about it, that'll be a really positive thing.

b) Participate - join the forums - become known.  I've said that myself many times in the past.


So think about that if you find you have more time on your hands than you anticipated having... Not bad ideas.  Even if you don't have a sudden abundance of free time - maybe find the time to do these two things.

Install to go-live, 3 days

This has been an interesting week, but not really that surprising.

I was called back to a previous client site where I had helped with some Oracle Application Server post-install configuration. In that previous visit, I got oriented to the environment they use and the packaged application they were deploying. The packaged application uses JSP, Oracle Forms, and Oracle Reports (possibly also Discoverer). The deployment environment is all Microsoft Windows servers with two Oracle Application Server homes per application server, since the vendor’s deployment requires that JSPs be deployed in a separate O_H from the Oracle Forms and Oracle Reports environment (that’s the first eyebrow-raise I did, but whatever).

This customer had an environment that was configured by the vendor for testing purposes and it works fine. However, it uses HTTP and they want to use HTTPS for all client-server traffic. They also wanted to be able to manage the environment and be better equipped to support it, so they left the vendor-installed environment as is and built a new environment on new servers so they’d get first-hand views of the install and configuration procedures. Since all the application servers are virtual machines, they could easily create additional machines.

My first job back a few weeks ago was to configure SSL for all the client-server communication. That took a couple of hours to finalize while explaining the steps I performed along the way. I also configured basic backups of the databases (these are the first and only Oracle databases in the environment). The customer then fought through finding all the places where URLs were either in database tables or hard-coded in properties and configuration files to make the appropriate changes for the new HTTPS URLs. We also configured the whole environment to use WebCache to load-balance the two application tiers (2 “Forms tier” and 2 “JSP tier” servers total). After much hunting and a little tracing of SQL to find the URLs in the DB, it all worked fine.

This week, the plan was to invite the vendor’s installation consultant along with me to a joint installfest. We would start on Monday morning with a database server that had databases installed, patched, and created along with two “brand new” application servers that had no Oracle software on them at all. The tasks were to install and configure the vendor’s application and go-live on it (cutting over the existing environments from the vendor-installed environment that was in use at the time) by Wednesday afternoon. That’s 3 days. Same week, same year…you read it right.

While this seems (and is) something of a horrible idea, it did have some logic that helped make it seem a little less bad:

  • They only had one module (payroll–so an important module) for one small sub-company (a small number of employees) live on the current system. The rest of the modules and other entities will go live later, following data migration testing and training.
  • They saw it as an opportunity to go live, maybe hit a few bumps, but have the two best resources to fix the bumps (the vendor installation consultant and me) onsite and ready to address whatever comes up. The alternative was to go live next week when they’d be there without additional support and have to address the issues “alone.”

The short version of the story is that they went live at around 5pm on Wednesday afternoon. True to their goal, it was three days from the first Oracle Universal Installer run to application go-live. Of course, they had done quite a lot of testing on other systems/environments, so they already had a list of workarounds and fixes to implement as part of the three-day plan.

While I still don’t believe that such a short implementation cycle was the best plan, it did happen and they went in eyes wide open knowing the risks and potential issues. Regardless, this is probably the fastest implementation go-live I’ve ever witnessed and can’t imagine any other ERP implementation project doing the same thing no matter how small it is!

The Evils of Encoding Meaning Into Data

About a year ago I worked on collating and transforming data from an application so that it could be imported into another app. I've performed this exercise a number of times in the past 20 or so years, and while it is never boring, it is sometimes quite challenging.

Often, when trying to make the data suitable for inclusion in the new application, I ask myself, "What were they thinking?"

I will leave the answer to that up to your imagination, as my answers to that particular question are not always complimentary.

One of the problems I have run into arises when the data modeler and database designer choose to allow data with encoded special meanings.

In other words, there is key data in the database, the meaning of which can only be ascertained by asking someone who knows, or finding it in the documentation (Is there documentation?)

The example I will use is a real one I ran into.

Given a table containing projects, and revisions of those projects that are indicated as such by a revision "number".

A sensible series of revision numbers might be a numeric range beginning with 1, or even an alpha range beginning with 'A', or even a combination thereof.

Personally, I prefer the unambiguous nature of an ascending numeric range. 2 is always greater than 1. There is no case involved as there would be with an alpha range.

Which is greater, "a1" or "A1"? You know how this will sort in the database, but will it be what the users expect?

While a numeric range would have been preferable, the data I was given to work with used a range of revision "numbers" that was both numeric and alpha, with a numeric value being regarded as greater than an alpha value. The "numeric" is in quotes, as this obviously must be stored as a string, and in this case with a leading zero.

Given this rule, a project with revisions of A,B,C,D,01,02 would have a most recent revision of "02". This is not the way it works in the standard collating order in the database.

11:34:03 SQL> create table rev (version varchar2(2));

Table created.

11:34:03 SQL>
11:34:03 SQL> insert into rev values('A');
11:34:03 SQL> insert into rev values('B');
11:34:03 SQL> insert into rev values('C');
11:34:03 SQL> insert into rev values('01');
11:34:03 SQL> insert into rev values('02');
11:34:03 SQL>
11:34:03 SQL> select * from rev order by version;

VE
--
01
02
A
B
C

5 rows selected.

In a perfect world, the data modeler or data architect would work with the users to create a reasonable versioning method. In this case however there is no choice but to work with what I was given.

From the data provided, only the most recent version was to be included in the imported data.
With a versioning system that doesn't follow the normal collating order, this requires some code to determine what is really the greatest version.

If you know a method to accomplish this in straight SQL, please feel free to post it. I could not think of a pure SQL solution.
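For what it's worth, one candidate pure-SQL approach (a sketch only, assuming Oracle 10g or later for REGEXP_LIKE, and assuming the numeric revisions are zero-padded to a fixed width as in the example data) is to build a sort key that ranks numeric revisions above alpha ones and pick the last value with MAX() ... KEEP (DENSE_RANK LAST). Against the maxrev_test table used in the test case below:

```sql
-- Numeric revisions get sort key 1, alpha revisions get 0, so the
-- "last" row per id is the greatest numeric rev if one exists,
-- otherwise the greatest alpha rev.
select id,
       max(rev) keep (
          dense_rank last
          order by case when regexp_like(rev, '^[0-9]+$') then 1 else 0 end,
                   rev
       ) as rev
from   maxrev_test
group  by id
order  by id;
```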

The following generic function was created to determine the greatest revision value given a table name and the key columns.

-- maxrev.sql
-- return the maximum numeric revision
-- return alpha if no numeric exists
-- maximum rev is numeric
-- revs are alpha - mixed number and character

create or replace function maxrev (
   table_name_in   varchar2,
   key_column_in   varchar2,
   value_column_in varchar2,
   key_value_in    varchar2
)
return varchar2
is
   v_max_rev    integer := 0;
   v_tmp_rev    integer;
   v_col_rev    varchar2(10);
   v_return_rev varchar2(30);
   type curType is ref cursor;
   l_cursor     curType;
   v_sql        varchar2(1000);

   function is_number( chk_data_in varchar2 )
   return boolean
   is
      dummy number(38,4);
   begin
      dummy := to_number(chk_data_in);
      return true;
   exception
      when value_error then
         return false;
      when others then
         raise;
   end;

begin
   -- get the maximum rev, whether alpha or numeric
   -- there may not be any numerics
   v_sql := 'select max(' || value_column_in || ') from '
      || table_name_in || ' where ' || key_column_in || ' = :1';

   execute immediate v_sql into v_return_rev using key_value_in;

   --return v_return_rev;

   v_sql := 'select ' || value_column_in || ' from '
      || table_name_in || ' where ' || key_column_in || ' = :1';

   open l_cursor for v_sql using key_value_in;
   loop
      fetch l_cursor into v_col_rev;
      exit when l_cursor%notfound;

      if (is_number(v_col_rev)) then
         v_tmp_rev := to_number(v_col_rev);
      end if;
      if (v_tmp_rev > v_max_rev) then
         v_max_rev := v_tmp_rev;
         v_return_rev := v_col_rev;
      end if;
   end loop;

   return v_return_rev;
end;
/

( Reviewing this function just now, I see what could be considered a programming error.
Let me know if you spot it. )

Here's a test case to prove that the function works as expected.

-- maxrev_test.sql
-- should always return numeric if it exists, otherwise alpha

drop table maxrev_test;

create table maxrev_test ( id varchar2(4), rev varchar2(2));

insert into maxrev_test values('T1', 'A');
insert into maxrev_test values('T1', 'B');
insert into maxrev_test values('T1', '01');
insert into maxrev_test values('T1', '02');
insert into maxrev_test values('T2', '01');
insert into maxrev_test values('T2', '02');
insert into maxrev_test values('T2', '03');
insert into maxrev_test values('T3', 'X');
insert into maxrev_test values('T3', 'Y');
insert into maxrev_test values('T3', 'Z');


select * from maxrev_test order by id,rev;

col rev format a10

prompt Incorrect results

select id, max(rev) rev
from maxrev_test
group by id
order by id;

prompt Correct results

select id, maxrev('maxrev_test','id','rev',id) rev
from maxrev_test
group by id
order by id;

And the results:

Incorrect results

ID   REV
---- ----------
T1   B
T2   03
T3   Z

3 rows selected.

Correct results

ID   REV
---- ----------
T1   02
T2   03
T3   Z

3 rows selected.


This error showed up in a trace file (udump) and caused all kinds of strange errors in the user application at a customer where I was working. A quick search on Metalink pointed me to Doc ID 601709.1, titled “What does the message PROTOCOL VIOLATION DETECTED mean in the trace files”.  The message seems to be related to bug 4669305 (which is not published by Oracle Support), but it moved me in the direction of ORA-600 [12333] and ORA-3106. There I found a simple test case where someone in SQL*Plus had a reproducible case: select 1 from dual where rownum < ###. This caused the message to show in the trace file and complain about the bind variable value ‘###’. That looked funny, as this was not a bind variable, but of course the database had CURSOR_SHARING set to SIMILAR or FORCE. My customer had CURSOR_SHARING set to SIMILAR. After we changed that to EXACT, the problem went away.
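In other words, the reproducible case and the fix looked something like this (the literal value is invented for illustration; the original one is immaterial):

```sql
-- With CURSOR_SHARING=SIMILAR (or FORCE), the literal below is replaced
-- by a system-generated bind variable, which triggered the trace-file
-- message complaining about the bind value:
select 1 from dual where rownum < 100;

-- Returning cursor sharing to the default resolved the problem:
alter system set cursor_sharing = exact;
```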


Performance Benefits of ReUse Statement Flag in Application Engine

I have achieved some significant performance improvements in some Application Engine programs by just enabling the ReUse Statement flag on certain steps. I thought I would share a recent example of how effective this can be.

I don't think I can improve on the description of this feature in PeopleBooks:
"One of the key performance features of PeopleSoft Application Engine is the ability to reuse SQL statements by dedicating a persistent cursor to that statement.

Unless you select the ReUse property for a SQL action, %BIND fields are substituted with literal values in the SQL statement. The database has to recompile the statement every time it is executed.

However, selecting ReUse converts any %BIND fields into real bind variables (:1, :2, and so on), enabling PeopleSoft Application Engine to compile the statement once, dedicate a cursor, and re-execute it with new data multiple times. This reduction in compile time can result in dramatic improvements to performance.

In addition, some databases have SQL statement caching. Every time they receive SQL, they compare it against their cache of previously executed statements to see if they have seen it before. If so, they can reuse the old query plan. This works only if the SQL text matches exactly. This is unlikely with literals instead of bind variables."

In fact most databases do this, and Oracle certainly does.

On Oracle, you could enable CURSOR_SHARING. Then Oracle effectively replaces the literals with bind variables at parse time. However, I certainly would not recommend doing this database-wide. Whenever I have tried this on a PeopleSoft system, it has had severe negative effects elsewhere. I have enabled cursor sharing at session level for specific batch programs (using a trigger), but even then it is not always beneficial.
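For reference, the session-level approach is typically done with something like the logon trigger sketched below. This is a minimal sketch only: BATCH_USER is a placeholder account name, and creating a database-level trigger requires the ADMINISTER DATABASE TRIGGER privilege.

```sql
-- Enable cursor sharing only for sessions of a specific batch account,
-- rather than database-wide.
create or replace trigger batch_cursor_sharing
after logon on database
begin
   if sys_context('userenv', 'session_user') = 'BATCH_USER' then
      execute immediate 'alter session set cursor_sharing = force';
   end if;
end;
/
```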

Instead, I do recommend using the ReUse Statement flag wherever possible. It cannot just be turned on indiscriminately, the same section in PeopleBooks goes on to describe some limitations (which is probably why the default value for the flag is false).
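To make the difference concrete, here is the kind of SQL the database sees in each case (the table name and values are invented for illustration):

```sql
-- Without ReUse Statement: each %BIND value is inlined as a literal,
-- so every iteration of the loop produces a brand-new statement that
-- must be parsed:
update ps_example_tbl set status = 'P' where emplid = 'KU0001';
update ps_example_tbl set status = 'P' where emplid = 'KU0002';

-- With ReUse Statement: %BINDs become real bind variables, so the
-- statement is compiled once and re-executed with new values:
update ps_example_tbl set status = :1 where emplid = :2;
```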

To illustrate the kind of improvement you can obtain, here is a real-life example. This is an extract from the batch timings report at the end of the Application Engine trace file. We are interested in statements with the high compile count.

ReUse Statement is not enabled on these four steps. They account for more than 50% of the overall execution time.

                          PeopleSoft Application Engine Timings
(All timings in seconds)

                               C o m p i l e    E x e c u t e    F e t c h        Total
SQL Statement                    Count    Time    Count    Time    Count    Time     Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
99XxxXxx.Step02.S                 8453      2.8    8453    685.6       0      0.0    688.4
99XxxXxx.Step03.S                 8453      5.0    8453   2718.8       0      0.0   2723.8
99XxxXxx.Step05.S                 8453      0.9    8453    888.4       0      0.0    889.3
99XxxXxx.Step06.S                 8453      0.4    8453     17.4       0      0.0     17.8

Total run time                   :   8416.4
Total time in application SQL    :   8195.0
Percent time in application SQL  :    97.4%
Total time in PeopleCode         :    192.7
Percent time in PeopleCode       :     2.3%
Total time in cache              :      8.7
Number of calls to cache         :     8542

Now, I have enabled ReUse Statement on these steps. I have not changed anything else.

                               C o m p i l e    E x e c u t e    F e t c h        Total
SQL Statement                    Count    Time    Count    Time    Count    Time     Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
99XxxXxx.Step02.S                    1      0.0    8453    342.3       0      0.0    342.3
99XxxXxx.Step03.S                    1      0.0    8453     83.3       0      0.0     83.3
99XxxXxx.Step05.S                    1      0.0    8453      8.7       0      0.0      8.7
99XxxXxx.Step06.S                    1      0.0    8453      7.6       0      0.0      7.6

Total run time                   :   5534.1
Total time in application SQL    :   5341.7
Percent time in application SQL  :    96.5%
Total time in PeopleCode         :    190.8
Percent time in PeopleCode       :     3.4%
Total time in cache              :      1.1
Number of calls to cache         :       90

Notice that:

  • The number of compilations for each step has gone down to 1, though the number of executions remains the same
  • The execution time for the first three statements has fallen by nearly 90%.
  • The improvement in the 4th statement is quite modest because it did not contain any bind variables, but clearly some of the time reported as execution time by Application Engine is associated with the preparation of a new SQL statement.

To emphasise the point, let's look at the effect on the database. The following are extracts from the TKPROF output for Oracle SQL trace files for these processes.

First the TKPROF without ReUse Statement:


call      count       cpu    elapsed       disk      query    current       rows
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
Parse    101063   2600.60    2602.83       6197     661559          4          0
Execute  101232   1817.96    3787.17    1572333   73729207   10617830    4770112
Fetch     96186    385.41    1101.47     374425   25986600          0      96285
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
total    298481   4803.97    7491.48    1952955  100377366   10617834    4866397

Misses in library cache during parse: 25498
Misses in library cache during execute: 90

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                   1199472        0.36       2601.83
  SQL*Net message from client                130345        1.57        296.50
  db file scattered read                       8816        0.39        171.47


call      count       cpu    elapsed       disk      query    current       rows
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
Parse    100002     13.51      13.57         17        820         94          0
Execute  131495     30.00      31.31       7025      29277      21164      74315
Fetch    141837    218.77     295.49     159969    3039304         12     519406
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
total    373334    262.28     340.38     167011    3069401      21270     593721

160446  user SQL statements in session.
 70478  internal SQL statements in session.
230924  SQL statements in session.

And now with ReUse Statement set on only those four steps


call      count       cpu    elapsed       disk      query    current       rows
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
Parse     67238     10.24      10.75         47       4415          9          0
Execute  101160   1650.25    4040.88    1766325  129765633   11160830    4781797
Fetch     96123    385.50    1024.50     372737   26097251          0     103844
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
total    264521   2045.99    5076.14    2139109  155867299   11160839    4885641

Misses in library cache during parse: 73
Misses in library cache during execute: 21

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                   1506834        0.61       2839.19
  SQL*Net message from client                130312        1.53        258.81
  db file scattered read                       8782        0.37        147.01


call      count       cpu    elapsed       disk      query    current       rows
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
Parse      1331      0.46       0.46          0        173         16          0
Execute    4044      2.72       5.82      12923      33374      24353     113323
Fetch      5697      8.38      13.43      15550      55431         12      13449
-------  ------  --------  ---------  ---------  ---------  ---------  ---------
total     11072     11.56      19.72      28473      88978      24381     126772

67425  user SQL statements in session.
 3154  internal SQL statements in session.
70579  SQL statements in session.

  • Nearly all the saving is in parse time of non-recursive statements, the rest is the reduction of recursive SQL because there is less parsing.
  • There is less parsing, because there are fewer different SQL statements submitted by Application Engine. The number of user statements has fallen from 160446 to 67425.
  • The number of misses on the library cache has fallen from 25498 to just 73.
  • There has been a reduction in SQL*Net message from client (database idle time) from 296 seconds to 259 because the Application Engine program spends less time compiling SQL statements.


Enabling ReUse Statement can have a very significant effect on the performance of Application Engine batches. It is most effective when SQL statements with %BIND() variables are executed within loops. Otherwise, for each execution of the loop, Application Engine must recompile the SQL statement with different bind variable values, which the database will treat as a new statement that must be parsed.

SQL parsing is CPU intensive. Reducing excessive parsing also reduces CPU consumption on the database server. It can also reduce physical I/O to the database catalogue. On PeopleSoft 8.x applications that use Unicode, the overhead of parsing is magnified by the use of length checking constraints on all character columns. This is no longer an issue in version 9 applications, which use character semantics.

If you use Oracle's Automatic Memory Management, excessive parsing can cause the database to allocate more memory to the Shared Pool at the expense of the Block Buffer Cache. This in turn can increase physical I/O and can degrade query performance.

Bind Variables are a good thing. You should use them. Therefore, ReUse Statement is also a good thing. You should use that too!

Advanced Query Optimization Question

Having now had a couple of days to rest and reflect, I can definitively say that SEOUC was great.  I’m really glad I attended, was able to meet a lot of new people, and have some good discussions about problems people faced both technically and given the current economy.  I was also able to get some […]

Dad, do I really need math?

My kids are pretty good about their math homework. They seem to enjoy it for the most part. It wasn't always that way. When the going gets tough, the natural human response, it seems, is to quit. So at times in our kids' school careers, their Mom and I have had to hang tough with them to try to make them do their homework. (The credit here belongs to their Mom.)

I remember when I was in school, the prevailing attitude in the classroom was, "When are we ever going to need to know this?" The much sadder one was, "My Mom and Dad said that I'm never going to need to know this stuff."

I couldn't have told you, when I was 10 years old, that I'd need to understand queueing theory one day in order to finish an Oracle project I had to do for Fidelity Investments. Or that I'd be able to win a Jim Sundberg autographed World Series baseball by using the distributive law of multiplication in my head while he was showing 400 people how Gaylord Perry liked his signs on the mound. It didn't matter to me, because I just had faith that there was a good reason I was supposed to learn everything I could in school. Having that particular faith was no accident.

I don't remember my Mom and Dad ever forcing me into doing math. I knew, of course, that it was My Job to do as well as I could in school ('A's are loafing unless they're '100's). But I don't remember ever feeling forced.

One of the things I fondly remember my Dad doing with me was glide slope calculation. Dad flew for many years for United Airlines. He retired as a 767 captain a long time ago. One of his priorities as a professional was to conserve fuel for his employer. It used to bug him when a pilot would constantly monkey around with the throttle during the approach to a landing. My Dad told me his goal on approach was to dial back the power one time at cruise altitude, at the very beginning of the descent, and then never touch it again until he turned on the thrust reversers after touchdown.

So he played this game with me, especially on car rides, because it was a 30-minute drive each day to where I went to grade school. He'd give me the altitude we were at and the altitude we needed to descend to, and either a time limit or the number of miles outbound we were. Then he'd ask me to calculate the sink rate in my head. He put triangles into my brain that I could see every time he asked me a question like that, and I'd hatch on it with him until we came up with the right sink rate. Or he would ask me things like, if the nose is pointing to heading 026, then what heading is our tail pointed at. So he put circles into my brain, too.

Every once in a while—oh, and I loved this—he would give me a paper flight plan form, with dozens of tiny cells to fill in, and I would fill them all in. I was 6 or 7 when we were doing that. I of course didn't know how to do it correctly, but I filled it all in anyway. Whenever I was really worried about doing it "right," I'd ask my Dad, and he'd tell me the kinds of things I should write down and which cells I should write them in.

You know the biggest value of that flight planning experience? It was that I couldn't wait to find out in school someday what point meant. You know, as in "three point five." I remember the day in class when a teacher finally taught us about decimal points. I felt sooo cool because now I knew what "three point five" actually meant.

My Dad did things with me that got me interested and excited about doing math, all on my own, without making me feel like I was being punished by it. Thus the abundance of wonderful opportunities that I have today are largely a continuing gift from him. I hope that another gift he gave me is the ability to be a good enough dad myself for my own kiddos, but of course I worry that I'm not doing it enough, or well enough. Telling stories about it helps remind me how important it is.

What reminded me of all this is a little document called "A Short Course in Human Relations," autographed by Bobby Bragan. It sits here in the foyer of our Method R office. I see it every single time I walk through our door. You've probably heard the following statement:

Say you were standing with one foot in the oven and one foot in an ice bucket. According to the percentage people, you should be perfectly comfortable.

Bobby Bragan said that; I think it was in 1963. It is a classic illustration of skew, which is vitally important to my career. Bobby Bragan, though, is an American hero for lots of good reasons. You should read about him.

Well, one night a few years ago, I got to watch Bobby Bragan speak to a small group. His talk was fascinating. He brought a huge box of stuff up to the podium with him, and he warmed up with a game. He opened by pulling something out of the box and saying whoever can answer this riddle gets the prize. The first one was something like, "What has eighteen legs and two breasts?" Shocker, right? The answer was The Supreme Court. Whoever said that, Bobby Bragan tossed him the first prize of the night.

Pretty deep into his speech, he must have given out twenty prizes to people. Not me. I either didn't know the answer, or I didn't say it loud enough or fast enough. I watched prize after prize go out, until he brought out this autographed document called "A Short Course in Human Relations." He read it aloud. It was an important part of his speech. And then he asked the question that went with it: "Nine ballplayers come out of the dugout before each game, and each ballplayer shakes the hand of every teammate. How many handshakes is that?" The voice that said "thirty-six" was mine. I was doggone lucky that Bobby Bragan had asked a bunch of baseball players a math question, and right on the prize that I really wanted, too.

Math. You really never know when you're going to need it.

Security, Forecasting Oracle Performance and Some stuff to post… soon…

I’ve been busy this February “playing around/studying” on the following:

1) Oracle Security products (Advanced Security Option, Database Vault, Audit Vault, Data Masking, etc.). Well, every organization must guard its digital assets against any threat (external or internal), because once compromised it could lead to negative publicity, lost revenue, litigation, loss of trust.. and the list goes on.. I’m telling you, Oracle has a lot to offer (breadth of products and features, some of them even free!) in this area and you just need the knowledge to stitch them together..

I’ll recommend a great book on Security, I believe everyone should have this (developers, DBAs, CxOs)..!
It’s a book by David Knox which is the chief engineer in Oracle’s Information Assurance Center (IAC). The IAC is a center of expertise that works with Oracle’s customers, partners, development, and consulting to design and develop security and high-availability solutions. Having a long time hands-on practice (working on Oracle security ever since) he was able to architect and implement solutions on organizations like United States Department of Defense and CIA.. and produce a book called “Effective Oracle Database 10g Security by Design”


2) The other one is Forecasting Oracle Performance… This is a book by Craig Shallahamer, a former Oracle employee. Back then he was with the System Performance Group (one of the largest and best collections of Oracle performance experts in the world), together with Cary Millsap (author of Optimizing Oracle Performance) and a few others..

Cary’s book is great; I’ve already read it and it will change the way you think about performance and tuning.. This is the only book that discusses Extended SQL Trace in detail, the ins and outs and how to use it in different scenarios. This book also gave me a primer on queueing theory.. it tackles the M/M/m queueing model, but I felt it was more of an overview and not very detailed; at least I got the picture of how useful it is..

Now presenting Craig’s work of art.. Forecasting Oracle Performance..


If you’re an Oracle professional, and you love complexities, and you like to be challenged, and you want to see what’s ahead of you (in performance), and you love math (well, I’m not really good at it, my brother is).. better read this!

Honestly, I have two more performance/capacity planning books, which I’ll still have to figure out how I am going to use (specifically the formulas) in some of my performance tuning activities. These are:


  • Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services
  • Performance by Design: Computer Capacity Planning By Example

Craig’s book is focused on the complex Oracle environment..and the book has contents like the following:

  • Learn how you manage your Service Levels or how to make Service Levels by using your workload samples,
  • at what workload and utilization you’ll enter the “elbow of the curve”
  • should you acquire, more CPUs (same speed)? or just replace it with faster CPUs (same number)? or both?
  • should you acquire, more disks (same speed)? or just replace it with faster disks (same number)? or both?
  • If there would be 200% additional workload, what will be the effect on response time?.. should I buy more CPUs? manage the workload? tune the application?
  • Learn basic statistics (sample and population, mean and weighted average, standard deviation, skew, histograms, inferences, residual analysis, confidence level and interval).. And you’ll learn to make statements like.. 90% OF THE TIME, THE QUERY TAKES AN AVERAGE OF 3 SECONDS PLUS OR MINUS 2 SECONDS
  • Little’s law and Kendall’s notation
  • and many more….

You might be overwhelmed, but the forecasting work in this book is methodical and tackled gradually. Chapter by chapter you’ll see yourself progressing, without the pain of trying to understand something beyond human cognition..


Dang it, people, they're syscalls, not "waits"...

So many times, I see people get really confused about how to attack an Oracle performance problem, resulting in thoughts that look like this:

I don't understand why my program is so slow. The Oracle wait interface says it's just not waiting on anything.

The confusion begins with the name "wait event." I wish Oracle hadn't called them that. I wish instead of WAIT in the extended SQL trace output, they had used the token SYSCALL. Ok, that's seven bytes of trace data instead of just four, so maybe OS instead of WAIT. I wish that they had called v$session_wait either v$session_syscall or v$session_os.

Here's why. First, realize that an Oracle "wait event" is basically the instrumentation for one operating system subroutine call ("syscall"). For example, the Oracle event called db file sequential read: that's instrumentation for a pread call on our Linux box. On the same system, a db file scattered read covers a sequence of two syscalls: _llseek and readv (that's one reason why I said basically at the beginning of this paragraph). The event called enqueue: that's a semtimedop call.
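To make the "instrumentation for one syscall" idea concrete, here is a minimal sketch, in Python rather than Oracle kernel code, of what that instrumentation amounts to: timestamp before the syscall, timestamp after, emit a WAIT-style line with the elapsed time. The trace line format is approximated, not the exact 10046 format.

```python
import os
import tempfile
import time

def traced_pread(fd, nbytes, offset, trace):
    """Time one pread() syscall and emit a WAIT-style line, roughly the
    way Oracle instruments 'db file sequential read' (format approximated)."""
    start = time.perf_counter()
    data = os.pread(fd, nbytes, offset)   # the actual syscall being "waited" on
    elapsed_us = int((time.perf_counter() - start) * 1_000_000)
    trace.append(f"WAIT: nam='db file sequential read' ela={elapsed_us}")
    return data

# Demo: read one 8 KB "block" back from a scratch file.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * 8192)
trace = []
block = traced_pread(fd, 8192, 0, trace)
os.close(fd)
os.remove(path)
print(trace[0])
```

The point of the sketch: what gets recorded is the elapsed time of the syscall itself, nothing more.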

Second, the word wait is easy to misinterpret. To the Oracle kernel developer who wrote the word WAIT into the Oracle source code, the word connoted the duration that the code path he was writing would have to "wait" for some syscall to return. But to an end-user or performance analyst, the word wait has lots of other meanings, too, like (to name just two):

  1. How long the user has to wait for a task to complete (this is R in the R = S + W equation from queueing theory).
  2. How long the user's task queues for service on a specific resource (this is W in the R = S + W equation from queueing theory).
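For the queueing-theory notation above, a tiny worked example helps. Assuming a simple M/M/1 model (my choice for illustration, not something from the post), R, S, and W relate like this:

```python
# M/M/1 single-server queue: response time R = S / (1 - U),
# where S is service time and U is utilization. Queueing delay W = R - S,
# so R = S + W as in the post's notation. All numbers below are assumed.
S = 0.010                 # service time: 10 ms per request
arrival_rate = 80.0       # requests per second
U = arrival_rate * S      # utilization = 0.8

R = S / (1 - U)           # response time
W = R - S                 # time spent queueing for the resource

print(f"U={U:.2f}  S={S*1000:.0f}ms  W={W*1000:.0f}ms  R={R*1000:.0f}ms")
```

At 80% utilization, a 10 ms service time turns into a 50 ms response time, 40 ms of which is queueing delay W. An Oracle "wait" time is the R of the syscall itself, not this W, which is exactly the trap described below.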

The problem is that, as obvious and useful as these two definitions seem, neither one of them means what the word wait means in an Oracle context, which is:

wait n. In an Oracle context, the approximate response time of what is usually a single operating system call (syscall) executed by an Oracle kernel process.

That's a problem. It's a big problem when people try to stick Oracle wait times into the W slot of mathematical queueing models. Because they're not W values; they're R values. (But they're not the same R values as in #1 above.)

But that's a digression from a much more important point: I think the word wait simply confuses people into thinking that response time is something different than what it really is. Response time is simply how long it takes to execute a given code path.

To understand response time, you have to understand code path.

This is actually the core tenet that divides people who "tune" into two categories: people who look at code path, and people who look at system resources.

Here's an example of what code path really looks like, for an Oracle process:

begin prepare (dbcall)
execute Oracle kernel code path (mostly CPU)
maybe make a syscall or two (e.g., "latch: library cache")
maybe even make recursive prepare, execute, or fetch calls (e.g., view resolution)
end prepare
maybe make a syscall or two (e.g., "SQL*Net message...")
begin execute (another dbcall)
execute Oracle kernel code path
maybe make some syscalls (e.g., "db file sequential read" for updates)
end execute
maybe make a syscall or two
begin fetch (another dbcall)
execute Oracle kernel code path (acquire latches, visit the buffer cache, ...)
maybe make some syscalls (e.g., "db")
end fetch
make a syscall or two

The trick is, you can't see this whole picture when you look at v$whatever within Oracle. You have to look at a lot of v$whatevers and do a lot of work reconciling what you find, to come up with anything close to a coherent picture of your code path.

But when you look at the Oracle code path, do you see how the syscalls just kind of blend in with the dbcalls? It's because they're all calls, and they all take time. It's non-orthogonal thinking to call syscalls something other than what they really are: just subroutine calls to another layer in the software stack. Calling all syscalls waits diminishes the one distinction that I think really actually is important; that's the distinction between syscalls that occur within dbcalls and the syscalls that occur between dbcalls.

It's the reason I like extended SQL trace data so much: it lets me look at my code path without having to spend a bunch of extra time trying to compose several different perspectives of performance into a coherent view. The coherent view I want is right there in one place, laid out sequentially for me to look at, and that coherent view fits what the business needs to be looking at, as in...
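As an illustration of why that sequential view is convenient, here is a hedged sketch of summarizing WAIT lines from a 10046-style extended SQL trace. The sample lines are simplified approximations of the real format (real traces carry more fields, and ela is in microseconds in modern releases):

```python
import re
from collections import defaultdict

# Simplified, made-up sample of extended SQL trace WAIT lines.
trace_lines = """\
WAIT #1: nam='db file sequential read' ela= 4215 ...
WAIT #1: nam='db file sequential read' ela= 3980 ...
WAIT #1: nam='SQL*Net message from client' ela= 120000 ...
""".splitlines()

pattern = re.compile(r"nam='([^']+)' ela=\s*(\d+)")

totals = defaultdict(int)
for line in trace_lines:
    m = pattern.search(line)
    if m:
        totals[m.group(1)] += int(m.group(2))   # sum elapsed microseconds per event

for event, ela in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{event:35s} {ela / 1_000_000:.6f}s")
```

Because the trace is one sequential record of the code path, a summary like this falls out of a few lines of parsing, with no cross-view reconciliation needed. Note also how the "SQL*Net message from client" time (between dbcalls) stays distinguishable from the read times (within dbcalls).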

Scene 1:

  • Business person: Our TPS Report is slow.
  • Oracle person: Yes, our system has a lot of waits. We're working on it.
  • (Later...) Oracle person: Great news! The problem with the waits has been solved.
  • Business person: Er, but the TPS Report is still slow.

Scene 2:

  • Business person: Our TPS Report is slow.
  • Oracle person: I'll look at it.
  • (Later...) Oracle person: I figured out your problem. The TPS Report was doing something stupid that it didn't need to do. It doesn't anymore.
  • Business person: Thanks; I noticed. It runs in, like, only a couple seconds now.