Oakies Blog Aggregator

Reducing Unnecessary Instances of Temporary Records

In a previous posting, I recommended moving temporary records used by Application Engine programs to a tablespace with a 32Kb block size, and using a larger uniform extent size (I chose 1Mb).

However, to continue the example from my last posting on this subject, TL_TIMEADMIN has 150 temporary records, which have 173 indexes between them (in HCM8.9). So you get 323 segments for every instance set in the Application Engine properties, and at 1Mb per uniform extent that would consume at least 323Mb of the 32Kb tablespace. If that space consumption is not a problem, then stop reading now.
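
You can put a figure on that with a quick query against DBA_SEGMENTS (a sketch: PSTEMP32K is my assumed name for the 32Kb tablespace - substitute your own):

SELECT COUNT(*) segments
,      SUM(bytes)/1048576 mb
FROM   dba_segments
WHERE  tablespace_name = 'PSTEMP32K';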

However, I noticed that some temporary records are used by several Application Engine programs, usually because one program calls another and the temporary records are referenced in both. If both programs have a number of instances of temporary records defined in their properties, then temporary tables will be built for both.

Let's take an example: TL_PUB_TM_AE calls TL_PUB_TM1. They are both delivered with 5 instances.

AE_APPLID    TEMPTBLINSTANCES
------------ ----------------
TL_PUB_TM1                  5
TL_PUB_TM_AE                5

They share 8 temporary records in common.

SELECT a.recname, a.ae_applid, b.ae_applid
FROM psaeappltemptbl a
FULL OUTER JOIN psaeappltemptbl b
ON a.recname = b.recname
AND b.ae_applid = 'TL_PUB_TM_AE'
WHERE a.ae_applid = 'TL_PUB_TM1'
ORDER BY 1
/

RECNAME         AE_APPLID    AE_APPLID
--------------- ------------ ------------
TL_PROF_LIST    TL_PUB_TM1
TL_PROF_WRK     TL_PUB_TM1
WRK_PROJ1_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ2_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ3_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ4_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ5_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ6_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ7_TAO   TL_PUB_TM1   TL_PUB_TM_AE
WRK_PROJ_TAO    TL_PUB_TM1   TL_PUB_TM_AE

So five instances of each temporary record are built by Application Designer for each of these Application Engine programs. But TL_PUB_TM1 is never run on its own. So do you need the extra instances of those 8 shared temporary records? The temporary records defined on TL_PUB_TM_AE are a subset of those on TL_PUB_TM1. If you reduced the number of instances on TL_PUB_TM_AE to 0, you would still have the 5 instances defined on TL_PUB_TM1, but you would be able to drop 40 tables (8 records x 5 instances) and their indexes.

So, I started to wonder if there was a general principle here. If the temporary records on an Application Engine program are a subset of those on another program, then, provided you ensure the number of instances on the superset is not less than the number on the subset, you can reduce the number of instances on the subset to 0.

This view reports Application Engine programs whose temporary records are a subset of those on another program, and also counts the number of records in the subset.

CREATE OR REPLACE VIEW gfc_aetemptbl_hier AS
SELECT sup.ae_applid sup_applid, supa.temptblinstances sup_instances
,      sub.ae_applid sub_applid, suba.temptblinstances sub_instances
,      (SELECT COUNT(*)
        FROM   psaeappltemptbl supc, psaeappltemptbl subc
        WHERE  supc.ae_applid = sup.ae_applid
        AND    subc.ae_applid = sub.ae_applid
        AND    subc.recname = supc.recname) num_records
FROM   (SELECT DISTINCT ae_applid FROM psaeappltemptbl) sup
,      (SELECT DISTINCT ae_applid FROM psaeappltemptbl) sub
,      psaeappldefn supa
,      psaeappldefn suba
WHERE  sup.ae_applid != sub.ae_applid
AND    supa.ae_applid = sup.ae_applid
AND    suba.ae_applid = sub.ae_applid
AND    EXISTS( /*a temporary record in common*/
       SELECT 'x'
       FROM   psaeappltemptbl sup1
       ,      psaeappltemptbl sub1
       WHERE  sub1.ae_applid = sub.ae_applid
       AND    sup1.ae_applid = sup.ae_applid
       AND    sup1.recname = sub1.recname
       AND    ROWNUM = 1)
/*there is no record in the subset that is not in the superset*/
AND    NOT EXISTS(
       SELECT 'x'
       FROM   psaeappltemptbl sub2
       WHERE  sub2.ae_applid = sub.ae_applid
       AND    NOT EXISTS(
              SELECT 'x'
              FROM   psaeappltemptbl sup2
              WHERE  sup2.ae_applid = sup.ae_applid
              AND    sub2.recname = sup2.recname
              AND    ROWNUM = 1)
       AND    ROWNUM = 1)
/*there is a record in the superset that is not in the subset - so the sets differ*/
AND    EXISTS(
       SELECT 'x'
       FROM   psaeappltemptbl sup2
       WHERE  sup2.ae_applid = sup.ae_applid
       AND    NOT EXISTS(
              SELECT 'x'
              FROM   psaeappltemptbl sub2
              WHERE  sub2.ae_applid = sub.ae_applid
              AND    sup2.recname = sub2.recname
              AND    ROWNUM = 1)
       AND    ROWNUM = 1)
ORDER BY 1,2;

This is the output from the view for the Application Engine programs in the example.

SUP_APPLID   SUP_INSTANCES SUB_APPLID   SUB_INSTANCES NUM_RECORDS
------------ ------------- ------------ ------------- -----------
TL_PUB_TM1               5 TL_PUB_TM_AE             5           8
TL_PUB_TM1               5 TL_PY_PUB_TM             5           5
TL_PUB_TM_AE             5 TL_PY_PUB_TM             5           5

I found that some Application Engine programs have identical sets of temporary records. This can happen when a program is cloned, which some customers do when they want to customise a vanilla program. This view reports on them.

CREATE OR REPLACE VIEW gfc_aetemptbl_eq AS
SELECT sup.ae_applid sup_applid, supa.temptblinstances sup_instances
,      sub.ae_applid sub_applid, suba.temptblinstances sub_instances
,      (SELECT COUNT(*)
        FROM   psaeappltemptbl supc, psaeappltemptbl subc
        WHERE  supc.ae_applid = sup.ae_applid
        AND    subc.ae_applid = sub.ae_applid
        AND    subc.recname = supc.recname) num_records
FROM   (SELECT DISTINCT ae_applid FROM psaeappltemptbl) sup
,      (SELECT DISTINCT ae_applid FROM psaeappltemptbl) sub
,      psaeappldefn supa
,      psaeappldefn suba
WHERE  sup.ae_applid < sub.ae_applid
AND    supa.ae_applid = sup.ae_applid
AND    suba.ae_applid = sub.ae_applid
AND    EXISTS( /*a temporary record in common*/
       SELECT 'x'
       FROM   psaeappltemptbl sup1
       ,      psaeappltemptbl sub1
       WHERE  sub1.ae_applid = sub.ae_applid
       AND    sup1.ae_applid = sup.ae_applid
       AND    sup1.recname = sub1.recname
       AND    ROWNUM = 1)
/*there is no record in the subset that is not in the superset*/
AND    NOT EXISTS(
       SELECT 'x'
       FROM   psaeappltemptbl sub2
       WHERE  sub2.ae_applid = sub.ae_applid
       AND    NOT EXISTS(
              SELECT 'x'
              FROM   psaeappltemptbl sup2
              WHERE  sup2.ae_applid = sup.ae_applid
              AND    sub2.recname = sup2.recname
              AND    ROWNUM = 1)
       AND    ROWNUM = 1)
/*there is no record in the superset that is not in the subset*/
AND    NOT EXISTS(
       SELECT 'x'
       FROM   psaeappltemptbl sup2
       WHERE  sup2.ae_applid = sup.ae_applid
       AND    NOT EXISTS(
              SELECT 'x'
              FROM   psaeappltemptbl sub2
              WHERE  sub2.ae_applid = sub.ae_applid
              AND    sup2.recname = sub2.recname
              AND    ROWNUM = 1)
       AND    ROWNUM = 1)
ORDER BY 1,2;

Here, three programs share the same set of temporary records.

SUP_APPLID   SUP_INSTANCES SUB_APPLID   SUB_INSTANCES NUM_RECORDS
------------ ------------- ------------ ------------- -----------
ELEC_TSCRPT             20 E_TSCRPT_BAT            20           2
ELEC_TSCRPT             20 E_TSCRPT_LIB            20           2
E_TSCRPT_BAT            20 E_TSCRPT_LIB            20           2

I can use these views to set the instances on the subsets to 0, increasing the instances on the supersets as necessary. There are examples both of subsets within subsets and of more than two Application Engine programs sharing the same set of temporary records, so I do this repeatedly until all the subsets have zero instances.

This PL/SQL script makes the updates to the Application Engine programs (including maintaining PeopleSoft version numbers), and also creates an Application Designer project called GFC_TTI with all the programs and records. This project can then be used to migrate the Application Engine programs to another environment.

DECLARE
  l_any         BOOLEAN;
  l_projectname VARCHAR2(30 CHAR) := 'GFC_TTI';
  l_version_aem INTEGER;
  l_version_pjm INTEGER;

  /*insert an object into the Application Designer project, ignoring duplicates*/
  PROCEDURE projitem(objecttype   INTEGER
                    ,objectid1    INTEGER
                    ,objectvalue1 VARCHAR2) IS
  BEGIN
    INSERT INTO psprojectitem
    (projectname ,objecttype
    ,objectid1 ,objectvalue1 ,objectid2 ,objectvalue2
    ,objectid3 ,objectvalue3 ,objectid4 ,objectvalue4
    ,nodetype ,sourcestatus ,targetstatus ,upgradeaction ,takeaction ,copydone)
    VALUES
    (l_projectname,objecttype
    ,objectid1, objectvalue1, 0, ' '
    ,0, ' ', 0, ' '
    ,0,0,0,0,1,0);
  EXCEPTION WHEN dup_val_on_index THEN NULL;
  END;

BEGIN
  /*increment the PeopleSoft version numbers*/
  UPDATE psversion
  SET    version = version+1
  WHERE  objecttypename IN('SYS','AEM','PJM');

  UPDATE pslock
  SET    version = version+1
  WHERE  objecttypename IN('SYS','AEM','PJM');

  SELECT version
  INTO   l_version_aem
  FROM   psversion
  WHERE  objecttypename = 'AEM';

  SELECT version
  INTO   l_version_pjm
  FROM   psversion
  WHERE  objecttypename = 'PJM';

  /*work up the hierarchy: zero each subset program whose own subsets
    (if any) have already been zeroed*/
  l_any := TRUE;
  WHILE l_any LOOP
    l_any := FALSE;
    FOR i IN(
      SELECT *
      FROM   gfc_aetemptbl_hier a
      WHERE  a.sub_instances > 0
      AND    NOT EXISTS(
             SELECT 'x'
             FROM   gfc_aetemptbl_hier b
             WHERE  b.sup_applid = a.sub_applid
             AND    b.sub_instances > 0
             AND    ROWNUM = 1)
      ORDER BY 1
    ) LOOP
      UPDATE psaeappldefn x
      SET    temptblinstances = GREATEST(x.temptblinstances
                                        ,i.sub_instances,i.sup_instances)
      ,      version = l_version_aem
      ,      lastupddttm = SYSDATE
      WHERE  ae_applid = i.sup_applid;

      projitem(33,66,i.sup_applid);

      UPDATE psaeappldefn x
      SET    temptblinstances = 0
      ,      version = l_version_aem
      ,      lastupddttm = SYSDATE
      WHERE  ae_applid = i.sub_applid;

      projitem(33,66,i.sub_applid);
      l_any := TRUE;
    END LOOP;
  END LOOP;

  /*do the same for programs with identical sets of temporary records*/
  l_any := TRUE;
  WHILE l_any LOOP
    l_any := FALSE;
    FOR i IN(
      SELECT *
      FROM   gfc_aetemptbl_eq a
      WHERE  a.sub_instances > 0
      AND    NOT EXISTS(
             SELECT 'x'
             FROM   gfc_aetemptbl_eq b
             WHERE  b.sup_applid = a.sub_applid
             AND    b.sub_instances > 0
             AND    ROWNUM = 1)
      ORDER BY 1
    ) LOOP
      UPDATE psaeappldefn x
      SET    temptblinstances = GREATEST(x.temptblinstances
                                        ,i.sub_instances,i.sup_instances)
      ,      version = l_version_aem
      ,      lastupddttm = SYSDATE
      WHERE  ae_applid = i.sup_applid;

      projitem(33,66,i.sup_applid);

      UPDATE psaeappldefn x
      SET    temptblinstances = 0
      ,      version = l_version_aem
      ,      lastupddttm = SYSDATE
      WHERE  ae_applid = i.sub_applid;

      projitem(33,66,i.sub_applid);
      l_any := TRUE;
    END LOOP;
  END LOOP;

  /*add the temporary records for the programs already in the project*/
  INSERT INTO psprojectitem
  (projectname ,objecttype
  ,objectid1 ,objectvalue1 ,objectid2 ,objectvalue2
  ,objectid3 ,objectvalue3 ,objectid4 ,objectvalue4
  ,nodetype ,sourcestatus ,targetstatus ,upgradeaction ,takeaction ,copydone)
  SELECT DISTINCT
         l_projectname,0
  ,      1, recname, 0, ' '
  ,      0, ' ', 0, ' '
  ,      0,0,0,0,1,0
  FROM   psaeappltemptbl t
  ,      psprojectitem i
  WHERE  i.projectname = l_projectname
  AND    i.objecttype = 33
  AND    i.objectid1 = 66
  AND    i.objectvalue1 = t.ae_applid
  AND    NOT EXISTS(
         SELECT 'x'
         FROM   psprojectitem i1
         WHERE  i1.projectname = l_projectname
         AND    i1.objecttype = 0
         AND    i1.objectid1 = 1
         AND    i1.objectvalue1 = t.recname);

  /*create the project header if it does not already exist*/
  BEGIN
    INSERT INTO psprojectdefn
    (projectname,version,projectdescr,tgtservername,tgtdbname
    ,tgtoprid,tgtopracct,comprelease,srccompreldttm,tgtcompreldttm
    ,compreldttm,keeptgt,tgtorientation,comparetype,commitlimit
    ,reportfilter,maintproj,lastupddttm,lastupdoprid,releaselabel
    ,releasedttm,objectownerid,descrlong)
    VALUES
    (l_projectname,l_version_pjm,'Temporary Table Instances',' ',' '
    ,' ',' ',' ',NULL,NULL
    ,NULL,31,0,1,50
    ,16232832,0,SYSDATE,'PS',' '
    ,NULL,' ','Application Engine programs, and related Temporary Records, '
    ||'whose number of temporary table instances have been changed');
  EXCEPTION WHEN dup_val_on_index THEN
    UPDATE psprojectdefn
    SET    version = (SELECT version FROM psversion
                      WHERE  objecttypename = 'PJM')
    ,      lastupddttm = SYSDATE
    WHERE  projectname = l_projectname;
  END;
END;
/

Conclusion

The effect on my demo HCM8.9 system was to reduce the total number of temporary table instances from 5942 to 5106, a 14% reduction. However, when I tried this on an HCM9.0 system, I got a reduction of only 7%, which suggests that PeopleSoft have been more careful about specifying the number of temporary table instances in the later version.
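
A total of that kind can be counted along these lines (a sketch: it assumes each program contributes its instance count for every temporary record it references, and it ignores the global instances defined in PSOPTIONS):

SELECT SUM(a.temptblinstances) total_instances
FROM   psaeappltemptbl t
,      psaeappldefn    a
WHERE  a.ae_applid = t.ae_applid;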

Then you can use the script in an earlier posting to remove the excess tables.

PeopleSoft and the Oracle Recycle Bin

If you are running PeopleSoft on Oracle 10g, what do you do about the Recycle Bin? It is a new feature in Oracle 10g, and it is enabled by default. So you are using it, unless you have taken a decision to the contrary.
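
You can check which way a database is set with either of these (standard commands):

show parameter recyclebin

SELECT value FROM v$parameter WHERE name = 'recyclebin';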

It works just like the Windows recycle bin. You can drop a table and then flash it back (they didn't call it UNDROP because Oracle marketing now calls everything Flashback). So when you drop a table, Oracle merely marks it as dropped and renames it with a system-generated name beginning with BIN$. You can look at the contents of the Recycle Bin through a catalogue view.

>create table t (a number);
>drop table t;
>select * from user_recyclebin

OBJECT_NAME ORIGINAL_NAME OPERATION TYPE
------------------------------ -------------------------------- --------- -------------------------
TS_NAME CREATETIME DROPTIME DROPSCN
------------------------------ ------------------- ------------------- ----------
PARTITION_NAME CAN CAN RELATED BASE_OBJECT PURGE_OBJECT SPACE
-------------------------------- --- --- ---------- ----------- ------------ ----------
BIN$ZCLja8iAA9LgRAAOf+3h5A==$0 T DROP TABLE
PSDEFAULT 2009-03-02:13:35:12 2009-03-02:13:35:14 9.7561E+12
YES YES 776642 776642 776642 4
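
For completeness, this is how the dropped table comes back (standard Flashback Drop syntax):

>flashback table t to before drop;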

If you don't want a table to go into the recycle bin when you drop it, you can code DROP TABLE ... PURGE - but PeopleSoft doesn't do this.

PeopleSoft alter scripts usually drop and recreate tables. In a production system this is often the best option, otherwise you run the risk of causing rows to migrate to other blocks when you add new columns. In one system, I found 17000 objects in the recycle bin, occupying 1.3Gb.

My Opinion

Personally, I would disable the recycle bin by setting the initialisation parameter RECYCLEBIN = OFF in all PeopleSoft environments, with the possible exception of the development environment.

The RECYCLEBIN parameter can also be set dynamically at session or system level. You could perhaps turn it on prior to upgrade/data migration procedures, and then when satisfied turn it off again and manually purge the recycle bin.
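
For example (a sketch; PURGE DBA_RECYCLEBIN requires appropriate privilege):

ALTER SESSION SET recyclebin = OFF;  -- drops in this session bypass the bin
ALTER SESSION SET recyclebin = ON;   -- re-enable it, e.g. for an upgrade window
PURGE RECYCLEBIN;                    -- empty the current schema's recycle bin
PURGE DBA_RECYCLEBIN;                -- empty the recycle bin database-wide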

I think Oracle features should be used knowingly. It doesn't matter whether you decide to use a feature or not. It is important that in making that decision you have thought about how to deal with the implications, rather than be surprised later.


Advert of sorts: Canada, UKOUG, Chile, Peru, Poland...

For all events between now and the end of the year, my speaking schedule is published on asktom.oracle.com. This post highlights the "international" (for me; if you live there, it is local of course...) trips I'll be doing. You can see all of the US-only trips on asktom as well - and there are quite a few (Atlanta, Baton Rouge, Reston VA, Baltimore, San Francisco, Saint Louis, Richmond VA, Birmingham AL - to name a few).

I'm frequently asked "how often am I on the road" - the question behind the question being "how do you get to keep up with the database and stay hands on". I travel exactly 50% of the time - one week on, one week off. I'm getting ready for a week "on" right now. I spend the other 50% working locally (mostly at home) with the database and customers - preparing new material, getting caught up on questions on asktom and the like. From time to time - since they will not move events like Oracle OpenWorld to accommodate my schedule - I travel two weeks and stay off the road for two weeks. One of those "two weeks - no travel" is coming up and I'm very much looking forward to that.

Anyway - to the schedule and links to what's happening...

I'll be in Canada quite a few times this year - three times this week (Quebec, Toronto and Montreal). I'll be back in Canada - a little further west and a lot further north - in November for the ICE conference in Edmonton. I get back to Toronto (for the local user group) again in November as well.

I'll be hitting Chile and Peru in South America in November - again looking forward to that as I've never been down there at all. The user group events are something I really enjoy (more than a seminar - those are very very tiring to deliver. Physically tiring) and getting out to a place I've never been is always interesting.

After that - I'm off to the UKOUG. I've presented there many times now and always look forward to the event in Birmingham. A good time to catch up with a lot of people and much like IOUG in the US - it is a highly technical conference. Every single session is delivered by people using the Oracle software, and mostly delivered by people that use the Oracle software every single day. A lot of hands on knowledge is transferred. Always looking forward to that conference.

My last trip out of the US in 2009 will be to Warsaw, Poland right after the UKOUG. I'll be delivering a two day seminar there. I won't have much time to look around unfortunately as I fly in late Tuesday night from the UK and leave early Friday morning for home. I'll have to have a nice dinner or two out though to get a feel for the city.

Into the first half of 2010 - I'll be hitting places such as Croatia, Tokyo, Moscow, Hungary, two cities in South Africa, the Baltics and Turkey... More details on those to follow...

Forget I/O Bound, You’re Latency Bound, Bub

Since it’s been nearly ten years since I wrote my book, Scaling Oracle8i, I thought it was about time that I started writing again. I thought I would start with the new-fangled blogging thing, and see where it takes me. Here goes.

As some will know, I run a small consulting company called Scale Abilities, based out of the UK. We get involved in all sorts of fun projects and problems (or are they the same thing?), but one area that I seem to find myself focusing on a lot is storage. Specifically, the performance and architecture of storage in Oracle database environments. In fact I’m doing this so much that, whenever I am writing presentations for conferences these days, it always seems to be the dominant subject at the front of my mind.

One particular common thread has been the effect of latency. This isn’t just a storage issue, of course, as I endeavoured to point out in my Hotsos Symposium 2008 presentation “Latency and Skew”. Latency, as the subtitle of that particular talk said, is a silent killer. Silent, in that it often goes undetected, and the effects of it can kill performance (and still remain undetected). I’m not going to go into all the analogies about latency here, but let’s try and put a simple definition out for it:

Latency is the time taken between a request and a response.

If that’s such a simple definition, why is it so difficult to spot? Surely if a long period of time passes between a request and a response, the latency will be simple to nail? No.

The problem is that it is the small latencies that cause the problems. Specifically, it is the “small, but not so small that they are not important” ones that are so difficult to spot and yet cause so many problems. Perhaps an example is now in order:

A couple of years ago, a customer of mine was experiencing a performance problem on their newly virtualised database server (VMware 3.5). The problem statement went a little bit like this: Oracle on VMware is broken – it runs much slower on VMware than on physical servers. The customer was preparing a new physical server in order to remove VMware from the equation. Upon further investigation, I determined the following:

  1. The VMware host (physical server running VMware) was a completely different architecture to the previous dedicated server. The old server was one of the Intel Prescott core type (3.2GHz, Global Warming included at no extra cost), and the new one was one of the Core 2 type with VT instructions.
  2. Most measurable jobs were actually faster on the virtualised platform.
  3. Only one job was slower.

The single job that was slower was so much slower that it overshadowed all the other timings that had improved. Of the four critical batch jobs, the timings looked like this:

Physical server:

  • Job 1: 26s
  • Job 2: 201s
  • Job 3: 457s
  • Job 4: 934s
  • Total: 1618s

Virtualised Server:

  • Job 1: 15s
  • Job 2: 111s
  • Job 3: 208s
  • Job 4: 2820s
  • Total: 3154s

It can be seen that, if one takes the total as the yardstick, the virtualised server is almost twice as slow: Therein lies the danger of using averages and leaping to conclusions. If Job 4 is excluded, the totals are 684s vs 334s, making the virtualised server more than twice as quick as the physical one.

Upon tracing Job 4 on the VMware platform with Oracle extended SQL tracing (10046 level 8), I discovered that it was making a lot of roundtrips to the database. Hang on, let me give that the right emphasis: A LOT of roundtrips. However, each roundtrip was really fast – about 0.2ms if memory serves. So where’s the problem with that? It’s not exactly high latency, is it? Well it is if you have to do several million of them.

As it turns out, there was something in the VMware stack (perhaps just the additional codepath to get through the vSwitch) that was adding around 0.1ms of latency to each roundtrip. When that tenth of a millisecond is multiplied by several million roundtrips (something like 20 million in this case), it becomes a long time – about 2000s, which more than accounts for the extra elapsed time of Job 4. The real answer: do fewer roundtrips, and the new server will be at least twice as fast as the old one.
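
As a minimal sketch of what "fewer roundtrips" means in practice (the table MY_ROWS and its ID column are hypothetical), fetching in batches pays the per-call latency once per thousand rows instead of once per row:

DECLARE
  TYPE t_id_tab IS TABLE OF my_rows.id%TYPE;
  l_ids t_id_tab;
  CURSOR c IS SELECT id FROM my_rows;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_ids LIMIT 1000; /*one call per 1000 rows, not one per row*/
    EXIT WHEN l_ids.COUNT = 0;
    /*process the whole batch here*/
  END LOOP;
  CLOSE c;
END;
/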

So what does this have to do with I/O? Plenty. The simple fact is that roundtrips are a classic source of unwanted latency. Latency is the performance killer.

Let’s look at some I/O examples from a real customer system. Note: this is not a particularly well tuned customer system:

  • Latency of 8KB sequential reads: 0.32ms
  • Latency of 4MB sequential reads: 6ms

Obviously, the 4MB reads are taking a lot longer (18x), but that makes sense, right? Apart from one thing: The 4MB reads are serving 512x more data in each payload. The net result of these numbers is as follows:

  • Time to sequentially read 2.1GB in 8KB pieces: 76s
  • Time to sequentially read 2.1GB in 4MB pieces: 24s

So what happened to the 52s difference in these examples? Was this some kind of tuning problem? Was this a fault on the storage array? No. The 52s of time was lost in latency, nowhere else.

Here’s another definition of latency: Latency is wasted time.

So, look out for that latency. Think about it when selecting nested-loop table joins instead of full-table scans. Think about it when doing single-row processing in your Java app. Think about it before you blame your I/O system!

mock-ups, prototypes and production builds

(I can't talk about planes without going off on tangents, but I've moved them all to the end this time. Read them in whatever order works for your brain - or skip 'em. They're not crucial to the post.)

While I didn't get to work on the SR-71 or the F-117, I had the opportunity to work with men who did. One particular man, Walt Paskowietz, became something of a mentor/champion for me and taught me about

Brain Rules, multitasking and cubicles


One of my favorite books is Brain Rules by John Medina. The following passage is from the book's introduction:

Most of us have no idea how our brain works.

This has strange consequences. We try to talk on our cell phones and drive at the same time, even though it is literally impossible for our brains to multitask when it comes to paying attention. We have created high-stress office environments, even though a stressed brain is significantly less productive. Our schools are designed so that most real learning has to occur at home. This would be funny if it weren’t so harmful. Blame it on the fact that brain scientists rarely have a conversation with teachers and business professionals, education majors and accountants, superintendents and CEOs. Unless you have the Journal of Neuroscience sitting on your coffee table, you’re out of the loop.

This book is meant to get you into the loop.

I've had this book for a while and recently picked it back up to re-read it. Today, this bit of the intro seemed appropriate to discuss here. Brain researchers have proven that it is (as stated above) nearly impossible for our brains to multitask. However, given the way most of us attempt to operate every day, we don't believe or accept this is true.

I was thinking about how I am constantly doing what I believe to be multitasking. For example, I listen to an audiobook during my drive to and from work. Since the book is on my iPhone, I may be interrupted by a call that comes in, so I often talk on the phone while driving. When I get to work, I switch from my audiobook to a playlist of light jazz or something else with few words (I admit that I can't quite handle music with lyrics while trying to work as I always end up singing along and forget what I am doing). I work in a "cubicle village" with about a dozen of my colleagues where we each occupy about a 5x7 foot area with a half height divider wall in between. Every sound is clearly audible throughout the village so I constantly find myself vicariously sucked into other conversations and phone calls. [That's precisely the reason why I tell myself I wear earbuds so I can drown out some of the noise.] Then, at the end of the day, I switch back to my audiobook and drive home.

What I noticed this morning was that when I got to work, I couldn't really remember the drive to get there. I did remember the predicament my audiobook's main character was in when I stopped listening, but the drive was a blur. :)

I then started thinking about my work day. There have been several times when I'll be working on something and a song or a nearby conversation pulls my attention away from what I'm doing and I end up struggling to get back on track after a few seconds (or minutes) of distraction.

In Medina's book introduction he says that we've created high-stress environments and in so doing have made ourselves less productive. I think that's true. How many of you actually have a real office...you know, one with a door and 4 real walls? How many times a day are you distracted or completely interrupted from what you're working on because of your environment? I understand the economics behind cubicles vs offices, but I think it would be fascinating to know what the real cost is in terms of lost productivity.

I don't know that I'll give up my audiobook during my drive (and I may regret it one day if listening to it results in me being unable to avoid an accident), and I'm confident that I'll not rate an office with a door for a long while. But knowing that my brain can't multitask is apparently not going to stop me from trying to emulate the effect.

I'm not a DBA anymore...

After nine years and nine months of running the database that hosts asktom, I've retired.

Not from answering questions, but rather from being the DBA and semi-SA for the machine that was asktom.oracle.com.

Asktom.oracle.com is now synonymous with APEX.oracle.com - Joel Kallman facilitated the move (here is the story on that). So, I'm now running in a database with thousands of other applications. And - it seems a bit faster than my old hardware (which was held together with some glue and duct-tape - the drives were failing left and right...).

For some stories on APEX and the ability to handle a pretty large load - I suggest you glance at:

But now, for the first time in years, I won't be afraid to check asktom when I land and get off of the plane. If something goes wrong - the data center guys will fix it :)

And yes, the intermittent "some IE users get an error on asktom" issue was found and fixed.

On the Importance of Diagnosing Before Resolving

Today a reader posted a question I like at our Method R website. It's about the story I tell in the article called, "Can you explain Method R so even my boss could understand it?" The story is about sending your son on a shopping trip, and it takes him too long to complete the errand. The point is that an excellent way to fix any kind of performance problem is to profile the response time for the properly chosen task, which is the basis for Method R (both the method and the company).

Here is the profile that details where the boy's time went during his errand:

                   --Duration---
Subtask            minutes    % Executions
------------------ ------- ---- ----------
Talk with friends       37  62%          3
Choose item             10  17%          5
Walk to/from store       8  13%          2
Pay cashier              5   8%          1
------------------ ------- ---- ----------
Total                   60 100%

I went on to describe that the big leverage in this profile is the elimination of the subtask called "Talk with friends," which will reduce response time by 62%.
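
Incidentally, a profile like this is just arithmetic over measured subtask timings. As a minimal sketch, assuming a hypothetical table TASK_TIMINGS(SUBTASK, MINUTES, EXECUTIONS) holding the raw measurements:

SELECT   subtask
,        SUM(minutes) minutes
,        ROUND(100*RATIO_TO_REPORT(SUM(minutes)) OVER ()) pct
,        SUM(executions) executions
FROM     task_timings
GROUP BY subtask
ORDER BY 2 DESC;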

The interesting question that a reader posted is this:

Not sure this is always the right approach. For example, let's imagine the son has to pick 50 items:

Talk         3 times  37 minutes
Choose item 50 times  45 minutes
Walk         2 times   8 minutes
Pay          1 time    5 minutes

Working on "choose item" is maybe not the right thing to do...

Let's explore it. Here's what the profile would look like if this were to happen:

        --Duration---
Subtask minutes    % Executions
------- ------- ---- ----------
Choose       45  47%         50
Talk         37  39%          3
Walk          8   8%          2
Pay           5   5%          1
------- ------- ---- ----------
Total        95 100%

The point of the inquiry is this:

The right answer in this case, too, is to begin with eliminating Talk from the profile. That's because, even though it's not ranked at the very top of the profile, Talk is completely unnecessary to the real goal (grocery shopping). It's a time-waster that simply shouldn't be in the profile. At all. But with Cary's method of addressing the profile from the top downward, you would instead focus on the "Choose" line, which is the wrong thing.

In chapters 1 through 4 of our book about Method R for Oracle, I explained the method much more thoroughly than I did in the very brief article. In my brevity, I skipped past an important point. Here's a summary of the Method R steps for diagnosing and resolving performance problems using a profile:

  1. (Diagnosis phase) For each subtask (row in the profile), visiting subtasks in order of descending duration...
    1. Can you eliminate any executions without sacrificing required function?
    2. Can you improve (reduce) individual execution latency?
  2. (Resolution phase) Choose the candidate solution with the best net value (that is, the greatest value of benefit minus cost).

Here's a narrative of executing the steps of the diagnostic phase, one at a time, upon the new profile, which—again—is this:

        --Duration---
Subtask minutes    % Executions
------- ------- ---- ----------
Choose       45  47%         50
Talk         37  39%          3
Walk          8   8%          2
Pay           5   5%          1
------- ------- ---- ----------
Total        95 100%

  1. Execution elimination for the Choose subtask: If you really need all 50 items, then no, you can't eliminate any Choose executions.
  2. Latency optimization for the Choose subtask: Perhaps you could optimize the mean latency (which is .9 minutes per item). My wife does this. For example, she knows better where the items in the store are, so she spends less time searching for them. (I, on the other hand, can get lost in my own shower.) If, for example, you could reduce mean latency to, say, .8 minutes per item by giving your boy a map, then you could save (.9 – .8) × 50 = 5 minutes (5%). (Note that we don't execute the solution yet; we're just diagnosing right now.)
  3. Execution elimination for the Talk subtask: Hmm, seems to me like if your true goal is fast grocery shopping, then you don't need your boy executing any of these 3 Talk events. Proposed time savings: 37 minutes (39%).
  4. Latency optimization for the Talk subtask: Since you can eliminate all Talk calls, no need to bother thinking about latency reduction. ...Unless you're prey to some external constraint (like social advancement, say, in attempt to maximize your probability of having rich and beautiful grandchildren someday), in which case you should think about latency reduction instead of execution elimination.
  5. Execution elimination for the Walk subtask: Well, the boy has to get there, and he has to get back, so this "executions=2" figure looks optimal. (Those Oracle applications we often see that process one row per network I/O call would have 50 Walk executions, one for each Choose call.)
  6. Latency optimization for the Walk subtask: Walking takes 4 minutes each way. Driving might take less time, but then again, it might actually take even more. Will driving introduce new dependent subtasks? Warm Up? Park? De-ice? Even driving doesn't eliminate all the walking... Plus, there's not a lot of leverage in optimizing Walk, because it accounts for only 8% of total response time to begin with, so it's not worth a whole lot of bother trying to shave it down by some marginal proportion, especially since inserting a car into your life (or letting your boy drive yours) is no trivial matter.
  7. Execution elimination for the Pay subtask: The execution count on Pay is already optimized down to the legally required minimum. No real opportunity for improvement here without some kind of radical architecture change.
  8. Latency optimization for the Pay subtask: It takes 5 minutes to Pay? That seems a bit much. So you should look at the payment process. Or should you? Even if you totally eliminate Pay from the profile, it's only going to save 5% of your time. But, if every minute counts, then yes, you look at it. ...Especially if there might be an easy way to improve it. If the benefit comes at practically no cost, then you'll take it, even if the benefit is only small. So, imagine that you find out that the reason Pay was so slow is that it was executed by writing a check, which required waiting for store manager approval. Using cash or a credit/debit card might improve response time by, say, 4 minutes (4%).

Now you're done assessing the effects of (1) execution elimination and (2) latency reduction for each line in the profile. That ends the diagnostic phase of the method. The next step is the resolution phase: to determine which of these candidate solutions is the best. Given the analysis I've walked you through, I'd rank the candidate solutions in this order:

  1. Eliminate all 3 executions of Talk. That'll save 37 minutes (39%), and it's easy to implement; you don't have to buy a car, apply for a credit card, train the boy how to shop faster, or change the architecture of how shopping works. You can simply discard the "requirement" to chat, or you can specify that it be performed only during non-errand time windows.
  2. Optimize Pay latency by using cash or a card, if it's easy enough to give your boy access to cash or a card. That will save 4 minutes, which—by the way—will be a more important proportion of the total errand time after you eliminate all the Talk executions.
  3. Finally, consider optimizing Choose latency. Maybe giving your son a map of the store will help. Maybe you should print your grocery list more neatly so he can read it without having to ask for help. Maybe by simply sending him to the store more often, he'll get faster as his familiarity with the place improves.

That's it.

So the point I want to highlight is this:

I'm not saying you should stick to the top line of your profile until you've absolutely conquered it.

It is important to pass completely through your profile to construct your set of candidate solutions. Then, on a separate pass, you evaluate those candidate solutions to determine which ones you want to implement, and in what order. That first full pass is key. You have to do it for Method R to be reliable for solving any performance problem.

Every day I learn something new...

Two things

One, here is something new I learned today. I wasn't aware of that; it is a nice "stealth" feature :)

The other thing - I have a podcast on 11g Release 2 "things" posted on oracle.com if you are interested....