Oakies Blog Aggregator

SMART goals

I was just reviewing the written goals for a project I've started working on and after looking at a few of them, I was reminded of how important it is to make sure your goals are "SMART".

For example, one of the project goals is: To create a standard framework that supports high levels of service. Hmmm... When I read that goal, I found myself thinking of the concept of SMART goals that I learned a long time ago. Somehow this goal doesn't seem to fit the SMART paradigm.

So, what is a SMART goal?

SMART is just a mnemonic that can help you remember how to effectively formulate a goal. Here's a brief description:

Specific - The What, Why, and How.
What are you going to do? What do you want to accomplish?
Why is it important (reasons, purpose, benefits)?
How will you accomplish the goal?

Measurable - Concrete criteria for measuring progress.
If you can measure properly, you can see change occur as you progress toward the goal. Ask questions like "how much?" or "how many?".

Attainable - Takes enough effort to be a bit of a stretch, but not so far out of reach that it is nearly impossible to complete.

Relevant - "Do-able", not "easy". The skills are available, the learning curve is not vertical, and the goal is within the bounds of the overall plan.

Time-Bound - A clear target date or time period to work towards. Without the time element, the commitment to begin action is often too vague.

I think the project goals I've been reviewing have most of the details that would cover the SMART elements elsewhere in the project docs, but I'm glad for the opportunity to review this method for stating goals.

Latency Hiding For Fun and Profit

Yep, another post with the word ‘latency’ written all over it.

I’ve talked a lot about latency, and how it is more often than not completely immutable. So, if the latency cannot be improved upon because of some pesky law of physics, what can be done to reduce that wasted time? Just three things, actually:

  1. Don’t do it.
  2. Do it less often.
  3. Be productive with the otherwise wasted time.

The first option is constantly overlooked – do you really need to be doing this task that makes you wait around? The second option is the classic ‘do things in bigger lumps between the latency’ – making fewer roundtrips being the classic example. This post is about the third option, which is technically referred to as latency hiding.

Everybody knows what latency hiding is, but most don’t realise it. Here’s a classic example:

I need some salad to go with the chicken I am about to roast. Do I:

(a) go to the supermarket immediately and buy the salad, then worry about cooking the chicken?

OR

(b) get the chicken in the oven right away, then go to the supermarket?

Unless the time required to buy the salad is much longer than the chicken’s cook-time, the answer is always going to be (b), right? That’s latency hiding, also known as Asynchronous Processing. Let’s look at the numbers:

Variable definitions:

Supermarket Trip = 1800s

Chicken Cook-Time = 4800s

Calculations:

Option (a) = 1800s + 4800s = 6600s (oh man, nearly two hours until dinner!)

Option (b) = 4800s (with 1800s supermarket time hidden within it)
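The arithmetic above is just sum-versus-max, which a few lines of Python make explicit (a sketch only; the variable names are mine, not from any real system):

```python
# Times in seconds, from the chicken-and-salad example above.
SUPERMARKET_TRIP = 1800
CHICKEN_COOK_TIME = 4800

# Option (a): serial execution - the total is the sum of both tasks.
option_a = SUPERMARKET_TRIP + CHICKEN_COOK_TIME

# Option (b): overlapped execution - the total is the longer task,
# with the shorter one hidden inside it.
option_b = max(SUPERMARKET_TRIP, CHICKEN_COOK_TIME)

hidden = option_a - option_b  # latency successfully hidden
print(option_a, option_b, hidden)  # 6600 4800 1800
```

The saving is exactly the shorter task's duration, which is why latency hiding only pays while the hidden task fits inside the longer one.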

Here’s another example: You have a big code compile to do, and an empty stomach to fill. In which order do you execute those tasks? Hit ‘make’, then grab a sandwich, right?

As a side note, this is one of my classic character flaws – I just live for having tasks running in parallel this way. Not a flaw, I hear you say? Anyone who has tried to exploit software parallelism (such as Oracle Parallel Execution) knows the problem – some tasks finish quicker than expected, and then there’s a bunch of idle threads. In the real world, this means that my lunch is often very delayed, much to the chagrin of my lunch buddies.

OK, so how does this kind of thing work with software? Let’s look at a couple of examples:

  1. Read Ahead
  2. Async Writes

Read ahead from physical disk is the most common example of (1), but the same idea applies to result prefetching in, say, AJAX applications. Whatever the specific type, it capitalises on parallel processing by two resources. Let’s look at the disk example for clarification.

Disk read ahead is where additional, unrequested reads are carried out after an initial batch of real, requested reads. So, if a batch job makes a read request for blocks 1, 2, 3 and 4 of a file, “the disk” returns those blocks and then immediately goes on to read blocks 5, 6, 7 and 8, keeping them in cache. If blocks 5, 6, 7 and 8 are subsequently requested by the batch job once the first blocks are processed, they can be returned immediately from cache. This hides the latency of those reads from the batch job and increases throughput as a direct result.
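The read-ahead pattern can be sketched in Python with a background thread filling a bounded cache while the consumer works; everything here (the `read_with_prefetch` name, the `depth` window) is an illustration of the idea, not any real disk driver:

```python
import queue
import threading

def read_with_prefetch(read_block, block_ids, depth=4):
    """Yield blocks while a background thread reads ahead of the consumer.

    read_block performs the (slow) physical read; depth is how many
    unrequested blocks may sit in cache, like a disk read-ahead window.
    """
    cache = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the block stream

    def prefetcher():
        for block_id in block_ids:
            cache.put(read_block(block_id))  # runs ahead while the caller processes
        cache.put(done)

    threading.Thread(target=prefetcher, daemon=True).start()
    while True:
        block = cache.get()  # usually satisfied from cache: latency hidden
        if block is done:
            return
        yield block
```

While the consumer is busy processing block N, the prefetcher is already reading block N+1, so the consumer rarely waits on the physical read.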

Async writes are essentially the exact opposite of read-ahead. Let’s take the well-known Oracle example of async writes, that of the DBWR process flushing out dirty buffers to disk. The synchronous way to do this is to generate a list of dirty buffers and then issue a series of synchronous writes (one after the other) until they are all complete. Then start again by looking for more dirty buffers. The async I/O way to do the same operation is to generate the list, issue an async write request (which returns instantly), and immediately start looking for more dirty buffers. This way, the DBWR process can spend more useful time looking for buffers – the latency is hidden, assuming the storage can keep up.

By the way, the other horror of the synchronous write is that there is no way that the I/O queues can be pushed hard enough for efficient I/O when sending out a single buffer at a time. Async writes remedy that problem.
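The contrast between the two DBWR flush styles can be sketched with a thread pool standing in for the operating system's async I/O layer; `flush_sync` and `flush_async` are hypothetical names for illustration, not Oracle internals:

```python
from concurrent.futures import ThreadPoolExecutor

def flush_sync(write, dirty_buffers):
    """Synchronous flush: each write must complete before the next starts."""
    for buf in dirty_buffers:
        write(buf)  # the writer sits idle for every write's full latency

def flush_async(write, dirty_buffers, pool):
    """Async-style flush: submit every write, then return immediately.

    The caller can go and look for more dirty buffers straight away,
    reaping the completions (futures) later, as with real async I/O.
    """
    return [pool.submit(write, buf) for buf in dirty_buffers]
```

The async version also hands the I/O layer the whole batch at once, so the queues can be driven hard instead of receiving one buffer at a time.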

I’ve left a lot of the technical detail out of that last example, such as the reaping of return results from the async I/O process, but didn’t want to cloud the issue. Oops, I guess I just clouded the issue, just ignore that last sentence…

4th Planboard DBA Symposium: Registration now open

On November 17 Planboard will run its 4th Dutch DBA Symposium, and registration is now open. This “for Dutch DBAs, by Dutch DBAs” symposium has become the place to be for the serious DBA who wants to share his or her knowledge with other DBAs in an open environment with plenty of networking time [...]

nonsense correlation

I was doing a little light reading on my Saturday night in my Oxford Dictionary of Statistics by Graham Upton and Ian Cook and came across this definition: nonsense correlation: A term used to describe a situation where two variables (X and Y, say) are correlated without being causally related to one another. The usual explanation is that they are both related to a third variable, Z. Often the

Spotting the Red Flags (Part 1 of n)

As a consultant I get to see many different systems, managed by different people. A large proportion of these systems are broken in exactly the same ways as others, which makes spotting the problems pretty straightforward! It occurred to me at some point that this is a bit like a crime scene – somebody had murdered the system, and it was my job to find out ‘whodunnit’, or at least ‘whatdunnit’. The ‘what’ is quite frequently one (or a few) actions performed by the administrator, normally with good intention, that result in system performance or availability carnage. I call these actions ‘Red Flags’, and spotting them can save a lot of time.

A couple of years ago at the Hotsos conference, I did a small impromptu survey. It’s not all that often that you have 500 Oracle geeks in a room that you can solicit opinion from, so it seemed like a good chance to cast the net for Red Flags. I seeded the process with a few of my personal favourites and then started writing as the suggestions came in. Here are the ‘seed’ entries:

  • Any non-default setting for optimizer_index_cost_adj
  • Carefully configured KEEP and RECYCLE pools
  • Multiple block sizes
  • Some kind of spin_count configuration

Remember, these are not necessarily the wrong thing to do to your system, but they probably are. That’s why they attract my attention in the first instance.

I got a whole load of responses. Of those responses, some had missed the point and just identified something that was broken. This is more subtle than that – these are things that are forensic evidence of system homicide. Some of the decent ones I got back follow:

  • /*+ RULE */ hint in abundance (I know, lots of apps do this, and it sometimes makes sense)
  • Numeric data in VARCHAR2 columns
  • Indexes and tables on ‘separate disks’
  • All columns in a table NULLable

Of course, there were more cynical responses too, such as  (he he): cluster_database=true

I was going to write some kind of presentation about it, but I think this might work nicely as a blog entry, potentially multiple entries! If anyone has some good suggestions, please let me know and I’ll start adding them into part 2, along with any more I think of… :)

The amazing truth about the Red Flags, though, is that they are incredibly hard to dislodge. Once optimizer_index_cost_adj is set to anything but default, it is nigh on impossible to convince anyone that the system is probably now tuned on its head. Same with a multiple buffer pool configuration, and same with multiple block sizes.

The best flowchart ever

This was forwarded to me and I totally laughed out loud. I've seen a lot of flow charts (mostly of the boring technical sort), but this is my absolute all-time favorite! Zoom in on it or visit the originating site to see it better.

It's a flow chart of the song "Total Eclipse of the Heart".


Happy Birthday, OFA Standard

Yesterday I finally posted to the Method R website some of the papers I wrote while I was at Oracle Corporation (1989–1999). You can now find these papers where I am.

When I was uploading my OFA Standard paper, I noticed that today—24 September 2009—is the fourteenth birthday of its publication date. So, even though the original OFA paper was around for a few years before 1995, please join me in celebrating the birthday of the final version of the official OFA Standard document.

Resumes, interviews and truth in advertising

OK...what's the deal with resumes that say one thing (so that a candidate looks nearly perfect) and then you interview them and find out they can barely spell Oracle? I'd think that if your resume says you've been working with Oracle since 1988 and have worked extensively with PL/SQL, you'd know what a REF CURSOR is, or maybe even know a bit about collections or something, right? I asked one candidate these questions and they said they'd never run into those 'features'. So finally, just for fun, I asked "On a scale of 1-10, with 10 being expert, where would you rank yourself in terms of your PL/SQL proficiency?" The answer: "Well, I suppose it's a bit conceited to give yourself a 10, so I'll just be humble and say 9."

Are you kidding me? Really?

Personally, I think my resume is lacking overall. I seem to have a hard time distilling over 20 years of experience into a couple of pages and making the "real me" show up on paper. But, if you get me into the ballpark (i.e. an interview), I'll hit most every pitch you throw at me. But, I'd be terrified to try to over-sell myself and get caught unable to deliver the goods. So, it's just a bit scary for me to look at resumes and think "Wow!" and then talk to the person for 3 minutes and think "Yuck!"

At what point did this become the norm and not the exception? Or, am I just in the midst of some weirdly skewed tilt of the interview universe? It almost reminds me of my reaction when reading an ad that claims "World's Best" or "Indescribably Perfect" or some other line and knowing it's just a ploy. I'm not a fan of those that skirt the edges of "truth in advertising". But when it comes to selling yourself, I'd really hope the claims you make could be backed up. Sigh...

Here's hoping that the rest of this week's interviews are with folks who match their advertising.

Dynamically Switching PeopleSoft Temporary Records between Global Temporary and Normal Tables during Application Engine Programs

I am working on a Time & Labour system. We run the main T&L process in different modes. There is a company wide overnight batch process, but individual parts of the company can run the process for their section during the day to process exceptions and to generate schedules. This is done with a custom AE program that calls the delivered TL_TIMEADMIN.

Running on Oracle 10gR2, we have faced performance problems caused by contention between concurrent truncate operations in these processes (see Factors Affecting Performance of Concurrent Truncate of Working Storage Tables).

One way to evade this problem is to convert the working storage tables to Global Temporary Tables (GTTs). The contention problem caused by serialisation of truncate operations does not occur with GTTs. Last year I wrote a blog entry (Global Temporary Tables and PeopleSoft Temporary Records) that discussed when it is possible to use GTTs in Application Engine (AE) programs. Essentially, you can only use GTTs when restart is disabled for all AE programs that reference a temporary record.

So we could consider running the small but frequent daytime processes with restart disabled. Then we can make the non-shared instances of the temporary records into Global Temporary tables for the non-restartable daytime processes. However, we want to continue to run the overnight processes with restart enabled so that we have the option to restart a process that fails overnight from where it crashed, rather than going back to the beginning. The same non-shared instances may need to be Global Temporary during the day but normal recoverable tables by night.

Previously, I have only converted tables to Global Temporary where they are not referenced by a restartable AE program. I have now devised a way of switching a table to being a GTT if it is allocated to a process for which restart is disabled, and switching it back to a normal table if not. This is the best of both worlds.

When a non-shared instance of a temporary record is allocated to a particular process instance, a row is inserted into PS_AETEMPTBLMGR. The value of the restart disable flag is also stored in PS_AETEMPTBLMGR. I have created a trigger that switches a non-shared instance of a temporary record from a normal table to a GTT if restart is disabled, or switches it back to a normal table if restart is enabled. The trigger will create the GTT and any indexes on it if necessary, and will rename the tables as necessary so that the correct version has the default name expected by the process.

So if, for example, I have instance number 11 of a record called WRK_SHIFT_TAO, then the table will be called PS_WRK_SHIFT_TAO11. If that is allocated to a non-restartable AE program, the trigger will check for a table called GT_WRK_SHIFT_TAO11. If it doesn't exist, the trigger will create it, dynamically generating the DDL with the dbms_metadata package. Thus, the structure of the GTT will be identical to PS_WRK_SHIFT_TAO11, except that the leading PS is changed to GT. The same indexes will also be created. The original normal table PS_WRK_SHIFT_TAO11 will be renamed to XX_WRK_SHIFT_TAO11, and then the GTT, at that point called GT_WRK_SHIFT_TAO11, will be renamed to PS_WRK_SHIFT_TAO11. The indexes are not renamed. If later the same instance is allocated to a restartable process, the change will be reversed by renaming again. The GTT will not be dropped, so that it does not need to be recreated the next time a non-restartable program uses it.
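The rename choreography can be summarised as a small decision function; this is a sketch of the logic only, with illustrative pseudo-DDL strings and a hypothetical `target_names` helper, not the actual database trigger:

```python
def target_names(record, instance, restart_disabled, gtt_exists):
    """Return the swap steps for one non-shared temporary-record instance.

    PS_<record><n> is the default name the process expects, GT_ is the
    Global Temporary copy, and XX_ parks the normal (recoverable) table.
    """
    ps = f"PS_{record}{instance}"
    gt = f"GT_{record}{instance}"
    xx = f"XX_{record}{instance}"
    steps = []
    if restart_disabled:  # non-restartable process: the GTT takes the default name
        if not gtt_exists:
            steps.append(f"CREATE GTT {gt} LIKE {ps}")  # DDL via dbms_metadata
        steps.append(f"RENAME {ps} TO {xx}")  # park the normal table
        steps.append(f"RENAME {gt} TO {ps}")  # GTT now has the expected name
    else:  # restartable process: put the recoverable table back
        steps.append(f"RENAME {ps} TO {gt}")  # keep the GTT for next time
        steps.append(f"RENAME {xx} TO {ps}")  # restore the normal table
    return steps
```

For the WRK_SHIFT_TAO example, `target_names("WRK_SHIFT_TAO", 11, True, False)` yields the create-then-double-rename sequence described above.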

All DDL generated and executed by the trigger is written to the Message Log for that process (click on the screen shot to enlarge).

The shared instance of a record (the one without an instance number suffix) is never rebuilt as a Global Temporary table because it is possible that a restartable and a non-restartable process might both use the shared instance at the same time.

One complication is how to handle changes to the temporary records. Application Designer will only create normal tables. So, if the table is to be rebuilt, then it needs to be switched back to a normal table, and the corresponding GTT created by the trigger should be dropped. Application Designer can then recreate the table in the usual way. The next time the non-restartable AE runs, it will recreate the GTT with the new structure.

An updated version of T_LOCK is available which handles PeopleSoft temporary records and prevents DDL on PS_ tables with corresponding GT_ tables. The commands which swap these tables back and forth are explicitly permitted by the new version of this trigger.
