Oakies Blog Aggregator

#DOAG2014 bits of info for DBAs

Just wanted to share some pieces of information from the recent DOAG annual conference that you may find interesting.

From Mike Dietrich’s presentation about Database Upgrade:

Mike Dietrich

Database Replay is extremely useful to predict after-upgrade performance on a test system,

especially as we can record the production load even on 10g.

From Carsten Czarski’s talk about XML DB:

Carsten Czarski

With 12c, XML DB is mandatory and it provides an easy way to upload BLOBs via ftp into the database.

From Ulrike Schwinn’s talk about the Resource Manager I took away that

The resource manager becomes more and more popular and important, especially for Multitenant

- something Hans Forbrich reinforced later on.

I particularly liked the way she presented later on about ADO: very many live demonstrations – that’s how I try to do my own presentations too :-)

Ulrike Schwinn

Frank Schneede did a great job debunking Exadata myths. For example,

You don’t need to have all cores enabled with Exadata X4 in order to save license cost. That’s called Capacity on Demand.

Frank Schneede

If I had to name the one presentation that was most useful for me, it would probably be Frank’s.

Markus Michalewicz delivered an excellent talk as expected about RAC cache fusion:

Markus Michalewicz

Two important messages:

RAC scales well (far) beyond three nodes because there are never more than three nodes involved for cache fusion intercommunication.

And Multitenant and RAC are a perfect fit.

One Data Guard snippet out of Larry Carpenter’s talk about Global Data Services (GDS):

GDS makes it possible to automate the failover of the Real-Time Query service to the primary in case the physical standby has an outage.

Larry Carpenter

Hans Forbrich talked about Multitenant. He showed great presentation skills and although I knew the technical details before, the way he highlighted certain aspects was still very helpful for me.

Hans Forbrich

One key message was that

Multitenant is here to stay.

DBAs should learn about it and become familiar with it as soon as possible,

because sooner rather than later it will have to be administered in production!

Tagged: #DOAG2014

Parallel Costs

While creating a POC of a SQL rewrite recently I received a little surprise as I switched my query from serial execution to parallel execution and saw the optimizer’s estimated cost increase dramatically. I’ll explain why in a moment, but it made me think it might be worth setting up a very simple demonstration of the anomaly. I created a table t1 by copying view all_source – which happened to give me a table with about 100,000 rows and 1117 blocks – and then ran the query ‘select max(line) from t1;’, repeating the query with a /*+ parallel(t1 2) */ hint. Here are the two execution plans I got:

| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT   |      |     1 |     4 |   153   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 99173 |   387K|   153   (4)| 00:00:01 |

| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
|   0 | SELECT STATEMENT       |          |     1 |     4 |  1010   (1)| 00:00:05 |        |      |            |
|   1 |  SORT AGGREGATE        |          |     1 |     4 |            |          |        |      |            |
|   2 |   PX COORDINATOR       |          |       |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM) | :TQ10000 |     1 |     4 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE     |          |     1 |     4 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR |          | 99173 |   387K|  1010   (1)| 00:00:05 |  Q1,00 | PCWC |            |
|   6 |       TABLE ACCESS FULL| T1       | 99173 |   387K|  1010   (1)| 00:00:05 |  Q1,00 | PCWP |            |

It seems that the optimizer thinks that running the query parallel 2 will take five times as long as running it serially! Is it, perhaps, some cunning fix to the optimizer that is trying to allow for process startup time for the parallel execution slaves? Is it a bug? No – it’s just that I hadn’t got around to setting my system stats and they were complete garbage thanks to various other tests I had been running over the previous couple of weeks. Critically, I had not adjusted the “parallel-oriented” statistics to be consistent with the “serial-oriented” stats.

Here, from the 10053 trace file for the parallel run, is the section on the Single Table Access costing, together with the system stats and base statistical information:

  Using WORKLOAD Stats
  CPUSPEED: 1110 millions instructions/sec
  SREADTIM: 4.540000 milliseconds
  MREADTIM: 12.440000 millisecons
  MBRC:     21 blocks
  MAXTHR:   3000000 bytes/sec
  SLAVETHR: 1000000 bytes/sec

Table Stats::
  Table: T1  Alias: T1
    #Rows: 99173  #Blks:  1117  AvgRowLen:  76.00  ChainCnt:  0.00
Access path analysis for T1
  Single Table Cardinality Estimation for T1[T1]
  Table: T1  Alias: T1
    Card: Original: 99173.000000  Rounded: 99173  Computed: 99173.00  Non Adjusted: 99173.00
  Access Path: TableScan
    Cost:  151.13  Resp: 1010.06  Degree: 0
      Cost_io: 147.00  Cost_cpu: 20826330
      Resp_io: 1007.76  Resp_cpu: 11570183

I’m going to walk through the optimizer’s calculations that got the serial I/O cost (cost_io: 147.00) and the parallel I/O cost (Resp_io: 1007.76), but before I do that I’ll point out how inconsistent some of the system stats are. The multiblock read time (mreadtim) is 12.44 milliseconds, to read an average of 21 blocks (mbrc) which, converting to bytes per second, means (21 * 8192) * 1000/12.44 = 13,828,938 bytes per second; but the I/O rate for a single parallel execution slave (slavethr) is only 1,000,000 bytes per second – which means a single serial session can (apparently) tablescan nearly 14 times as fast as an individual parallel execution slave. It’s not surprising that the optimizer thinks a serial tablescan will be faster than parallel 2 – but let’s check exactly how the arithmetic falls out.


The serial calculation:

  • #Blks: 1117, MBRC = 21 => read requests = 1117/21 = 53.19
  • sreadtim = 4.54 milliseconds, mreadtim = 12.44 milliseconds = 2.74 * sreadtim
  • Cost = 53.19 * 2.74 + 1 (_tablescan_cost_plus_one = true) = 146.74 – which looks close enough.

The parallel calculation:

  • #Blks: 1117, block size = 8KB => total I/O requirement = 9,150,464 bytes
  • slavethr: 1,000,000 bytes/sec, degree 2 => total throughput 2,000,000 bytes/sec => elapsed I/O time = 4.575232 seconds
  • sreadtim = 4.54 milliseconds => cost = 4.575232 / 0.00454 = 1007.76 QED.

Two final thoughts:

First, if I increase the degree of parallelism to 3 the cost drops to 673 (671.84 plus a bit of CPU); if I increase the degree any further the cost doesn’t drop any further – that’s because the maxthr (maximum throughput) is 3,000,000. The optimizer uses the smaller of maxthr and (degree * slavethr) in the parallel arithmetic.
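The arithmetic above is easy to reproduce with a few lines of code – here’s a sketch in Python, used purely as a calculator. The inputs are the system stats and table stats quoted from the 10053 trace; this is only the back-of-envelope arithmetic, not a model of the optimizer’s actual costing code:

```python
# Reproducing the optimizer's I/O cost arithmetic from the 10053 trace.
# Inputs are the system stats and table stats quoted above; this is only
# the back-of-envelope arithmetic, not the optimizer's actual code.

BLOCKS     = 1117         # #Blks from the table stats
BLOCK_SIZE = 8192         # bytes per block
MBRC       = 21           # blocks per multiblock read
SREADTIM   = 4.54         # single block read time, milliseconds
MREADTIM   = 12.44        # multiblock read time, milliseconds
MAXTHR     = 3_000_000    # maximum total throughput, bytes/sec
SLAVETHR   = 1_000_000    # throughput per PX slave, bytes/sec

def serial_io_cost():
    # cost = (multiblock read requests) * (mreadtim / sreadtim) + 1
    # the "+ 1" comes from _tablescan_cost_plus_one = true
    return (BLOCKS / MBRC) * (MREADTIM / SREADTIM) + 1

def parallel_io_cost(degree):
    # total throughput is capped at maxthr, however high the degree
    throughput = min(MAXTHR, degree * SLAVETHR)
    elapsed = BLOCKS * BLOCK_SIZE / throughput    # seconds
    return elapsed / (SREADTIM / 1000)            # cost unit = one sreadtim

# implied scan rate of a serial session, for comparison with slavethr
serial_rate = MBRC * BLOCK_SIZE * 1000 / MREADTIM   # ~13.8M bytes/sec

print(round(serial_io_cost(), 2))       # ~146.75 (cost_io 147 after rounding)
print(round(parallel_io_cost(2), 2))    # 1007.76 (Resp_io)
print(round(parallel_io_cost(3), 2))    # ~671.84
print(round(parallel_io_cost(4), 2))    # unchanged - capped by maxthr
```

The degree 4 figure shows the maxthr cap kicking in: beyond degree 3 the estimated elapsed I/O time, and hence the cost, stops dropping.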

Secondly, and most importantly, the way the optimizer behaves will be affected by the various parameters relating to parallelism, so here’s a list of the relevant settings on the instance I was using when creating this example:

SQL> show parameter parallel

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
fast_start_parallel_rollback         string      LOW
parallel_adaptive_multi_user         boolean     TRUE
parallel_automatic_tuning            boolean     FALSE
parallel_degree_limit                string      CPU
parallel_degree_policy               string      MANUAL
parallel_execution_message_size      integer     16384
parallel_force_local                 boolean     FALSE
parallel_instance_group              string
parallel_io_cap_enabled              boolean     FALSE
parallel_max_servers                 integer     80
parallel_min_percent                 integer     0
parallel_min_servers                 integer     0
parallel_min_time_threshold          string      AUTO
parallel_server                      boolean     FALSE
parallel_server_instances            integer     1
parallel_servers_target              integer     32
parallel_threads_per_cpu             integer     2
recovery_parallelism                 integer     0

Note, particularly, that I have not enabled parallel_automatic_tuning.

For further details on parallel costs and the cost/time equivalence that the optimizer uses in recent versions, here’s a link to an article by Randolf Geist.

Why is “Dave Unknown” Trying to Social Media With Me?

I know some people share my opinion on this and others totally disagree – but I fail to appreciate why people I have never met, spoken with or care about want to Social Media with me. If we have not met but there is a high probability we share unusual interests then OK, perhaps – but the fact that we both can spell Oracle or know what a gene is does not count as unusual shared interests. Maybe I am just too old to “get it” or just too grumpy to appreciate their efforts.

I’m not the biggest fan of Social Media but I don’t actively dislike it either. I mean, I’m blogging so that means I have some appreciation for it. I have a Twitter account and sometimes I Twit. But not a lot. I don’t have time or inclination to log on every day and see what people have seen that they think is funny/odd/outrageous/titillating on the web, which airport they are currently being bored in or what publication/talk/blog post of theirs they want to big up. Or what cereal they have just eaten {really? Some of you think this would interest anyone?} But occasionally I hang out there and swap twit twaddle and follow links and maybe even put up my own links to my fabulous blog utterings. But I don’t follow people I don’t in some way know or have a reason to be interested in {and I don’t include seeing them on TV as my being interested in them – I followed a couple of people on twitter early on that I thought would be interesting, based on their Popular Culture output. And very quickly decided I’d stand a better chance of continuing to like them if I was not being informed of all the dross that crossed their minds when they had not rehearsed their material}.

For me, the main Social Media thing that baffles and slightly annoys me is LinkedIn Wannabes. Why are you contacting me if I don’t know you and you don’t know me? I don’t know 7.039 billion people. OK, you know some Oracle – so do probably 0.7039 million people (wow, what a worrying thought) that I also don’t know. It’s not personal that I have no interest in being LinkedIn with you, it’s the opposite. I impersonally don’t feel a need to link with you.

Do I want to link in with Dave in Denver CO, USA who is a Java developer? I’ve nothing against you, Dave, but I’m highly unlikely to meet you and we probably have little to talk about, especially as I almost never communicate with people via LinkedIn {and I don’t know anyone who does really communicate via LinkedIn}. I struggle to keep up with people I have met in the flesh or I absolutely know I have shared interests with, so random LinkedIn Wannabes, no chance. If I met you in person I’d probably like to have a chat and I might even buy you a beer, and perhaps we would become friends and I’d welcome your LinkedIn invite with open keyboard. But frankly, until you’re drinking that Carlsberg I just got from the bar for you, you are one in 7.039 billion unknown people to me.

So am I being unfriendly? Well, when I get a LinkedIn request I almost always check out the person. Is it someone I have worked with or met at a conference, with whom it might be nice to maintain some sort of vague contact? Occasionally it is. Once in a blue moon it turns out to be someone I actually know (or know of) quite well and I feel daft that I did not recognise them. Sometimes it is someone I don’t know but they know 15 people I do (hopefully mostly the ones I like :-) ) and I can see they share strong work interests with me. I link in. But most of the time I don’t know them and *they have over 500 contacts*.

Over 500 contacts? Really? Really? And you know all these people? No, you don’t Dave. You are just collecting stamps. I’m as important to you as that. So now, though I know nothing much about you, I know I am unimportant to you, I’m just a stamp. I definitely do NOT want to be LinkedIn with you.

Occasionally it is worse. I’m not a stamp, I’m a little bit of potential collateral, a maybe-bit-of-income for them. The person is a recruitment consultant or a salesperson or a company representative who has figured out that for every 200 people they bother they get a lead. So they contact thousands of us. Well, you can really stuff your invite up where the sun does not shine.

But most of the time it is stamp collecting. This seems very common with our South Asian friends. I don’t know why, maybe it is a cultural thing, maybe the universities there tell their students that this is a good way to progress (I can’t see that it is but I’m happy to be corrected if I am wrong), I don’t know – but 75% of LinkedIn invites I get from people with 500+ contacts are from that part of the world.

I’ve noticed one key thing about LinkedIn stamp collecting (or potential-collateral) invites – none of them have bothered to change the standard invite text.

Hi M

I’d like to add you to my professional network on LinkedIn

- Dave Unknown

Hint – if you really want to link with me, change the text to something, anything and I mean *anything* else. Try

Oi, Martin

I’ve met you and you smell of fish and your jokes are pathetic. Link in to me else I will throw things at you next time you present

- Dave Unknown

That’ll get my attention.

What kicked off this diatribe? It was when we got the invite below at work:


It really tickled me. This person is so desperately stamp collecting that they are trying to link to Admin in Technical Services. Of course I removed names to protect the guilty but, really, Ramzan “the import/export professional” – I think you should take a bit more care in your stamp collecting efforts.

Conference Organisation from the Inside – UKOUG Tech14

An interesting experience I have had this year is being more involved in helping organise the annual UKOUG Oracle Technical Conference – Tech14. I fully intended to blog about things as we progressed, but it never got going, so I did not. But I thought it would be interesting to do a couple of blogs about it now, for anyone interested, as the conference itself approaches.

If you have never helped organise a conference or user group meeting then you probably think there is not a lot of work involved. You would be quite wrong. If you have been a volunteer at one, as in you have presented or chaired sessions, then you will have more understanding – but still probably fall short of the mark in estimating the effort involved. There is a lot involved.

The UKOUG is, I think, the largest Oracle User Group in the world and the annual conference has grown significantly since I first got involved around the turn of the millennium {which is now quite a while back – yes, we are all getting quite old}. In fact, it is now a set of conferences and events dedicated to Oracle E-Business Suite, JD Edwards, PeopleSoft, Hyperion and regional conferences for Ireland and Scotland (sorry Wales), as well as the annual technical event that used to be the single conference. This year Tech14 is in the same location as Apps14, which covers most of the application areas I just mentioned. I rather like the fact that we are returning to the same place but still have two events, as it matches the reality of the two groups. There is a lot of cross-over between apps and tech for some of us, whereas for many, you belong in one camp or the other. It’s a bit like whether you like football or rugby…

So where did I fit into the picture? Each year the UKOUG approaches some of its volunteers and asks them if they would mind giving a little bit of help with the conference that year. Any that do not run away quickly are corralled into a room at head office in Wimbledon and bribed with tea, coffee and biscuits. We are arranged into content committees for various areas. I was part of the committee for the Database stream and ended up being the Chair. This does not make me any more significant; it just means that if someone has to make a decision when the committee is split, or they just want a quick answer to a question (such as “can Dave swap his presentation slot with Senthil’s”), then it will be me the office contacts. OK, I suppose it means I have a little more input, but as everything is open, others on the database committee (or elsewhere) can cry foul.

There are also committees for Middleware, Development, OS & Engineered systems, Business analytics… I am sure I have forgotten one! In many ways the Database stream is easiest as I do not think it has as broad a remit as, for example, development, and the core database is the core database. But we also have the largest community and thus the largest number of papers put forward and streams to organise.

So what do the committees do? Our responsibility is primarily to agree on the technical content of our streams, i.e. what presentations go into them, the order of them, to plan any threads or themes to run through a day or several days, and to ensure that at any given time there are talks, roundtables and workshops across a spectrum of topics – not four all on backups or ADF. Sounds easy? No, it’s not. I’ll go into why in a later post.

We also help with decisions about wider issues for the conference – when the keynotes occur, who to ask to do the keynotes, the evening events and some wider issues like that. However, the actual location and timing of the event is set in stone before we get involved – it has to be, as those major decisions have to be made over a year in advance. Personally, I think the venue at the Liverpool ACC is a good one. I can understand some people feeling Liverpool is a bit far to go, but in reality it only takes an hour or two longer to get there than to what was the traditional home of the conference in Birmingham. And frankly, I was tired of Birmingham and the usual pub I ended up in was getting truly ratty and unpleasant. The ACC is at Albert Dock and a lot of bars, restaurants and, I suspect, nightclubs (for those who like loud music and terrible lager at a premium price) are in the area.

Most of the work planning the actual conference is of course done by the office staff, and I know that for smaller user groups all the work is done by volunteers – I’ve done a couple myself too – so some of you might think we volunteers for the UKOUG conference have it a bit easy. But the conference is massive and we do {most of us} have proper jobs too. So if something is not as you would like at the UKOUG conference, or in fact at any conference, it is probably not through lack of effort. Just let us know {nicely, please} and we will try not to make the same mistake next time.

Quantum Data

That’s data that isn’t there until you look for it, sort of, from the optimizer’s perspective.

Here’s some code to create a sample data set:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum					id,
	mod(rownum-1,200)			mod_200,
	mod(rownum-1,10000)			mod_10000,
	lpad(rownum,50)				padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1e6
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt 	 => 'for all columns size 1'
	);
end;
/
Now derive the execution plans for a couple of queries noting, particularly, that we are using queries that are NOT CONSISTENT with the current state of the data (or more importantly the statistics about the data) – we’re querying outside the known range.

select * from t1 where mod_200  = 300;
select * from t1 where mod_200 >= 300;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=300)

SQL> select * from t1 where mod_200 >=300;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |  2462 |   151K|  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  2462 |   151K|  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200">=300)

The predicted cardinality for mod_200 = 300 is the same as that for mod_200 >= 300. So, to be self-consistent, the optimizer really ought to report no rows (or a token 1 row) for any value of mod_200 greater than 300 – but it doesn’t.

SQL> select * from t1 where mod_200 = 350;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |  1206 | 75978 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  1206 | 75978 |  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=350)

SQL> select * from t1 where mod_200 =360;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   955 | 60165 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   955 | 60165 |  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=360)

SQL> select * from t1 where mod_200 =370;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   704 | 44352 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   704 | 44352 |  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=370)

SQL> select * from t1 where mod_200 =380;

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |   452 | 28476 |  1246   (5)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |   452 | 28476 |  1246   (5)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=380)

SQL> select * from t1 where mod_200 in (350, 360, 370, 380);

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT  |      |  3317 |   204K|  1275   (7)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| T1   |  3317 |   204K|  1275   (7)| 00:00:07 |

Predicate Information (identified by operation id):
   1 - filter("MOD_200"=350 OR "MOD_200"=360 OR "MOD_200"=370 OR "MOD_200"=380)

The IN-list results are consistent with the results for the individual values – but the result for the IN-list is NOT consistent with the result for the original mod_200 >= 300. The optimizer uses a “linear decay” strategy for handling predicates that go out of range, but not in a consistent way. It seems that, as far as out-of-range, range-based predicates are concerned, the data doesn’t exist until the wave front collapses.
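For what it’s worth, the individual-value estimates above are consistent with a simple linear decay: the equality estimate starts at num_rows/num_distinct at the known high value and falls to zero at high + (high – low). A little Python sketch (purely illustrative arithmetic, using the column stats of mod_200 from the demo data set – low 0, high 199, 200 distinct values, 1M rows – not the optimizer’s actual code) reproduces the cardinalities:

```python
# Sketch of the "linear decay" estimate for an out-of-range equality
# predicate, using the column stats of mod_200 in the demo data set.
# This is illustrative arithmetic, not Oracle's actual costing code.

NUM_ROWS = 1_000_000
NDV      = 200          # number of distinct values of mod_200
LOW, HIGH = 0, 199      # known low/high values from the stats

def decayed_card(value):
    """Estimated cardinality for mod_200 = value, with value above HIGH."""
    base = NUM_ROWS / NDV          # in-range estimate: 5000 rows
    span = HIGH - LOW              # estimate reaches zero at HIGH + span
    dist = value - HIGH
    return round(base * max(0.0, 1 - dist / span))

for v in (300, 350, 360, 370, 380):
    print(v, decayed_card(v))      # 2462, 1206, 955, 704, 452 - as in the plans
```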


This type of anomaly COULD affect some execution plans if your statistics haven’t been engineered to avoid the problems of “out of range” traps.

The Myth of Great Places to Work

Why do people have this crazy obsession with working for a specific company, or with the perception that a company is great or not so great? People—especially those immediately around you—make up a huge part of the attractiveness of a company to work for, not the company itself. Consider these factors first before you choose a "dream" company.

It's not the kind of blog post I generally write. But it's something I consider near and dear to my heart, so I thought I would write it anyway.

Recently I was honored to have been asked to address the graduating classes of the Computer Science, Information Systems and Electronics Engineering departments of a university. After the graduation ceremony, many students came up to me to talk about their career aspirations, how they can accomplish them, what else they should learn, and so on; but mostly they were just venting their frustration at not being able to land a job with a “dream” organization. What were the dream organizations? Google, Apple, Facebook, and the like. That was no surprise. According to a recent article I read, over 2 million job seekers apply to Google every year. It’s 10 times harder to get into Google than to get into Harvard. I am not aware of the statistics for Apple and Facebook but I suspect they are similarly crazy. No wonder the kids are getting frustrated at not making it into the circle in which they believe they will be happy.

To lift their spirits I offered: why not look at other technology companies such as IBM, Microsoft, Intel, Oracle, etc.? Oh, yes, those will have to do since the top-tier companies are so unreachable—they say resignedly. They have to just “do”—for successful organizations like IBM and Intel? Or—I offered further—why not work for the rest they didn’t even mention, for instance Starwood Hotels, which is betting heavily on technology to drive its business further?

I was met with blank stares. Even incredulous ones that seemed to say “you must be kidding”!

The Fallacy of Organizational Greatness?

In addition to what I do for a living, I mentor young graduates of IT fields. Apart from the great satisfaction of giving them tips on the minefield called life they are about to venture into—and sort of playing a big brother role—it puts me in touch with the reality of the aspirations and efforts of today’s generation, which in turn helps me shape my own career and team building activities. What I saw during the graduation ceremony is not unique to that university; it’s more or less a general trend. So, I have asked this and I will ask again: why?

Facebook, Google, Apple—these are more than just names; they have sort of a mystical aura around them, along with their founders. Books and mainstream movies have been made about Mark Zuckerberg. When Steve Jobs passed away, I was at a conference in San Francisco. Someone announced it to the group, reading from breaking news on a smartphone. A hush came over the room immediately. Later I noticed with amazement how ordinary people lit candles and left commemorative notes at Apple Stores. All this for a man whom they never met, never had a business interaction with, and certainly did not earn their livelihood from. Yet they revered him as some sort of demi-god. It’s not a surprise, then, that these same people would consider working for the company he founded as close to working at the pearly gates of heaven. The aura of splendor, especially mythical splendor, has always fascinated humans and driven them to achieve even loftier goals. Had it not been the case, Columbus would never have attempted a sea route to India, landing in Central America and founding the European colonies in the New World.

While there is no doubt that Apple, Facebook, Google and more like them are great organizations to work for, let’s ponder that all-important question: what makes an organization great to work for? Some provide free food, free shoe shines and free dependent day care for their employees. Some provide after-hours parties, employee assistance programs for substance abuse, multi-cultural celebrations and more. Do these make an organization attractive? I would humbly submit: no. They do make up some portion of the attractiveness; but unfortunately they get far more than their share of the attention while masking the real attractiveness. I hope to change that perception and steer the attention to the areas that really matter.

The Irony of Perks

Granted, all these are great benefits for many employees. They are tangible, objective benefits one can connect with and understand. But what about the traveling employee who doesn’t spend much time in the office? They would be meaningless for her. Working from home is a huge perk for many, but few organizations make it a blanket rule. Almost all defer the availability of that privilege to the immediate supervisor. If your job is not classified as remote-only, then it’s up to your manager to decide if and when you can work from home. Most office-based workers do not actually want to work remotely all the time; they just want the flexibility to do so, if needed, on occasion—a sick child, a furniture delivery, a scheduled appliance repair, or just a break from a strenuous commute. If the authority to allow that flexibility is left to the immediate supervisor, not the employee, then the flexibility becomes, for lack of a better word, inflexible.

Take another perk. Some of those “adorable” organizations have committees to value and celebrate a family-oriented organization. A company I worked for had a “happy hour” every Friday where finger foods, drinks and even alcohol were served—unlimited. It definitely made some people happy. Some. For the rest of us, who had deadlines to meet and piled-up work to address and were anxious to get back to our families, this happy hour was just a mirage. In fact it was mildly depressing to work while watching the lucky few frolic. Perk? I do not think so.

So, once again, what makes a happy organization? I have given this advice to my mentees and whoever else has asked for it: it’s your immediate supervisor and team that define the goodness of the organization. An organization is a happy place to work only if its processes support that positive mentality from your colleagues; not otherwise.

Whose Personality?

I often chuckle when I hear the phrase “culture of an organization”. Organizations are not people; they simply cannot have a personality or temperament. Like any social group, they do have rules. The more objective and transparent the rules are, the closer the organization comes to having a culture. But achieving that kind of objectivity and transparency is far from simple. In fact, in today’s litigious environment most organizations deliberately shy away from objectivity, leaving wide latitude for maneuvering in unforeseen situations. Transparency is in the same boat—it’s often impractical to communicate everything to all employees. Too much information is often ignored; but a lack of public information leads to bureaucracy—those “in the know” have the power. Therefore, defining a good organization by its perceived culture is wildly subjective. For those who understand and appreciate rudimentary statistics, the margin of error is just too high to accept.

On the other hand, an employee’s immediate surroundings and daily interactions make up a majority of the employee’s satisfaction. The immediate supervisor plays a huge role in that space. Supervisors come in many forms: team leads who approve your code, managers who approve vacation requests, vice presidents who approve departmental budgets; the point is that they have some power over what the employee wants to do—professionally or personally. The important and only relevant fact is that supervisors are people. They are the ones—not organizations—who have personalities, preconceived notions, hidden agendas and sometimes unexplained emotions. And those are what directly affect employee satisfaction.

In my long career I have seen different types of supervisors and I am sure I will continue to see more. One supervisor—highly placed—not only refused to acknowledge even above-and-beyond accomplishments, but actually came down pretty hard on his subordinates for rudimentary slip-ups, even for small misunderstandings. Another supervisor, on the contrary, blew up the accomplishments of her team to such an extent that writing this blog entry might have been compared to conquering Mount Everest. Granted, these are extremes; but the fact is that both of these managers worked for the same company. In fact they reported to the same person. Same organization and same department; but the satisfaction level of the people under them? It varied like day and night. Whatever happened to the presumed organizational culture, even in a microcosm like a single department? Well, it does not exist. These two managers defined it—right inside the same organization.

Being acknowledged and recognized is a basic human need and accounts for a large part of employee satisfaction. Once again, supervisors are expected to provide that. Even if the organization eventually passes out the accolades, the supervisor is expected to provide the impetus. I have always wondered why it is so difficult for some supervisors to acknowledge the accomplishments of their subordinates, and I don’t think I will come to a conclusion quickly. Many factors play a part—jealousy, a sense of insecurity, hurt pride; but whatever the exact cause, the fact is that these are personality traits of human beings. They define how happy you will be in an organization; the organization itself does not.

Highly functioning organizations:

  1. Understand that employee satisfaction correlates directly with the people who interact with the employees
  2. Realize that the pattern of behavior can’t be predicted, simply because human beings can’t be predicted
  3. Make sure processes and procedures are in place to encourage positive and discourage negative behavior.

Without the above actions, organizations cannot really be attractive.

What should we do about it?

Why am I talking about this? It’s certainly not a gripe. I want to bring attention to two very important issues.

  1. We are dangerously advising younger generations to pursue the goal of landing a job at a “dream company” without sufficient merit.
  2. We are encouraging organizations not to do the right thing by perpetuating myths about creating a happy environment. And, as a corollary, we are ignoring the organizations that do the right thing.

The actions organizations can take to create a happy environment at work and attract talented workers fall under two broad categories:

  • Objective – actions that are visible, measurable and unambiguous. Example: number of vacation days
  • Subjective – actions that are left to subjective analysis and action and can be ambiguous. Organizations should leave as little in this category as possible.

Objective Actions

It may seem that defining objective actions is easy, but it’s far from it. Take, for instance, the number of vacation days. What if the employee can’t take all the vacation days by the end of the year? Some organizations allow employees to carry the days forward to the following year, or to cash them out. Some force them to lapse. For truly motivated employees who work on important, time-critical projects, taking time off may not be an option. What happens to the employees in the latter class? They lose the days they truly “earned”. Are they happy? Of course not. Are they valuable to the company? Of course; that’s why they were asked not to take vacation days. So valuable employees perceive this policy as a penalty for their contribution.

Allowing employees to cash out unused vacation days encourages people not to take vacation at all and costs the company serious money. So this probably should not be a blanket policy; it should be considered case by case. But it’s important that this decision is made:

  • with full transparency, where all deliberations are visible to the employee, with no hidden discussions or decisions.
  • at the time the vacation days are requested but denied, not at the end of the year, as is usually the case.

The same goes for sick days. Some organizations allow only a limited number of sick days. What happens when an employee does not get sick? The days lapse for that year. Or when the sickness lasts beyond the designated number of days? The employee does not get paid. I know a few organizations that do not limit sick days; instead, they purchase an insurance policy on the employee’s behalf that pays the salary during extended periods of sickness. A policy like that definitely moves an organization up the attractiveness scale. Attributes like this create tangible value for the employees while limiting the organization’s exposure to fraudulent and financially risky propositions.

Transparent, unambiguous, objective perks like these go a long way toward employee satisfaction.

Subjective Actions

This is one area organizations should pay specific attention to. Knowing that the actions of immediate supervisors and teammates make up the bulk of employee happiness and productivity, it’s important to ensure greatness in these areas. Here are some suggestions. As a prospective job-seeker, these are the areas you should pay attention to; full marks to those organizations that have clearly defined, unambiguous policies on them.

Institute Feedback Loops

Hold periodic, organized meetings between the immediate supervisor and the employee where specific aspects of the employee’s activities are discussed. This probably occurs naturally and informally anyway, but the key is two things:

  1. Make it formal: a representative of HR may preside, with written notes of the meeting that can be referred to afterwards.
  2. Make it fairly frequent, e.g. monthly, not just once or twice a year. More frequent interactions and feedback mean most tense situations can be defused before they blow up.

Wrap-around Feedback

Known by various other names, such as 360-degree feedback or surround feedback: a neutral department such as HR collects feedback from everyone associated with the employee in question—peers, customers, collaborators, managers, higher-level managers—instead of just the direct manager. This allows employees to see themselves in a multifaceted mirror and prevents surprises in the future.

Skip Level Feedback

This is one area where most organizations fail. A higher-up skips the level below her and talks directly with the employees, under confidentiality. For instance, a VP may skip the director below her and solicit feedback from the managers under that director. Organizations differ in their rank classifications, so the exact titles will vary widely; but I hope you get the idea.

Give Precise Feedback

Once again, this is where most fail. Feedback is essential to highly functional teams, but it is not just praise. The provider of the feedback must offer a critique, not just a pat on the back. It has to be precise and accurate. It should point out where the employee has failed and could have improved. Remember, the only beneficiary in this case is the employee, not the provider. Once the organization establishes a strong critique loop, it becomes almost second nature to most and results in employee satisfaction.

Discourage Bad Supervisors

Remember, supervisors are nothing but people, and they have their unique personality quirks. Rather than fighting to find that perfect supervisor—which is not even possible—organizations should strive to improve the supervisors they have by discouraging negative behavior. Take the case of the supervisor mentioned earlier, who simply wasn’t comfortable recognizing the employees he was responsible for and was far more comfortable disparaging them. Instead of psychoanalyzing the behavior, which produces little result, organizations should simply have clear procedures to identify and discourage it. Had the organization acted earlier, the supervisor would have changed or moved on. Either way, that would have been a victory for the organization.

In Summary

All I wanted to do in this short write-up is dispel the myth that organizations are great to work for because of their culture and personality. People, not companies, make or break employee satisfaction. Unfortunately, a lot of text has been written on organizational behavior with little emphasis on the biggest factor of all—human behavior, which is highly subjective. To really be a satisfying place to work, an organization must establish clear, precise, unambiguous policies and procedures to steer human behavior in the desired direction and discourage the opposite. That is the only way to ramp up employee satisfaction.

Young readers who are just starting out: pay attention to the “great teams make great places to work” philosophy in choosing a place to work, not the ranking of the company as a whole. These young readers are the leaders of the future, and it’s my sincere hope that the future will be shaped by leaders with this concept squarely in their thoughts.

Happy Thanksgiving, all.


“You can’t compare apples with oranges.”

Oh, yes you can! The answer is 72,731,533,037,581,000,000,000,000,000,000,000.

SQL> create table fruit(v1 varchar2(30));
SQL> insert into fruit values('apples');
SQL> insert into fruit values('oranges');
SQL> commit;
SQL> begin
  2  	     dbms_stats.gather_table_stats(
  3  		     ownname	      => user,
  4  		     tabname	      =>'FRUIT',
  5  		     method_opt       => 'for all columns size 2'
  6  	     );
  7  end;
  8  /
SQL> select
  2  	     endpoint_number,
  3  	     endpoint_value,
  4  	     to_char(endpoint_value,'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx') hex_value
  5  from
  6  	     user_tab_histograms
  7  where
  8  	     table_name = 'FRUIT'
  9  order by
 10  	     endpoint_number
 11  ;

ENDPOINT_NUMBER                                   ENDPOINT_VALUE HEX_VALUE
--------------- ------------------------------------------------ -------------------------------
              1  505,933,332,254,715,000,000,000,000,000,000,000  6170706c65731ad171a7dca6e00000
              2  578,664,865,292,296,000,000,000,000,000,000,000  6f72616e67658acc6c9dcaf5000000
SQL> select
  2  	     max(endpoint_value) - min(endpoint_value) diff
  3  from
  4  	     user_tab_histograms
  5  where
  6  	     table_name = 'FRUIT'
  7  ;

                                            DIFF
------------------------------------------------
  72,731,533,037,581,000,000,000,000,000,000,000

SQL> spool off
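The arithmetic behind those endpoint values can be reproduced outside the database. For character columns Oracle builds the endpoint by treating (roughly) the first 15 bytes of the string as a big-endian integer, zero-padded on the right, then rounding it to 15 significant decimal digits; that is why the hex rendition of each endpoint begins with the bytes of the string (0x6170706c6573 = 'apples') followed by rounding residue. A quick sketch in Python, modelling that rule, reproduces the numbers above:

```python
def endpoint_value(s: str) -> int:
    """Model of Oracle's numeric histogram endpoint for a string:
    first 15 bytes as a big-endian integer (zero-padded on the right),
    rounded to 15 significant decimal digits."""
    n = int.from_bytes(s.encode()[:15].ljust(15, b'\x00'), 'big')
    return round(n, 15 - len(str(n)))   # keep 15 significant digits

apples  = endpoint_value('apples')      # 505,933,332,254,715 followed by 21 zeros
oranges = endpoint_value('oranges')     # 578,664,865,292,296 followed by 21 zeros
print(oranges - apples)                 # 72731533037581000000000000000000000
```

Rounding to 15 significant digits is also why only the first six or seven bytes of a string survive in the endpoint: anything much beyond that falls below the rounding granularity, so two long strings that agree in their leading bytes end up with identical endpoint values.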

Oracle Virtual Technology Symposium: Download

For all those who attended my session today in the Oracle Virtual Technology Summit on In-Memory, thank you very much. Some questions were cut off towards the end since it was an automated session. If your question was among those, I will be happy to answer it.
Since this was an on-demand show, you can watch the replay anytime by visiting this site. Remember, this requires a connection to watch; it's not a download. You can download the slides from here and the scripts I used in the demo here.

Please post any questions you couldn't ask during the webinar.

Plan puzzle

I was in Munich a few weeks ago running a course on Designing Optimal SQL and Troubleshooting and Tuning, but just before I flew in to Munich one of the attendees emailed me with an example of a statement that behaved a little strangely and asked me if we could look at it during the course.  It displays an odd little feature, and I thought it might be interesting to write up what I did to find out what was going on. We’ll start with the problem query and execution plan:

select     s section_size,
           max(program_id) ,avg(program_id)
from       fullclones_test
cross join  (select distinct section_size s from fullclones_test)
where      section_size = (select min(section_size) from fullclones_test)
and        clone_size >= s
group by   s
order by   1; 

Since I found this a little difficult to follow (with repetitions of the same table name, and column aliases switching back and forth) I did a little cosmetic modification; all I’ve done is add table aliases and query block names, then arranged the text for my visual benefit. The aliases and query block names can help when dissecting the anomaly.

select
	/*+ qb_name(main) */
	ftv.s	section_size,
	max(ft1.program_id),
	avg(ft1.program_id)
from
	fullclones_test	ft1
cross join
	(
	select	/*+ qb_name(inline) */
		distinct ft2.section_size s
	from	fullclones_test	ft2
	)	ftv
where	ft1.section_size = (
		select	/*+ qb_name(subq) */
			min(ft3.section_size)
		from	fullclones_test	ft3
	)
and	ft1.clone_size >= ftv.s
group by
	ftv.s
order by
	ftv.s
;

This query ran reasonably quickly (about half a second), producing the following execution plan:

| Id  | Operation              | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
|   0 | SELECT STATEMENT       |                 |      1 |        |   404 (100)|      4 |00:00:00.05 |    1116 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |                 |      1 |      5 |   404   (6)|      4 |00:00:00.05 |    1116 |       |       |          |
|   2 |   MERGE JOIN           |                 |      1 |  48637 |   299   (7)|  50361 |00:00:00.58 |    1116 |       |       |          |
|   3 |    SORT JOIN           |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |  2048 |  2048 | 2048  (0)|
|   4 |     VIEW               |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |       |       |          |
|   5 |      HASH UNIQUE       |                 |      1 |      5 |   114  (11)|      5 |00:00:00.02 |     372 |  5894K|  3254K|  884K (0)|
|   6 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.31 |     372 |       |       |          |
|*  7 |    SORT JOIN           |                 |      5 |  20000 |   185   (4)|  50361 |00:00:00.16 |     744 |   619K|   472K|  550K (0)|
|*  8 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  20000 |   106   (4)|  20076 |00:00:00.09 |     744 |       |       |          |
|   9 |      SORT AGGREGATE    |                 |      1 |      1 |            |      1 |00:00:00.01 |     372 |       |       |          |
|  10 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.29 |     372 |       |       |          |

Query Block Name / Object Alias (identified by operation id):
   1 - SEL$071BB01A
   4 - INLINE       / FTV@SEL$1
   5 - INLINE   
   6 - INLINE       / FT2@INLINE   
   8 - SEL$071BB01A / FT1@SEL$1
   9 - SUBQ 
  10 - SUBQ         / FT3@SUBQ

Predicate Information (identified by operation id):
   7 - access("FT1"."CLONE_SIZE">="FTV"."S")
   8 - filter("FT1"."SECTION_SIZE"=)

By comparing the query block name / object alias information, we can identify a single full tablescan being executed at operation 9 to produce the min(section_size) in the subquery.

We can also see that the “select distinct” executes at operations 5/6 to produce 5 rows, which are then merge joined to the (sorted) full tablescan of fullclones_test at operation 8.

If you’re wondering about the appearance of a sel$1 despite my efforts to name every query block, that’s the (almost) inevitable side effect of using ANSI syntax – virtually every join after the first two tables will introduce a new (unnameable) query block to introduce the next table.

Now here’s the anomaly: if we eliminate the avg() from the select list we’re going to produce a result that ought to require less work – but look what happens:

| Id  | Operation              | Name            | Starts | E-Rows | Cost (%CPU)| A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
|   0 | SELECT STATEMENT       |                 |      1 |        | 10802 (100)|      4 |00:02:48.83 |    1116 |       |       |          |
|   1 |  SORT GROUP BY NOSORT  |                 |      1 |      5 | 10802  (94)|      4 |00:02:48.83 |    1116 |       |       |          |
|   2 |   MERGE JOIN           |                 |      1 |    972M| 10697  (95)|   1007M|03:21:28.41 |    1116 |       |       |          |
|   3 |    SORT JOIN           |                 |      1 |  99999 |   380   (4)|  80042 |00:00:00.39 |     372 |  2037K|   674K| 1810K (0)|
|   4 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.26 |     372 |       |       |          |
|*  5 |    SORT JOIN           |                 |  80042 |  20000 |   185   (4)|   1007M|00:57:11.13 |     744 |   619K|   472K|  550K (0)|
|*  6 |     TABLE ACCESS FULL  | FULLCLONES_TEST |      1 |  20000 |   106   (4)|  20076 |00:00:00.11 |     744 |       |       |          |
|   7 |      SORT AGGREGATE    |                 |      1 |      1 |            |      1 |00:00:00.01 |     372 |       |       |          |
|   8 |       TABLE ACCESS FULL| FULLCLONES_TEST |      1 |  99999 |   105   (3)|  99999 |00:00:00.28 |     372 |       |       |          |

Query Block Name / Object Alias (identified by operation id):
   1 - SEL$6B65F52B
   4 - SEL$6B65F52B / FT2@INLINE
   6 - SEL$6B65F52B / FT1@SEL$1
   7 - SUBQ
   8 - SUBQ         / FT3@SUBQ

Predicate Information (identified by operation id):
   5 - access("FT1"."CLONE_SIZE">="FT2"."SECTION_SIZE")
   6 - filter("FT1"."SECTION_SIZE"=)

Ignore the timings from lines 2 and 5 – I was using the hint gather_plan_statistics to collect the rowsource execution stats, and those lines are showing a massive sampling error. The query took about 7 minutes 30 seconds to run. The key difference is that line 4 shows that the “select distinct” is NOT aggregated early – the optimizer has used complex view merging to “join then aggregate” rather than “aggregate then join”. As you can see, this was a bad choice and the join has produced over a billion (US) rows at line 2, which then have to be aggregated down to just 4 rows in line 1.
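The cost of “join then aggregate” versus “aggregate then join” is easy to reproduce outside Oracle. Here's a purely illustrative Python sketch (toy data and made-up column values, not Oracle internals) that computes the same grouped max() both ways and counts the intermediate rows each strategy builds:

```python
# Toy stand-in for fullclones_test: tuples of (program_id, clone_size, section_size)
rows = [(i, i % 50 + 1, i % 5 + 1) for i in range(1000)]

min_section = min(r[2] for r in rows)            # the scalar subquery: min(section_size)
ft1 = [r for r in rows if r[2] == min_section]   # ft1 after the filter predicate

# Aggregate early: "select distinct section_size" first (5 rows), then join
s_vals = sorted({r[2] for r in rows})
join_early = [(s, r) for s in s_vals for r in ft1 if r[1] >= s]

# Join late (merged view): join all 1,000 un-aggregated rows, de-duplicate afterwards
join_late = [(r2[2], r1) for r2 in rows for r1 in ft1 if r1[1] >= r2[2]]

res_early = {s: max(r[0] for s2, r in join_early if s2 == s) for s in s_vals}
res_late  = {s: max(r[0] for s2, r in join_late  if s2 == s) for s in s_vals}

assert res_early == res_late                     # same answer either way
print(len(join_early), len(join_late))           # 920 vs 184000 intermediate rows
```

The duplicate pairs make no difference to max(); whether they could ever disturb avg() in some more general query shape is presumably what the internal directive seen in the 10053 trace is guarding against.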

The question then is: why? If I put a /*+ no_merge */ hint in the query block named “inline” the optimizer accepts the hint and goes back to the plan that aggregates early and runs very quickly – so it’s not a question of the optimizer bypassing some mechanism to avoid getting the wrong answer. I think the only option available to us for further investigation at this point is to examine the 10053 (optimizer) trace file to see what’s going on.

From the (12c)  trace file where we select the avg() we see the following lines:

OJE: Begin: find best directive for query block INLINE (#0)
OJE: End: finding best directive for query block INLINE (#0)
CVM: CBQT Marking query block INLINE (#0) as valid for CVM.
CVM:   Not Merging INLINE (#0) into SEL$1 (#0) due to CBQT directive.

From the equivalent position in the trace file where we select only the max() we see the lines:

OJE: Begin: find best directive for query block INLINE (#0)
OJE: End: finding best directive for query block INLINE (#0)
CVM:   Merging SPJ view INLINE (#0) into SEL$1 (#0)

It’s always hard to decide exactly WHY things happen – but it looks as if the optimizer merges the view heuristically in the max() case “because it can”, while a heuristic (internal directive) stops it from merging in the avg() case. What a good job we have hints!


Footnote 1:

In cases like this it’s always possible that there’s a generic case that might produce wrong results even though the specific case won’t – so it’s worth spending a little time thinking about how the wrong results might appear. It’s also worth keeping hold of the SQL as a test case, because if there’s a generic block in place to handle specific cases you may find that future enhancements allow the block to be lifted for some of them, and it’s nice to be able to check for such cases as you upgrade.

On the other hand, you can get back to the efficient plan if you change the inline view to be:

        (
        select  /*+ qb_name(inline) */
                ft2.section_size s, count(*) ct
        from    fullclones_test ft2
        group by
                ft2.section_size
        )       ftv

That’s just a little bit surprising – but I’m not going to pursue this one any further, beyond noting that there are some interesting anomalies available with inline aggregate views, even in 12c.

Footnote 2:

If you want to experiment further, here’s the SQL to generate the data set:

create table fullclones_test (
        program_id      number(5,0),
        start_line      number,
        clone_id        number(5,0),
        clone_line      number,
        clone_size      number,
        range_start     number,
        section_size    number(4,0)
);

rem     The select list of the original insert was lost in transit;
rem     any generator of 99,999 rows with a handful of distinct
rem     section_size values reproduces the shape of the plans, e.g.:

insert into fullclones_test (
        program_id, start_line, clone_id, clone_line,
        clone_size, range_start, section_size
)
select
        mod(level,1000), level, mod(level,100), level,
        mod(level,500) + 1, level, mod(level,5) + 1
from    dual
connect by
        level < 100000
;

commit;

exec dbms_stats.gather_table_stats(user,'fullclones_test')

Finally, for consistently reproducible results I had engineered my system stats as follows:


Will the real data virtualization please stand up?

There is a post from a good friend at Oracle entitled “Will the REAL SnapClone functionality please stand up?” and, as well-written and technically rich as the post is, I am particularly moved to comment on the very last and conclusive sentence in the post…

So with all of that, why would you look at a point solution that only covers one part of managing your Oracle infrastructure?

The post does not refer to Delphix by name, and it could in fact be referring to any number of companies, but Delphix is the market leader in this space, so it is reasonable to assume that the “Product X” mentioned throughout the post is Delphix.  The same holds true for any post commenting on relational database technology, which can reasonably be assumed to refer to Oracle.  Regardless, I was struck by the use of the phrase point solution in that final sentence of the post, by how much it is a matter of perspective, and by how interesting that perspective is.

First of all, before we go any further, please let me say that, as an Oracle DBA for the past 20 years, I think that the current release of Oracle’s Enterprise Manager, EM12c, is the finest and most complete release of the product since I tested early versions of Oracle EM alongside the Oracle8i database in the late 1990s.  At that time, the product was full of promise, but it wasn’t something upon which an enterprise could truly rely.  That has certainly changed, and it has been a long time coming, starting with the advent of utilities like AWR (Automatic Workload Repository) and ASH (Active Session History).  If you have extensive Oracle technology in your organization, you should be using EM12c to manage it.  Not EM11g, or EM10g, but EM12c.  It really is that good, and it is getting better, and there are talented people behind it, and you simply need it if you want to maximize your investment in Oracle technology.

But just because EM12c is the center of the universe of Oracle technology, what about organizations for whom Oracle technology is merely a component?  Many organizations have diverse IT infrastructures comprising Microsoft, IBM, SAP, and open-source technologies, and all of those technology components share the need for the basic use-cases of quickly and economically cloning production to create non-production environments to support development, testing, reporting, archival, and training activities.

Should those diverse IT organizations employ a silo tool like EM12c just for cloning Oracle databases, and then find the same functionality separately for each of those other separate technologies?  Would doing so be a tactical or a strategic decision?

So in response to the final question in the SnapClone post, I ask another question in turn…

Why would one look at a point solution that covers only Oracle database?

Access to data is the biggest constraint limiting development and testing, so it makes sense to enable data virtualization for all applications, regardless of whether they are built on Oracle technology.  IT agility is a strategic capability important to the entire business, not a technical challenge for a component silo.

But perhaps, in the interest of continuing the Oracle-only focus of the SnapClone post, we could stay inside the bounds of Oracle.  Fair enough, as a theoretical exercise…

So, even if we limit the discussion only to Oracle technology, it quickly becomes obvious that another important question looms…

Why would one look at a point solution that covers only the Oracle database, leaving the application software, database software, configuration files, and all the other necessary parts of an application as a further problem to be solved?

Anybody who has managed IT environments knows that the database is just one part of a complete application stack.  This is true for applications by Oracle (e.g. E-Business Suite, PeopleSoft, JD Edwards, Demantra, Retek, etc.), as well as prominent applications like SAP, and every other application vendor on the planet, and beyond.

To do this, one needs a solution that virtualizes file-system directories with software, files, and everything that comprises the application, not just an Oracle database.

To provision those complete environments for developers and testers quickly and inexpensively, one needs both server virtualization and data virtualization.

Unless you have spent the past 10 years in deep space chasing a comet, you’ve already got server virtualization on board.  Check.

Now, for data virtualization, you need to virtualize Oracle databases, check.  And you also need to virtualize SQL Server databases, check.  And PostgreSQL and Sybase databases, check and check.  In the near future, Delphix will likely be virtualizing IBM DB2 and MySQL databases, not to mention MongoDB and Hadoop, ‘cuz that’s what we do.  Check, check, … check-a-mundo dudes and dudettes.

And beyond databases, even if you’re a single-vendor organization, you need to virtualize file-system directories and files, on UNIX/Linux platforms as well as Windows servers.

Delphix does all of the above, which is one reason why we’re the market leader in this space.  Check.

A substantial portion of the Fortune 500 already relies on data virtualization from Delphix today, across their entire technology portfolio, as the partial list online here shows.

Perhaps it is only a point solution from one perspective, but be sure that your perspective is aligned with that of your whole organization, and that you’re not just thinking of a strategic business capability as merely “functionality” within a silo.