Dynamic Sampling (III) Part II

Just another short note: the final part of the third installment of the series on Dynamic Sampling has been published.

I show there how indexes can be used to get Dynamic Sampling working with rare / clustered values, and expand on further ideas about how to deal with tricky situations for the optimizer.
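For readers who haven't followed the series: the mechanism under discussion is Oracle's dynamic_sampling hint. As a deliberately generic illustration (the table and column names below are invented, and this is not the article's own test case), asking the optimizer to sample the table at parse time looks like this:

-- Ask the optimizer to sample table T at level 4 before choosing a plan;
-- useful when a column holds rare, clustered values that the stored
-- statistics misrepresent.
select /*+ dynamic_sampling(t 4) */ count(*)
from   t
where  rare_col = 'RARE_VALUE';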

Proactive Tuning

There’s no such thing as proactive tuning. Either you do it right before it goes into production (in which case it’s called “design”) or you react to the nasty surprises you get after it’s gone live.

Even if you do something about the problems before anyone has complained, that’s still “reactive” not “proactive” – and the trick is to be reactive as soon as possible. For most companies the best strategy is probably to stick a big screen on the wall with the “Top Activity” page from OEM (or equivalent) so that the key players (DBAs and Developers) can keep a casual eye out for resource hogs and prioritise dealing with them as they appear.

Footnote: if you fall into that rare group of people who can look at what’s currently happening, identify the design errors, and explain how to fix them before the performance starts to nose-dive, then I’ll let you say you’re doing proactive tuning; but personally I tend to call that process “eliminating design errors”.

Hate the player, not the game — or my case for Data Engineers

Has “database” become a dirty word within your organization lately? If you’re someone who has been a data technologist for the better part of your career, you may be wondering why the technologies you work with every day seem to be acquiring such a bad rap. From NoSQL to No DB, the current influx of brogrammers seem to take extreme pride in describing how they’re able to write code while avoiding any kind of database technology whatsoever.

The impetus for this post actually started with something I read on the ‘Net the other day about Command Query Responsibility Segregation (CQRS), and how I was initially excited about the concept.

Martin Fowler has a nice, gentle introduction to the topic here.

Before I get into the post, however, I think it’s useful for me to describe some of my attitudes toward data management. What’s really odd is that while I rather strongly disagree with the tone of Uncle Bob Martin’s Rant, I actually strongly agree with his assertion about the high value of use-case driven development.

I’ve had gentle debates about the meaning of “data as truth” with several people, and the age-old debate of whether data is more “important” than application code. Generally I’ve found that such debates end up as religious arguments instead of attempts to get to the value of acting on data, or data in action. Because in the end it’s hard for data to have value unless it’s acted on by a set of processing directives (applications), and while it’s possible to have valuable applications that don’t require knowledge about the past (basic rule engine codifications), in general they need each other.

Why I call myself a data engineer

I’ve been impressed with EMC’s attempt to define a Data Science curriculum. In particular, I like how they describe the different skills and roles necessary for a successful data science team, including the hot new title of data scientist. The data science team often includes a data architect, a data engineer, and a database administrator. So, what is a data engineer? In a blog post, Steve Todd, Director of EMC’s Global Research and Innovation Portfolio, offers the following characterizations:

The “Database Administrator” provisions and configures the database environment to support the analytical needs of the working team. The “Data Engineer” tends to have deep technical skills to assist with tuning SQL queries for data management and extraction. They also support data ingest to the analytic sandbox. These people can be one and the same, but many times the data engineer is an expert on queries and data manipulation (and not necessarily analytics as such). The DBA may be good at this too, but many times they may simply be someone who is primarily skilled at setting up and deploying a large database schema, or product, or stack.

Many, many DBAs wear both hats, but I think it’s not a good idea — in general I think that DBA is to data engineer as system administrator is to software engineer, but the lack of data engineers has forced DBAs into dual roles for which they are often not well suited. While I have basic DBA skills, I’m much better at the skills listed under the data engineer — and I enjoy working with the data scientists or application developers who have questions about the data and/or how they’d like it structured to support their use cases.

This is one of the reasons why I agree with Uncle Bob’s rant in which he also rails against frameworks in addition to the database — I just wish frameworks had received equal billing in the rant and title, but I’m guessing that the No DB vitriol resonated more strongly with readers. In general I like making sure data is organized in such a way as to support as many use cases as possible. That includes being performant for each use case — which may mean taking advantage of techniques to denormalize, duplicate and synchronize, cache and distribute data.

I suppose I could write a similar rant on No Data Frameworks, but then I’d probably step into the ORM battle, which really isn’t the focus of this post. But just to follow on to Uncle Bob’s rant — the reason I dislike many ORM Data Frameworks is that they tightly bind the application to a particular physical implementation of a data layout, which then limits and constrains my ability to adapt the data layout for new use cases, and leads to “persistence avoidance” in application code.

True story — on a recent Agile project, I was providing guidance on the data layer when I noticed that a bit of information for a new use case wasn’t being captured. I suggested to the team that it would be easy to extend the data layer in order to retain the additional information and I was met with groans: “But that means touching the persistence framework — that’s a big change!” — I was flabbergasted. Isn’t the data layer usually blamed for being inflexible? Are you telling me that it’s actually the framework causing the inflexibility?

Again I point back to Uncle Bob on Clean and Screaming Architecture.

If you’re still reading this, I’m sure you’re wondering how this ties in to CQRS and the original blog title.

When I first read about CQRS in Martin Fowler’s post, I became really interested — the idea that you would use different models for commands (“change” the data) and queries (“read” the data) made me think that frameworks that directly map models into applications could be retired in favor of messages related to use cases instead of model objects. To me, this means a data service API or set of virtual data layers which provide interfaces to data for applications, regardless of how the data is physically stored or organized. Huzzah! This would free me as a data engineer to ensure that I organized the data in ways which efficiently supported use cases. Since I tend to work in full-featured RDBMS systems, that meant I could wrap data using a data service API using whatever works, including things like stored procedures or RESTful web APIs using something like Oracle’s APEX listener.
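To make that idea concrete, here is a minimal sketch of what I mean by a use-case-oriented data service API wrapped in a stored-procedure layer. The package, parameter and object names are invented for illustration; the point is that callers see a command and a query shaped around use cases, while the physical layout behind the interface stays free to change:

-- Hypothetical package: the command side ("place an order") and the
-- query side ("recent orders for a customer") are separate operations,
-- and neither exposes how the data is physically stored.
create or replace package order_api as

    procedure place_order(
        p_customer_id in number,
        p_product_id  in number,
        p_quantity    in number);

    function recent_orders(
        p_customer_id in number,
        p_days_back   in number default 7) return sys_refcursor;

end order_api;
/

Behind a specification like that, the body is free to denormalize, cache, or reorganize the data to suit each use case without touching the callers.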

So imagine my dismay when reading about CQRS and coming upon a whole series of blog posts about implementing CQRS expressly to “get rid of the database”. I intently read through the entire series trying to figure out what was wrong with the database that necessitated “getting rid of it” to implement CQRS. All to no avail. I’ve left a comment asking for that information, because I’m genuinely curious about it, but I have a guess.

It’s not about technology — it’s about the organization and its associated personalities that foster such an attitude.

Really now. In an organization with responsive data engineers there shouldn’t be a need to “get rid of the database”. One of the best reasons to have a database is that it provides so many ways to build the different kinds of models and transform the data between them with minimal need for additional frameworks or mountains of custom code.
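To give just one in-database example of that (a generic sketch, not something taken from the posts I'm discussing; the orders table and its columns are invented): a fast-refreshable materialized view can act as a query-side read model that the database itself keeps in step with the command-side table at commit time, with no extra framework involved.

-- Assumed command-side table: orders(customer_id, amount, ...).
create materialized view log on orders
    with sequence, rowid (customer_id, amount) including new values;

-- Query-side read model: an aggregate per customer, maintained by the
-- database on every commit. count(amount) is required so the sum can
-- be fast-refreshed.
create materialized view order_summary_mv
    build immediate
    refresh fast on commit
as
select customer_id,
       count(*)      order_count,
       count(amount) amount_count,
       sum(amount)   total_amount
from   orders
group by customer_id;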

In the end, I’m guessing that after years of hearing “No, we can’t do that” from the DBAs-designated-as-data-engineers, the application team had come to equate the people with the technology, so the technology gets treated as the constraint instead of the people responsible for it.

So, what’s a way out? If your existing technology is the game, make sure you get the best players for every role and responsibility — don’t make your DBAs “play out of position” or else they’ll become hated representations of barriers to progress. If your organizational structure is the game, hate the game, change the game and advocate for skilled data engineers who can make your data more responsive to your business’s use cases. If you believe in “data as truth”, then invest in people who can make that data as useful as possible to as many use cases as you have.

OTN Tour of Latin America: Trinidad & Tobago…

We arrived at Trinidad & Tobago in the evening and it was hot and humid. I decided to wash some clothes in the hotel room sink, which was a mistake. It was so humid that even after about 36 hours my clothes were still damp and smelled a bit like wet dog when I packed them. Asking around it sounds like Guatemala will be just as humid, so I might have to use a hotel laundry service to get my clothes clean. :)

The timetable was such that I really didn’t get to see anything of Trinidad & Tobago. The whole of the first day was taken up with the conference. It was a single track event, so every presentation was to the whole audience. I was the first speaker, so I did a quick poll of the audience and decided to switch from my intended presentation to a more general one, so as not to alienate all but 5 of the audience. :)

The previous events were using translation services, so I had been constantly trying to speak slower. Presenting to an English speaking audience should have made it easier, but I found myself rushing a little. You gotta laugh. I’m never satisfied. :)

When the event was over a few of us went next door to the Hyatt Hotel and sat on the terrace chatting for a couple of hours. I had a Hyatt Sunrise and a Toasted Almond Joy, both of which were very sweet and a little alcoholic. :) Sheeri and I had agreed to present at a separate event the following day, so we were discussing the event with Loren (the organiser). When I heard the audience was made up of students I decided none of my presentations were suitable, so it was back to the hotel to write something new. I got most of the outline done before bed and (almost) finished it in the morning. I spoke to Debra Lilley at breakfast and she gave me a couple of ideas, which I added when I got to the event.

I like to rehearse my presentations a lot, so doing an on-the-fly presentation like this was pretty scary for me. I was really pleased with the way it went. I feel like it flowed well and the response from the audience was good. If anything, I feel like it went better than my presentation from the previous day. During the presentation I made some comments about documenting code and systems, so when I finished Loren asked if I could do a short presentation about documentation. I figured “in for a penny, in for a pound”, so while Sheeri was presenting I wrote some slides. As soon as Sheeri was finished I did a 20 minute spot on documentation. Two unplanned presentations in one day is about as much as my tiny brain could cope with. :)

Soon after our sessions were done, we got a taxi to the airport for our flights to Guatemala. The crammed timetable meant all I really saw of Trinidad & Tobago was:

  • The airport.
  • The Hilton hotel.
  • The government building hosting the two events.
  • The terrace of the Hyatt hotel.
  • The scenes on the two fridge magnets I bought at the airport.
  • A guy selling coconuts from a cart near a park (thanks to Debra for reminding me :) ).

I’m not sure I can count this as a visit to the Caribbean. It’s a bit like saying you’ve been to a country when all you’ve done is wait at an airport for a connecting flight. :)

The timetable for Guatemala suggests it’s going to be a similar story there. Such is life!

Thanks to Loren and the gang for organising the event and of course, thanks to the Oracle ACE program for making it all happen.



PS. I’m looking forward to getting some sleep. I’ve kinda forgotten what it’s like…

PPS. I don’t have many photos, but what I have are here.


The linux ‘perf’ utility with EL 6

Some of you might have experimented with, or used, Oprofile, ltrace/strace or Systemtap, and wished you could look further into a userland process, the way dtrace does: dtrace allows you to profile a process and look into its function calls.

If you’ve used oprofile/strace/systemtap/etc., you probably found out that you can see all the kernel functions being executed, but these tools do not get into userspace, or only with very limited information. Perhaps the only tool which is different is gdb, which does let you see userspace function names, but gdb is not a profiler, it’s a debugger. And it works best with (I assume it was made for) debugging symbols in the executable. Most (all, I presume) commercial executables, like the Oracle database executable, do not ship with debugging symbols.

Some time ago, a tweet from Kevin Closson caught my attention: it showed a profile of a Linux system running Oracle, including the functions inside the Oracle executable. This totally shocked me! That is something I want to be able to do on Linux too!!

It was done using the perf utility. The perf utility is delivered with the kernel (rpm) package on EL Linuxes, and since I am using Oracle Linux, I can tell you it’s delivered with the ‘UEK’ kernels (the 2.6.32+ ones). I eventually used the latest version of the UEK kernel (2.6.39-200.24.1.el6uek.x86_64), which you get when you subscribe to the ol6_UEK_latest yum channel. Newer versions of the kernel have more perf functions.

The ‘perf top’ sub-utility shows the kernel and userland functions which are active (by sampling active processes). It can be used in a variety of ways: system-wide, per processor or per process. This allows you to look beyond the Oracle wait interface. As Tanel Poder often states in his performance classes: the Oracle wait interface is a good starting point for performance investigation, but sometimes you need to look deeper. You always need to look at the appropriate layer when investigating issues.

A note when using virtualization: virtualization does not allow direct access to the performance monitoring hardware, so if you try to use perf there (or oprofile, for that matter), you get no output. You can work around that with perf by using a software event; I use the software event cpu-clock, which means I invoke perf with the option ‘-e cpu-clock’.

For example, this statement uses a lot of CPU and might (depending on Oracle version and operating system type and version) never return an answer. But that makes it perfect as a statement that consumes approximately one CPU and keeps doing so for some time:

select count(*) from dba_extents, dba_extents, dba_extents
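To point ‘perf top -p’ at the session running this statement you need the OS process ID of its dedicated server process. The post doesn't show how that PID was obtained; a lookup along these lines (my own sketch, using the v$session and v$process views, with a hypothetical username) is the usual way:

-- Find the OS process id (spid) of the dedicated server process for the
-- session running the test query, so it can be passed to 'perf top -p'.
select s.sid, s.serial#, p.spid
from   v$session s,
       v$process p
where  s.paddr    = p.addr
and    s.username = 'TEST_USER';   -- hypothetical user running the test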

A quick glimpse at the statistics profile using sql_monitor reveals:

Global Stats
| Elapsed |   Cpu   |  Other   | Buffer |
| Time(s) | Time(s) | Waits(s) |  Gets  |
|     109 |     103 |     6.23 |   9755 |
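The post doesn't show how this report was generated; the usual route (and my assumption about what was used here) is dbms_sqltune.report_sql_monitor:

-- Text report for the most recently monitored statement; pass
-- sql_id => '...' to target a specific statement instead.
select dbms_sqltune.report_sql_monitor(type => 'TEXT') from dual;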

Of course I could look in the sql monitoring plan details, which would reveal what row sources are active at the moment, but let’s see what ‘perf top -e cpu-clock -p 12727’ (the PID of the server process) shows:

   PerfTop:     108 irqs/sec  kernel: 0.9%  exact:  0.0% [1000Hz cpu-clock],  (target_pid: 12727)

             samples  pcnt function                  DSO
             _______ _____ _________________________ ____________________________________________________

             1414.00 58.8% kxhrHash                  /u01/app/oracle/product/
              389.00 16.2% qerhjSplitProbe           /u01/app/oracle/product/
              263.00 10.9% rworofprVeryFastUnpackRow /u01/app/oracle/product/
              122.00  5.1% smbget                    /u01/app/oracle/product/
               93.00  3.9% sorgetqbf                 /u01/app/oracle/product/
               86.00  3.6% qersoFetch                /u01/app/oracle/product/
               31.00  1.3% qerhjInnerProbeHashTable  /u01/app/oracle/product/

This shows that in 58.8% of the samples taken of this process it is busy in the function ‘kxhrHash’, so it’s the hashing which this process is busy with. Another important and interesting aspect: there is very little time spent in kernel mode (kernel: 0.9%). But… the information about where the time is spent could also have been read in the sql monitoring report: it reports CPU and waits per row source.

Where perf does something which could not be done any other way is when you have multiple processes and cannot set up perf for every distinct process, or when you want to understand system-wide performance. To mimic a system doing OLTP, I’ve set up Kevin Closson’s SLOB in PIO mode, and use perf top -e cpu-clock without specifying a process ID, so I measure the entire system. I get the following profile:

  PerfTop:     984 irqs/sec  kernel:92.4%  exact:  0.0% [1000Hz cpu-clock],  (all, 1 CPU)

             samples  pcnt function                           DSO
             _______ _____ __________________________________ ______________________________________________________________________

            13102.00 88.3% _raw_spin_unlock_irqrestore        [kernel.kallsyms]                                                     
              772.00  5.2% copy_page                          [kernel.kallsyms]                                                     
              298.00  2.0% __do_softirq                       [kernel.kallsyms]                                                     
              114.00  0.8% scsi_request_fn                    [kernel.kallsyms]                                                     
               29.00  0.2% kcbgtcr                            /u01/app/oracle/product/                  
               25.00  0.2% kmem_cache_alloc                   [kernel.kallsyms]                                                     
               25.00  0.2% finish_task_switch                 [kernel.kallsyms]                                                     
               22.00  0.1% kcbzib                             /u01/app/oracle/product/                  
               17.00  0.1% sxorchk                            /u01/app/oracle/product/

Conclusion: the majority of the processes spend their time in kernel calls (kernel.kallsyms). This is not too surprising: I've set up SLOB in 'PIO mode', which means I've reduced the Oracle database buffer cache as much as possible. With this Oracle version, the smallest buffer cache I can create is 4MB. Next I've let 10 readers do index range scans (that is one of the functions of SLOB: swamping your system with single-block (8KB) IOs). The majority of the processes are spending their time in the kernel function _raw_spin_unlock_irqrestore. This is what a system which is swamped with physical IO looks like with perf.

This all changes when I revert SLOB to LIO mode, and issue the same test (SLOB run with 10 readers):

   PerfTop:    1004 irqs/sec  kernel: 3.1%  exact:  0.0% [1000Hz cpu-clock],  (all, 1 CPU)

             samples  pcnt function                    DSO
             _______ _____ ___________________________ ____________________________________________________

              478.00 23.9% kcbgtcr                     /u01/app/oracle/product/
              154.00  7.7% qetlbr                      /u01/app/oracle/product/
              149.00  7.4% kcbz_fp_buf                 /u01/app/oracle/product/
              133.00  6.6% kdsgrp                      /u01/app/oracle/product/
              104.00  5.2% kafger                      /u01/app/oracle/product/
               94.00  4.7% kcbrls                      /u01/app/oracle/product/
               60.00  3.0% ktrgcm                      /u01/app/oracle/product/
               60.00  3.0% ktrget3                     /u01/app/oracle/product/

Ah, we now spend only 3.1% in kernel mode, and the majority of the time is spent in kcbgtcr (quite probably meaning 'kernel cache buffer get consistent read')! I've sized the buffer cache to be able to contain all the blocks (also known as SLOB 'LIO mode'), so once they are read, only a logical IO is needed to fetch a block.

Expanding SQL

Here’s a little thing that Dan Morgan mentioned to me some time ago. It’s a little routine from a package (owned by sys) that appeared in a recent release and gives you some idea of the mess hidden behind a query that uses views. The procedure is dbms_sql2.expand_sql_text and takes two (CLOB) parameters: an IN parameter that is the text you want expanded, and an OUT parameter that is likely to end up as a long and messy piece of text. Rather than say any more, here’s an example of usage:

set linesize 70
set pagesize 0
set feedback off
set serveroutput on format wrapped

declare
	m_sql_in	clob := 'select * from dba_synonyms where owner = ''TEST_USER''';
	m_sql_out	clob := empty_clob();
begin
	-- expand_sql_text does the work; printing the result through
	-- dbms_output is just one convenient way to display it.
	dbms_sql2.expand_sql_text(m_sql_in, m_sql_out);
	dbms_output.put_line(m_sql_out);
end;
/

Notice how I’ve doubled up all the single quotes.
Here’s part of the output – you might want to try this only after setting linesize to 32767, and then you can paste the result into an SQL formatter:

_LINK" FROM "SYS"."USER$" "A4","SYS"."SYN$" "A3", (SELECT "A6"."OBJ#"
A6"."TYPE#" "TYPE#","A6"."CTIME" "CTIME","A6"."MTIME" "MTIME","A6"."ST
."SPARE1" "SPARE1","A6"."SPARE2" "SPARE2","A6"."SPARE3" "SPARE3","A6".
"SPARE4" "SPARE4","A6"."SPARE5" "SPARE5","A6"."SPARE6" "SPARE6","A6"."
SPARE3" "OWNER#",CASE  WHEN (("A6"."TYPE#"<>4 AND "A6"."TYPE#"<>5 AND
"A6"."TYPE#"<>7 AND "A6"."TYPE#"<>8 AND "A6"."TYPE#"<>9 AND "A6"."TYPE
#"<>10 AND "A6"."TYPE#"<>11 AND "A6"."TYPE#"<>12 AND "A6"."TYPE#"<>13
AND "A6"."TYPE#"<>14 AND "A6"."TYPE#"<>22 AND "A6"."TYPE#"<>87) OR BIT
A9"."NAME" "NAME" FROM SYS."OBJ$" "A9" WHERE "A9"."OBJ#"="A5"."SPARE2"
ER$" "A5" WHERE "A6"."OWNER#"="A5"."USER#" AND ("A6"."TYPE#"<>4 AND "A
6"."TYPE#"<>5 AND "A6"."TYPE#"<>7 AND "A6"."TYPE#"<>8 AND "A6"."TYPE#"
<>9 AND "A6"."TYPE#"<>10 AND "A6"."TYPE#"<>11 AND "A6"."TYPE#"<>12 AND
 "A6"."TYPE#"<>13 AND "A6"."TYPE#"<>14 AND "A6"."TYPE#"<>22 AND "A6"."
TYPE#"<>87 AND "A6"."TYPE#"<>88 OR BITAND("A5"."SPARE1",16)=0 OR ("A6"
."TYPE#"=4 OR "A6"."TYPE#"=5 OR "A6"."TYPE#"=7 OR "A6"."TYPE#"=8 OR "A
6"."TYPE#"=9 OR "A6"."TYPE#"=10 OR "A6"."TYPE#"=11 OR "A6"."TYPE#"=12
OR "A6"."TYPE#"=13 OR "A6"."TYPE#"=14 OR "A6"."TYPE#"=22 OR "A6"."TYPE
#"=87) AND ("A5"."TYPE#"<>2 AND SYS_CONTEXT('userenv','current_edition
_name')='ORA$BASE' OR "A5"."TYPE#"=2 AND "A5"."SPARE2"=SYS_CONTEXT('us
erenv','current_edition_id') OR  EXISTS (SELECT 0 FROM SYS."OBJ$" "A8"
,SYS."USER$" "A7" WHERE "A8"."TYPE#"=88 AND "A8"."DATAOBJ#"="A6"."OBJ#
" AND "A8"."OWNER#"="A7"."USER#" AND "A7"."TYPE#"=2 AND "A7"."SPARE2"=
SYS_CONTEXT('userenv','current_edition_id'))))) "A2" WHERE "A2"."OBJ#"
="A3"."OBJ#" AND "A2"."TYPE#"=5 AND "A2"."OWNER#"="A4"."USER#") "A1" W

This expansion is probably the first step the optimizer takes in handling your code – but it’s still not the transformed text that is ultimately the “unparsed” version of your statement.

Random Plans

Have you ever felt that the optimizer was persecuting you by picking plans at random? Perhaps you’re not paranoid, perhaps it’s part of Oracle Corp.’s master plan to take over the world. If you look closely at the list of hidden parameters you’ll find that some details of this cunning plan have leaked: Oracle has created a new parameter _optimizer_random_plan with the description “optimizer seed value for random plans”. Who knows what terrible effects we may see when the default value of this parameter changes.
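If you want to go hunting for such details yourself, the usual way to list hidden parameters and their descriptions is a query against x$ksppi and x$ksppcv, run as SYS. This is a generic sketch rather than anything from the original note:

-- List the hidden parameter, its current value and its description (as SYS).
select n.ksppinm  parameter_name,
       v.ksppstvl current_value,
       n.ksppdesc description
from   x$ksppi  n,
       x$ksppcv v
where  n.indx = v.indx
and    n.ksppinm like '\_optimizer\_random\_plan%' escape '\';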

Scale Abilities shortlisted as UKOUG Partner of the Year

The UK Oracle User Group has revealed the shortlisted Partners for this year’s Partner of the Year Awards, which features Oracle suppliers in a variety of areas, including awards such as E-Business Suite, Public Sector and Innovation Partner of the Year. We are pleased to announce that Scale Abilities has been shortlisted for Database Partner […]

InSync12 Conference: Melbourne 21-22 August 2012

I’ll again be presenting at this year’s InSync12 Conference, this year held in Melbourne on 21-22 August at the Sebel Albert Park Hotel. It’s Australia’s main Oracle conference of the year and should again be an excellent event with lots of great speakers. I’ll be presenting my paper on “Indexing In Exadata“, which should be [...]

Parallel Execution Challenge – It’s About Time

July 8, 2012 As I was reading a book, I saw a comment essentially stating that with multiple CPUs/cores, parallel execution will be faster than serial execution of the same query. The book provided a test case - I decided to change the test case around a bit so that the results were a bit fairer, [...]