
Oakies Blog Aggregator

The twelve days of … AskTom

On the 1st day of AskTom, a poster sent to me

A question about 12c


On the 2nd day of AskTom, a poster sent to me

two SQL queries,

in a question about 12c


On the 3rd day of AskTom, a poster sent to me

three table joins,

two SQL queries,

in a question about 12c


On the 4th day of AskTom, a poster sent to me

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 5th day of AskTom, a poster sent to me

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 6th day of AskTom, a poster sent to me

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 7th day of AskTom, a poster sent to me

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 8th day of AskTom, a poster sent to me

eight model clauses,

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 9th day of AskTom, a poster sent to me

nine object types,

eight model clauses,

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 10th day of AskTom, a poster sent to me

ten cubes and rollups,

nine object types,

eight model clauses,

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 11th day of AskTom, a poster sent to me

eleven policy functions,

ten cubes and rollups,

nine object types,

eight model clauses,

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


On the 12th day of AskTom, a poster sent to me

twelve optimizer hints,

eleven policy functions,

ten cubes and rollups,

nine object types,

eight model clauses,

seven regular expressions,

six analytic functions,

five pattern matches,

four nested queries,

three table joins,

two SQL queries,

in a question about 12c


and I said….



Thanks to all those people who have been using AskTom since Chris and I took the reins in September.  We value your questions – we hope you are getting value out of our answers.  Just don't forget to provide us with test cases :-)

Top 5 SQL Monitor Features

SQL Monitor is one cool tool.  Via EM12c, it empowers everyone (yes, that means the developer too, so please give them access to view this data via Cloud Control!).  This is a top five list, but it doesn’t stop there; just remember, these are my favorite features when using Cloud Control with SQL Monitoring.

Once logged into the database, you can gain access to SQL Monitor by clicking on Performance –> SQL Monitoring.

1. High Level View

The main view of active and inactive processing (just remember, SQL Monitor entries will age out, so this tool isn’t for looking at historical processing) is a great place to start when we talk about value to your day-to-day job.


From the main view, you can easily see which sessions are active, the SQL ID, whether the statement ran in parallel and what DOP (Degree of Parallelism) was granted, database time (broken down by wait event percentage), and IO requests.  There is a ton of pertinent data here, including plan hash values, start and stop times, plus the SQL text, but everyone has to admit, this is what we all want and need to know.
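If you prefer a query to the Cloud Control screens, most of this high-level data is also exposed in GV$SQL_MONITOR (Tuning Pack licence required). A minimal sketch along these lines, using columns I believe are present in 11.2/12c, gives roughly the same overview:

-- High-level view of monitored statements, roughly mirroring the SQL Monitor main page.
select status, sql_id, sql_exec_id, username,
       sql_exec_start,
       round(elapsed_time / 1e6, 1) as elapsed_secs,   -- elapsed_time/cpu_time are in microseconds
       round(cpu_time     / 1e6, 1) as cpu_secs,
       buffer_gets, physical_read_bytes
from   gv$sql_monitor
order  by sql_exec_start desc;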

2. SQL ID Details via SQL Monitor

By clicking on any of the SQLIDs displayed in the SQL Monitor main page, you can then dig into the details surrounding a SQL process.


For the given SQL ID, you can view the duration and type of wait event resource consumption.  You can quickly see what the cost is and, with the added graphs, quickly identify which performance challenges a particular execution plan is most hindered by.  There are arrows to help the DBA or developer navigate the plan, understand which pain points are most evident, and see how the plan connects from bottom to top as the process filters through data to come up with its results.
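The same per-execution detail can also be pulled as a report outside of Cloud Control via DBMS_SQLTUNE; a quick sketch (the &sql_id substitution variable is just a placeholder for the statement you clicked on):

set long 1000000 longchunksize 1000000

-- Text-format SQL Monitor report for one statement.
select dbms_sqltune.report_sql_monitor(
         sql_id       => '&sql_id',
         type         => 'TEXT',
         report_level => 'ALL') as report
from   dual;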

3.  Offload Percent

Yes, you can use a query to see the offload percent for a process, but SQL Monitor is nice enough to show you right in Cloud Control-


If you’re working on an Exadata and want to make sure you’re using all those great engineered system features you paid for, it’s nice to know you’re offloading to a cell node vs. doing the work on your database node, which should be saved for work other than scanning objects.
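For reference, a rough version of that query (a sketch; the offload percentage formula below is just one common convention, comparing offload-eligible bytes with the bytes actually shipped over the interconnect):

-- Candidate offload check from GV$SQL; columns exist in 11.2+.
select sql_id,
       io_cell_offload_eligible_bytes,
       io_interconnect_bytes,
       round(100 * (io_cell_offload_eligible_bytes - io_interconnect_bytes)
                 / nullif(io_cell_offload_eligible_bytes, 0), 1) as offload_pct
from   gv$sql
where  io_cell_offload_eligible_bytes > 0
order  by io_cell_offload_eligible_bytes desc;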

4. Resource Usage Details for a Distinct SQLID

Knowing how many resources are going to a process is essential to knowing how to eliminate extra resource usage.


Under Metrics for each SQL ID captured by SQL Monitor, you’ll be able to view the amount of CPU, PGA and IO used by the process.  You’ll even be able to tell what kind of waits you’re experiencing if you don’t use Google Chrome (I’m joking… OK, maybe I’m not, but you understand browser challenges, I’m sure… :))

5.  Session Details

By clicking on the user from the main view, I’m able to drill into the session details.  The actual session ID won’t be displayed (in this case, we just see SH for each session), but we can click on the link, thereby displaying vital info about the session.


We can quickly link to the SQL ID, view whether there are any sessions blocking the one we are investigating, and see pertinent wait information including the P1, P2 and P3 wait parameters.

If I click on the bottom link in the WAIT section, it will take me to the object that is causing the most waits, which happens to be the SALES table.


I can now view column and constraint info without having to go to the schema feature within Cloud Control.

I can also view information about indexes and statistics, which is no longer accessible from SQL Details in Cloud Control.


6.  (Bonus) Viewing Waits and Pain Points

As a bonus, knowing how many of us prefer tabular to graphical execution plans, it may be worthwhile to also view them in graphical form.  Below, as I hover my cursor over the highlighted sections (yellow lines from the SALES table and then red for the filtering), you can see that it clearly displays 13 million rows having a nested loop performed out of a 26-million-row table.  I think most of us know that a hash join would have performed much faster, and this is why the section has been highlighted in red as the pain point for the SQL in question.


Just a little bonus for those of you still as hooked on tabular execution plans as I am… :)

Have a great week after the long weekend and hope your Thanksgiving was wonderful!







Copyright © DBA Kevlar [Top 5 SQL Monitor Features], All Rights Reserved. 2015.

RAM is the new disk – and how to measure its performance – Part 3 – CPU Instructions & Cycles

If you haven’t read the previous parts of this series yet, here are the links: [ Part 1 | Part 2 ].

A Refresher

In the first part of this series I said that RAM access is the slow component of a modern in-memory database engine and for performance you’d want to reduce RAM access as much as possible. Reduced memory traffic thanks to the new columnar data formats is the most important enabler for the awesome In-Memory processing performance and SIMD is just icing on the cake.

In the second part I also showed how to measure the CPU efficiency of your (Oracle) process using the Linux perf stat command. How well your applications actually utilize your CPU execution units depends on many factors. The biggest factor is your process’s cache efficiency, which depends on the CPU cache size and your application’s memory access patterns. Regardless of what the OS CPU accounting tools like top or vmstat may show you, your “100% busy” CPUs may actually spend a significant amount of their cycles internally idle, with a stalled pipeline, waiting for some event (like a memory line arrival from RAM) to happen.

Luckily there are plenty of tools for measuring what’s actually going on inside the CPUs, thanks to modern processors having CPU Performance Counters (CPC) built in to them.

A key derived metric for understanding CPU-efficiency is the IPC (instructions per cycle). Years ago people were actually talking about the inverse metric CPI (cycles per instruction) as on average it took more than one CPU cycle to complete an instruction’s execution (again, due to the abovementioned reasons like memory stalls). However, thanks to today’s superscalar processors with out-of-order execution on a modern CPU’s multiple execution units – and with large CPU caches – a well-optimized application can execute multiple instructions per a single CPU cycle, thus it’s more natural to use the IPC (instructions-per-cycle) metric. With IPC, higher is better.

Here’s a trimmed snippet from the previous article, a process that was doing a fully cached full table scan of an Oracle table (stored in plain old row-oriented format):

Performance counter stats for process id '34783':

      27373.819908 task-clock                #    0.912 CPUs utilized
    86,428,653,040 cycles                    #    3.157 GHz                     [33.33%]
    32,115,412,877 instructions              #    0.37  insns per cycle
                                             #    2.39  stalled cycles per insn [40.00%]
    76,697,049,420 stalled-cycles-frontend   #   88.74% frontend cycles idle    [40.00%]
    58,627,393,395 stalled-cycles-backend    #   67.83% backend  cycles idle    [40.00%]
       256,440,384 cache-references          #    9.368 M/sec                   [26.67%]
       222,036,981 cache-misses              #   86.584 % of all cache refs     [26.66%]

      30.000601214 seconds time elapsed

The IPC of the above task is pretty bad – the CPU managed to complete only 0.37 instructions per CPU cycle. On average, every instruction execution was stalled in the execution pipeline for 2.39 CPU cycles.
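To show where those two derived numbers come from, they are simple ratios of the counters above:

  insns per cycle (IPC)    = instructions / cycles                  = 32,115,412,877 / 86,428,653,040 ≈ 0.37
  stalled cycles per insn  = stalled-cycles-frontend / instructions = 76,697,049,420 / 32,115,412,877 ≈ 2.39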

Note: Various additional metrics can be used for drilling down into why the CPUs spent so much time stalling (like cache misses & RAM access). I covered the typical perf stat metrics in the part 2 of this series so won’t go in more detail here.

Test Scenarios

The goal of my experiments was to measure the CPU efficiency of different data scanning approaches in Oracle – on different data storage formats. I focused only on data scanning and filtering, not joins or aggregations. I ensured that everything would be cached in Oracle’s buffer cache or in-memory column store for all test runs – so disk IO was not a factor here (again, read more about my test environment setup in part 2 of this series).

The queries I ran were mostly variations of this:

SELECT COUNT(cust_valid) FROM customers_nopart c WHERE cust_id > 0

Although I was after testing the full table scanning speeds, I also added two examples of scanning through the entire table’s rows via index range scans. This allows me to show how inefficient index range scans can be when accessing a large part of a table’s rows even when all is cached in memory. Even though you see different WHERE clauses in some of the tests, they all are designed so that they go through all rows of the table (just using different access patterns and code paths).

The descriptions of test runs should be self-explanatory:


SELECT /*+ MONITOR INDEX(c(cust_postal_code)) */ COUNT(cust_valid)
FROM customers_nopart c WHERE cust_postal_code > '0';


SELECT /*+ MONITOR INDEX(c(cust_id)) */ COUNT(cust_valid)
FROM customers_nopart c WHERE cust_id > 0;


FROM customers_nopart c WHERE cust_id > 0;


FROM customers_nopart c WHERE cust_id > 0;


FROM customers_nopart c;


SELECT /*+ MONITOR */ COUNT(cust_valid) 
FROM customers_nopart_hcc_ql WHERE cust_id > 0

Note how all experiments except the last one are scanning the same physical table just with different options (like index scan or in-memory access path) enabled. The last experiment is against a copy of the same table (same columns, same rows), but just physically formatted in the HCC format (and fully cached in buffer cache).

Test Results: Raw Numbers

It is not enough to just look at the CPU performance counters of different experiments; they are too low-level. For the full picture, we also want to know how much work (like logical IOs etc) the application was doing and how many rows were eventually processed in each case. I also verified, using the usual Oracle metrics, that I got the exact desired execution plans and access paths and that no physical IOs or other wait events happened (see the log below).

Here’s the experiment log file with full performance numbers from SQL Monitoring reports, Snapper and perf stat:

I also put all these numbers (plus some derived values) into a spreadsheet. I’ve pasted a screenshot of the data below for convenience, but you can access the entire spreadsheet with its raw data and charts here (note that the spreadsheet has multiple tabs and configurable pivot charts in it):

Raw perf stat data from the experiments:

oracle scan test results.png

Now let’s plot some charts!

Test Results: CPU Instructions

Let’s start with something simple and gradually work our way deeper. I will start by listing the task-clock-ms metric, which shows the CPU time usage of the Oracle process in milliseconds for each of my test table scans. This metric comes from the OS level, not from within the CPU:

CPU time used for scanning the dataset (in milliseconds)

As I mentioned earlier, I added two index (full) range scan based approaches for comparison. Looks like the index-based “full table scans” seen in the first and second columns are using the most CPU time as the OS sees it (~120 and close to 40 seconds of CPU respectively).

Now let’s see how many CPU instructions (how much work “requested” from CPU) the Oracle process executed for scanning the same dataset using different access paths and storage formats:

CPU instructions executed for scanning the dataset

Wow, the index-based approaches seem to be issuing multiple times more CPU instructions per query execution than any of the full table scans. Whatever loops the Oracle process is executing for processing the index-based query, it runs more of them. Or whatever functions it calls within those loops, the functions are “fatter”. Or both.

Let’s look into an Oracle-level metric session logical reads to see how many buffer gets it is doing:

Buffer gets done for a table scan


Wow, using the index with bad clustering factor (1st bar) causes Oracle to do over 60M logical IOs, while the table scans do around 1.6M of logical IOs each. Retrieving all rows of a table via an index range scan is super-inefficient, given that the underlying table size is only 1613824 blocks.

This inefficiency is due to index range scans having to re-visit the same datablocks multiple times (up to one visit per row, depending on the clustering factor of the index used). Each buffer re-visit causes another logical IO and uses more CPU cycles, except in cases where Oracle has managed to keep the buffer pinned since the last visit. The index range scan with a good clustering factor needs to do far fewer logical IOs because, given the more “local” clustered table access pattern, the re-visited buffers are much more likely to be found already looked-up and pinned (shown as the buffer is pinned count metric in V$SESSTAT).

Knowing that my test table has 69,642,625 rows in it, I can also derive an average CPU instructions per row processed metric from the total instruction amounts:

instructions per row.png

The same numbers in tabular form:

Screen Shot 2015-11-30 at 00.38.12

Indeed there seem to be radical code path differences (that come from underlying data and cache structure differences) that make an index-based lookup use thousands of instructions per row processed, while an in-memory scan with a single predicate used only 102 instructions per row processed on average. The in-memory counting without any predicates didn’t need to execute any data comparison logic in it, so could do its data access and counting with only 43 instructions per row on average.

So far I’ve shown you some basic stuff. As this article is about studying the full table scan efficiency, I will omit the index-access metrics from further charts. The raw metrics are all available in the raw text file and spreadsheet mentioned above.

Here are again the buffer gets of only the four different full table scan test cases:

Buffer gets done for full table scans

All test cases except the HCC-compressed table scan cause the same amount of buffer gets (~1.6M) as this is the original table’s size in blocks. The HCC table is only slightly smaller – didn’t get great compression with the query low setting.

Now let’s check the number of CPU instructions executed by these test runs:

CPU instructions executed for full table scans

Wow, despite the table sizes and number of logical IOs being relatively similar, the amount of machine code the Oracle process executes is wildly different! Remember, all my query is doing is scanning and filtering the data, followed by a basic COUNT(column) operation – no additional sorting or joining is done. The in-memory access paths (columns 3 & 4) get away with executing far fewer CPU instructions than the regular buffered tables in row format and HCC format (columns 1 & 2 in the chart).

All the above shows that not all logical IOs are equal. Depending on your workload and execution plans (how many block visits, how many rows extracted per block visit) and underlying storage formats (regular row format, HCC in the buffer cache, or compressed columns in the In-Memory column store), you may end up doing a different amount of CPU work per row retrieved for your query.

This was true before the In-Memory option and even more noticeable with the In-Memory option. But more about this in a future article.

Test Results: CPU Cycles

Let’s go deeper. We already looked into how many buffer gets and CPU instructions the process executed for the different test cases. Now let’s look into how much actual CPU time (in the form of CPU cycles) these tests consumed. I added the CPU cycles metric next to instructions for that:

CPU instructions and cycles used for full table scans

Hey, what? How come the regular row-oriented block format table scan (TABLE BUFCACHE) takes more than twice as many CPU cycles as the number of instructions it executed?

Also, how come all the other table access methods use noticeably fewer CPU cycles than the number of instructions they’ve executed?

If you paid attention to this article (and the previous ones) you’ll already know why. In the 1st example (TABLE BUFCACHE) the CPU must have been “waiting” for something a lot, with instructions spending multiple cycles “idle”, stalled in the pipeline, waiting for some event or necessary condition to happen (like a memory line arriving from RAM).

For example, if you are constantly waiting for the “random” RAM lines you want to access due to inefficient memory structures for scanning (like Oracle’s row-oriented datablocks), the CPU will be bottlenecked by RAM access. The CPU’s internal execution units, other than the load-store units, would be idle most of the time. The OS top command would still show you 100% utilization of a CPU by your process, but in reality you could squeeze much more out of your CPU if it didn’t have to wait for RAM so much.

In the other 3 examples above (columns 2-4), apparently there is no serious RAM (or other pipeline-stalling) bottleneck as in all cases we are able to use the multiple execution units of modern superscalar CPUs to complete more than one instruction per CPU cycle. Of course more improvements might be possible, but more about this in a following post.

For now I’ll conclude this (lengthy) post with one more chart with the fundamental derived metric instructions per cycle (IPC):

instructions per cycle.png

The IPC metric is derived from the previously shown instructions and CPU cycles metrics by a simple division. Higher IPC is better as it means that your CPU execution units are more utilized and get more done. However, as IPC is a ratio, you should never look at the IPC value alone; always look at it together with the instructions and cycles metrics. It’s better to execute 1 million instructions with an IPC of 0.5 than 1 billion instructions with an IPC of 3 – looking at IPC in isolation doesn’t tell you how much work was actually done. Additionally, you’d want to use your application-level metrics that give you an indication of how much application work got done (I used Oracle’s buffer gets and rows processed metrics for this).
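A quick back-of-the-envelope illustration of that point, since cycles = instructions / IPC:

  1,000,000 instructions at IPC 0.5      ->  1,000,000 / 0.5 =     2,000,000 cycles
  1,000,000,000 instructions at IPC 3    ->  1,000,000,000 / 3 ≈ 333,333,333 cycles

The case with the "better" IPC still consumes roughly 167 times more CPU time; the ratio only makes sense next to the instruction (and application work) counts.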

Looks like there are at least two more parts left in this series (advanced metrics and a summary), but let’s see how it goes. Sorry for any typos, it’s getting quite late and I’ll fix ’em some other day :)


NB! Dates updated: After a 1.5 year break, this year’s only Advanced Oracle Troubleshooting training class (updated with Oracle 12c content) takes place on 14-18 December 2015 and 11-15 January 2016 (I had to reschedule the start from November to December). So sign up now if you want to learn new cool stuff!


Core Strategies for Troubleshooting

This is the text of the “whitepaper” I submitted to DOAG for my presentation on “Core Strategies for Troubleshooting”.


In an ideal world, everyone who had to handle performance problems would have access to ASH and the AWR through a graphic interface – but even with these tools you still have to pick the right approach, recognise the correct targets, and acquire information at the boundary that tells you why you have a performance problem and the ways in which you should be addressing it.

There are only three ways you can waste resources on an Oracle system, and only three different types of activity that need to be investigated. If you don’t appreciate that this is the case then you can waste a lot of time following the wrong strategy and attempting to solve the wrong problems. Once you have a clear statement of what you are trying to achieve it becomes much easier to achieve those aims.

Three ways to cause problems

There really are only three symptoms you need to look out for in the database:

  • You’re doing it the hard way
  • You’re doing it too often
  • You’re queueing / locking

Inevitably you can see that there is going to be some overlap between the three (and it would be easy to argue that the third is simply a side effect of the other two). If you are executing a single statement “the hard way” you’re likely to be doing single block reads or buffer gets too often. If you’re executing a very lightweight statement too often it’s probably a step in a process that is attempting to get a job done “the hard way”, and it’s probably causing (and getting stuck in) queues for latches and mutexes. I’ve included queueing as the third option because the simple action of locking data (deliberately, or by accident) doesn’t fall neatly into the other two.

Another way of looking at this list is to reduce it to just two items with a slightly different flavour: when you spend too much time on a task it’s because you’re either doing too much work, or you’re not being allowed to work.

Three classes of problems to solve

An important aspect of performance problems is the high-level classification; labelling the class of problem properly points you to the correct strategy for investigating the problem. Again there are only three possibilities in the list, which I’ll express as typical complaints:

  • My report is taking too long to run / screen is taking too long to refresh
  • The batch job over-ran the SLA last night
  • The “system” is slow

What’s the most significant difference between these three classes of complaint?

  • “My XXX takes too long”: XXX is a repeatable event that can be isolated and watched – just do it again, and again, and again, and again while I watch every step of what’s going on.
  • The batch job over-ran last night: It’s not a repeatable process, so you’ve got to infer what the problem was from historical evidence; it’s (usually) made up of a number of concurrent processes, which may interfere with each other to varying degrees depending on their relative start and finish times.
  • The “system” is slow: possibly all the time, possibly intermittently – if there’s no specific complaint then the only option is to keep an eye open for resource-intensive activity to see if you can reduce the workload of specific individual tasks (for which read SQL or PL/SQL statements) or reduce the number of times that those tasks are executed.

The common source

Despite the fact that we can classify performance problems in three ways, it’s worth remembering that ALL the information we might use to drive our diagnosis comes from one place – the Oracle database. At any moment we have sessions that are active, operating statements (cursors), and using a resource such as a file, a latch, a buffer, and so on. We could almost represent each moment by a cube, with sessions along one side, cursors along another, and resources along the third – the cube changes moment by moment, allowing us to visualise time as the fourth dimension in a hypercube of activity.

Instant by instant the Oracle kernel code knows which session is using what resource to operate which cursor – and although the total volume of all that information is far more than could reasonably be recorded, Oracle has many different ways of slicing, dicing and capturing parts of that hypercube – running totals, rolling aggregates, snapshots by session, by cursor, by resource and so on – that are made visible as the dynamic performance views (v$ objects). Trouble-shooting is largely a case of deciding which dynamic performance views are the most appropriate to use for our three classes of task.

Active Session History

Before reviewing the three classes, it’s worth breaking off for a moment to say a few things about one of the most important and useful views that we have into the wealth of information available; this is the active session history (v$active_session_history / ASH) which Oracle uses to capture a snapshot once every second of what each currently active session is doing; every 10th snapshot is then echoed down into the Automatic Workload Repository (AWR) by a process that runs every hour to copy the dynamic view to a table that can be viewed through the database view dba_hist_active_sess_history.

The capture rates can all be adjusted: I have never seen anyone change from one snapshot per second, or every 10th snapshot in a production system, but I’ve often seen the dump to the AWR taken every 30 minutes, occasionally 20 or even 15 minutes. On occasion I have asked clients to do a CTAS (create table as select – nologging) to capture the whole of the v$active_session_history to a table that can be exported and reviewed at a later date. Oracle tries to keep a minimum of at least an hour’s ASH in memory but, with a large enough SGA, you may find that this will stretch out to 3 or 4 hours.
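The CTAS mentioned above needs nothing more sophisticated than something like this (the table name is, of course, just illustrative):

create table ash_copy nologging
as
select * from v$active_session_history;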

Unfortunately although (or perhaps because) ASH and its AWR history are extremely helpful, you have to pay extra licence fees to use the information, and the technology can only be licensed with the Enterprise Edition of Oracle.

My report is slow

The special feature of someone complaining about a specific task is that it’s likely to be repeatable – so we can run it again and again and watch every single detail to see where the time goes. Our slice through the hypercube could take a single session over a period of time and report every action along that path. This, of course, is the 10046 – a.k.a. extended SQL trace – event. We can enable it in many ways, perhaps through a logon trigger, perhaps through a call to dbms_monitor:

begin
	dbms_monitor.session_trace_enable(
		session_id => &m_sid,
		serial_num => &m_serial,
		waits      => true,
		binds      => true,
		plan_stat  => 'all_executions'
	);
end;
/

In this example I’ve requested all wait states and bind variables to be dumped into the trace file; I’ve also requested that the execution plan (with rowsource execution stats) be dumped for every single execution of every single statement. Sometimes a problem arises because a particular set of bind variables represents a special case that causes a “reasonable” plan to behave very badly. If we’re going to look closely we may as well get as much detail as possible.

The entire “trace” interface was upgraded dramatically in 11g, and one of the useful variants on this theme is particularly relevant to a commonly used Web-based implementation. If you know that a specific screen task corresponds to a particular PL/SQL package you can enable tracing of a cursor (across the system, if necessary) by SQL_ID. So, for example, you might issue the following two commands, with a couple of minutes gap between the two:

alter system
set events 'sql_trace[SQL:1wthpj7as7urp] wait=true, bind=true';

-- wait a few minutes

alter system
set events 'sql_trace[SQL:1wthpj7as7urp] off';

Every time the statement with SQL_ID = '1wthpj7as7urp' is executed, the session executing it will start adding information to the session trace file, and when the statement ends the tracing will end. This is particularly nice if the “statement” is a top-level call to a PL/SQL procedure because all the SQL inside the procedure will be traced as the package executes.

For a highly focused, highly repeatable task, the 10046 trace event is almost always all you need to do.

The batch over-ran

The big difference between this case and the previous one is that “the batch” is not something you can simply repeat and watch. Moreover, “the batch” is likely to be a large number of separate sections of code that are scheduled to run with a fairly fluid timetable that can result in changes from day to day (or, more likely, night to night) in the set of jobs that might be running concurrently. This means that even if you could re-run the batch job (perhaps on the previous night’s backup) you might not see the same problem appear because a small change in timing could result in a large change in contention.

One of the most important steps of dealing with the batch is pre-emptive: instrument your code and make it possible to compare the run on one night with the run on another. At the very least you need to have something capturing the start and end times of each “significant component” of the batch so you can quickly answer questions like: “which jobs took much longer than usual”, “which job was the first job that took longer than usual”, “which jobs were running concurrently with job X last night when they never usually overlap?”

Ideally you should have much more information than this about each job – basically a report from Oracle which says “how much work did I do, how much time did I spend”: for a session this is simply a report of v$mystat or v$sesstat (joined to v$statname) and v$session_event for the session (v$mystat is a less well-known view that is v$sesstat restricted to “my” session); if you classify each job as “connect to the database, do something, disconnect” then this critical log is simply a pair of select statements spooled out somewhere convenient, or written to the database; if you want to break a single connection into significant pieces then a simple pl/sql procedure could read the statistics into a pl/sql array as the piece starts, then re-read the stats and calculate the differences as the piece ends.
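As a sketch of that “critical log”, a pair of queries like the following, spooled to a file as the job starts and again as it ends, goes a long way:

-- Work done by my session so far (non-zero statistics only).
select sn.name, ms.value
from   v$mystat ms, v$statname sn
where  sn.statistic# = ms.statistic#
and    ms.value != 0
order by sn.name;

-- Time spent waiting by my session so far.
select event, total_waits, time_waited_micro
from   v$session_event
where  sid = (select sid from v$mystat where rownum = 1);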

Knowing where the time went, and knowing how tasks have behaved differently from previous runs is a big step forward to identifying the problem.

If you don’t have the instrumentation you need then the AWR (if you’re licensed) or Statspack (if you’re not licensed) is a step in the right direction. Apart from the typical hourly snapshots and reports you can take a snapshot as the first and last steps of the batch so that you’ve got “the whole batch” in a single AWR/Statspack report. If you’ve got that you can then do comparisons for things like:

  • Which event consumed much more time than usual
  • Which SQL took much more time than usual
  • Which segment(s) saw much more activity than usual
  • Was there some unusual enqueue activity
  • Can we see some unusual outliers in the event histograms
  • Can we see some unusual memory demands in the pga histogram

Although system-wide summaries rarely point us at exact causes, they can often give us strong clues about the areas (and times) where the problem originated.

In this respect the “Top Activity” screen from Enterprise Manager (Grid Control / Cloud Control) can be very helpful as it produces a nice graphical presentation of “working”. Where, in the picture of last night’s activity, does the graph start to ramp up, what colour is the bit that’s growing, and how does that picture compare to the same picture the previous night? (Having two windows open with two different nights makes it wonderfully easy to switch between displays and spot the differences.) Since the top activity screen is created from dba_hist_active_sess_history, which contains about 100 different details per session for each captured moment, it’s very easy to drill through the spikes to answer questions like: “which object”, “what event”, “which SQL”, “what was the execution plan”, “how much work did that take”, to follow the chain of time back to the cause.

The system is slow

If no-one is going to tell you about specific tasks, and if you don’t have any sort of boundary that allows you to focus on tasks or time-ranges, then the simplest thing to do is look for anything expensive (i.e. time-consuming) and see if you can make it cheaper.

Again, the graphic “Top Activity” screen is very helpful, and I often tell people to arrange to have a system that shows the top activity screens for the most important 2 or 3 databases on a large screen on the wall where any passing DBA might notice a brief spike in workload. There are systems that can be improved by constant monitoring – so long as the monitoring doesn’t take up 100% of an individual’s time but is driven as an informal glance at a picture.

If you’re not licensed to take advantage of the AWR then Statspack can help – but with the smallest time interval between snapshots (though 15 minutes is as low as I’ve ever gone) so that “anomalies” that are short-lived don’t fall out of memory before they can be captured.

An important feature of reading Statspack is that you need to check for missing information – if the headline figure for physical reads is 25M but the “SQL ordered by reads” accounts for 12M then you know that there must be 13M reads that didn’t get captured in the report, and it might be that 13M that is causing the problem. Similarly if the “Segments by physical reads” reports 16M reads, that’s 4M more than the SQL – but is the 12M a subset of the 16M, or is there only a 3M overlap between the two figures so that between them the 12M and 16M cover the entire 25M? There’s more information in the Statspack report than immediately meets the eye, and a careful drilldown into other areas of the report (typically the Instance Activity) may be needed to clarify the meaning of what you’re seeing as a headline figure.

The other thing you can do with “the slow system” when you don’t have ASH to help is take snapshots (or get some freeware to do the same sort of thing). If the system is slow “right now” you could, for example, take a snapshot of v$sess_io (session I/O), wait 30 seconds then take another snapshot, find the difference and see who is doing most of the I/O work – then chase that session; or take snapshots of v$sesstat limited to (say) statistics like “%redo%” and find out who is generating lots of redo.
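A sketch of that v$sess_io approach (the snapshot table names here are just illustrative, and note that a SID can be re-used between snapshots):

create table sess_io_snap1 as select sysdate snap_time, s.* from v$sess_io s;

-- wait 30 seconds or so, then:

create table sess_io_snap2 as select sysdate snap_time, s.* from v$sess_io s;

select s2.sid,
       s2.physical_reads - s1.physical_reads as physical_reads_delta,
       s2.block_gets     - s1.block_gets     as block_gets_delta
from   sess_io_snap1 s1, sess_io_snap2 s2
where  s2.sid = s1.sid
order by physical_reads_delta desc;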

Oracle allows you to take this approach back into recent history – there are a number of “metric” views which give you things like the rolling average, min, and max I/O volumes for the last few intervals of 5 minutes or 1 minute each – telling you, in other words, whether there were any interesting bursts of extreme activity in the recent past. For example, a query against v$sysmetric_summary might give you an output like the following:

------------------------ ------------- ----------- ------------ -----------------------
Physical Reads Per Sec        1,618.95      105.92       358.16 Reads Per Second
Physical Reads Per Txn       97,202.00    5,539.19    20,811.56 Reads Per Txn
Redo Generated Per Sec    6,773,108.94  218,132.86 1,023,458.57 Bytes Per Second
User Calls Per Txn              395.00       43.39        79.85 Calls Per Txn
Total Parse Count Per Sec        31.14        1.88         4.25 Parses Per Second
Host CPU Utilization (%)         64.51        3.93         9.07 % Busy/(Idle+Busy)
Database Time Per Sec            82.96        6.65        15.37 CentiSeconds Per Second
I/O Megabytes per Second         35.58        2.62         5.73 Megabtyes per Second

This summarises the last 12 intervals of 5 minutes. If we look at “Physical Reads per Txn” we can see that there were some extreme swings in activity over that period, so we could drill down into v$sysmetric_history for “Physical Reads per txn”, looking at the 1 minute granularity and see:

METRIC_UNIT                    BEGIN_TIME                VALUE
Physical Reads Per Txn         05-feb 12:45:55          421.00
                               05-feb 12:44:55          477.00
                               05-feb 12:43:55          351.00
                               05-feb 12:42:55          406.84
                               05-feb 12:41:55        1,550.00
                               05-feb 12:40:55       93,984.00
                               05-feb 12:39:55       97,202.00
                               05-feb 12:38:55       93,323.00
                               05-feb 12:37:55          391.00
                               05-feb 12:36:55          504.00
                               05-feb 12:35:55          504.00
                               05-feb 12:34:55          252.00
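A query along these lines produces that kind of drill-down (a sketch; check the metric-group granularity on your own version):

select begin_time, round(value, 2) as value
from   v$sysmetric_history
where  metric_name = 'Physical Reads Per Txn'
order by begin_time desc;
-- add a filter on group_id if you only want the 1-minute granularity metric group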

Yes, a few minutes ago something exploded onto the system doing a huge amount of I/O for about 3 minutes. If we’re lucky we might now drill into the v$sesstat, or v$sess_io, or v$session_event to see if we can find a session that is responsible for a large amount of I/O; and then check v$open_cursor to see if it still has some open cursors that might (if we check v$sql) show us what caused the I/O.

When there are no specific complaints, we just keep an eye open for spikes in activity and try to track them down as quickly and cheaply as possible to see if they’re worth addressing.


Oracle gives you a huge amount of information about the work that’s going on and the time that’s being used in the database. Unfortunately the most useful repository of that information is a dynamic performance view that can only be used in the Enterprise Edition after purchasing additional licences. However, the information is summarised in many different ways in literally hundreds of other dynamic performance views, and it’s easy to pick out helpful information from those views in a variety of ways.

Key to making the best use of those views, though, is recognising that different classes of performance problems require different strategies – and there are only three different classes of problems to worry about.

Choosing Squarespace

Squarespace is a drag-and-drop website-builder for small organizations and creative professionals wanting a polished and functional web presence. Squarespace sites are quick to create and then “just work” without any worries over ongoing maintenance or version updates.

The Use Case

The web is replete with stale websites that are all but abandoned after the initial build. The common scenario I see is of a site developed in a rush of enthusiasm by someone having just enough technical knowledge to stand up a web presence on a platform like WordPress or GetSimple. Perhaps a volunteer burns the midnight oil for a few weeks. Such effort cannot be sustained. Updates bottleneck on the one person who can make them. In short order, the site goes stale. 

Squarespace excels in the scenario I've just described, because it provides a unified site-building platform designed specifically with the average person in mind. All you as the customer need do is choose the look and feel of your site. Then drop in your content, and job done! Anyone with a modicum of comfort running common office PC applications is capable of standing up and maintaining a website on the Squarespace platform. 

There's still a role for the enthusiastic volunteer. The work of getting a site up and running for the first time goes along a lot faster when it's done by someone who can go heads-down for a short interval and get the job done. Hiring an expert to do that initial heavy lifting is worth considering too. But a site on Squarespace is easily maintained after it's up and running by anyone who is comfortable with typical office computing tasks. 

A Success Story

One of my favorite success stories is the CR Ministries site I created for a friend who is a retired Baptist minister. His WordPress hosting service was shutting down, and my friend came to me with just three weeks remaining before the plug was pulled. Would I help him save his website? He was sorry to have waited so long, and could anything be done quickly?

I jumped at the chance to sharpen my Squarespace skills and help a friend in the same effort. His new site was up and running in a week's time, and that was just an evening-only effort on my part. The heavy-lifting took about a week, and some fine tuning extended a few days beyond that. 

My friend's site in WordPress

After the migration to Squarespace

The Squarespace version of the site maintains the same overall design approach while being more attractive and consistent in its presentation. Even better, my friend is now able to maintain and add content on his own, which he frequently does. This is HUGELY important. His ability to maintain his own content is a bigger win than my one-week migration effort. His win is the win that matters.

The Templates

Airy designs with stunning appeal

One look at Squarespace’s own website makes it obvious that design is a priority. Squarespace provides a limited array of templates compared to competing platforms, but the templates they do have are gorgeous to behold: well-designed and impactful, with plenty of whitespace for an uncrowded look that’s easy on the eye and allows viewers to see at a glance the content being provided.

However! What's missing are some text-centric templates for business. Especially with tight spacing. Templates favor artists, photographers, restaurants, creative professionals with portfolios. A recent effort to find a template on which to build a site for a medical practice left me frustrated and unsatisfied. Most templates are designed around the idea of large, sweeping imagery. It’s difficult in many cases to envision what templates – if any – might be best for a text-centric site having limited imagery. Many templates present an "artsy" look that is ill-suited to serious business. 

The Dark Side

Some days Squarespace feels like the Pitcher Plant of website builders. There’s plenty of glitz and polish to draw a person in to using the platform, but annoyances and rough edges crop up once any real work begins. Sometimes I feel the developers aren't sweating the details as they ought. 

One example is how the editing interface sometimes blocks you from selecting text in order to apply formatting or make a link. You can see that happening in the following screenshot. I even had trouble typing the next paragraph in this review following the screenshot, because the same label got in the way of my fixing a typo in that paragraph's first line. 

The label indicating a text block pops up and gets in the way of my selecting text

Getting blocks to go side-by-side in the layout engine can be infuriatingly difficult. Sometimes it's necessary to place temporary horizontal lines to isolate blocks going side-by-side, to limit their interference with other blocks on the page. Sometimes you just can't find that right cursor position to make the layout engine do what you want done, and you have to give up and try again some other day. (For the same reason typing figure captions can be a problem). 

Seemingly obvious features that anyone would want to include on a website-building platform are sometimes missing altogether. Squarespace possibly has the only calendaring system on Planet Earth not to implement recurring events. Some templates – Ishimoto is one example – are attractive for their stunning gallery pages, yet those pages in practice have their usability reduced due to their inability to support text and other content blocks. 

Every platform has foibles and annoyances. Squarespace is no exception. Don't be discouraged by some of what I've mentioned. Go in with your eyes open. Be willing to modify your expectations for your site to correspond with what the Squarespace platform has to offer. Success with Squarespace – at least with the drag-and-drop side of it – lies in accepting what the platform can offer. 

When to Avoid?

There's a lock-in aspect with Squarespace. Lock-in is inherent with any platform or technology. But it's worse with Squarespace than with, say, WordPress or GetSimple, because Squarespace locks you into the combination of the software used and the underlying hosting platform. You simply cannot take your site with you when you go, and thus I caution against creating any site having more pages or more content than you are willing to recreate from scratch on some other platform. 

A 20-page site having 10 products? That can likely be rebuilt on Weebly or some other competitor in short order. So no worries. But a 100-page site and 1000 products? And maybe with 500 events in the calendar? Think twice. Because 1000 sets of product details and 500 event listings is a lot of retyping should you move platforms, not to mention the 100 pages. 

Be mindful that Squarespace doesn't support bulk operations. Squarespace is designed around doing everything by hand as a one-off. There are no recurring calendar events. There's no ability to load products in bulk. You can't manage large numbers of old blog posts without scrolling back through the years of old posts page by page. If your needs are such that you need bulk operations, then go elsewhere. 

Finally, avoid Squarespace if you cannot commit into the unknown. You need to be able to commit ahead of time to whatever is provided in the available templates and content blocks. Inevitably you'll discover some limitation or surprise after you've begun. Squarespace might be the wrong choice if you cannot agree with yourself ahead of time to live with whatever surprises and limitations you encounter as you build out your site. 

Is It Too Easy?

You might be tempted to feel like Squarespace is somehow a lesser choice than a real platform such as WordPress. Or maybe you have friends who talk that way. Ignore those feelings, and those friends! 

Sometimes I get the same feelings. I'm a former programmer and database administrator, and what do I need with a drag-and-drop platform? But you know what? I don't care. I get work done. 

And far more importantly, my clients get work done! 

I am beyond happy not to have to be called up for every single change my clients need on their sites. I get the exhilaration of initially standing up a new website. Then I'm free to move on and let others have the day-to-day maintenance.

When to Choose?

Squarespace is a wonderful platform on which to run a website and compete head-on with the big players so long as your data and content volumes are modest. You can stand up a fantastic-looking website with integrated commerce and news and galleries and podcasts and plenty of other features otherwise unattainable at anything near the same cost in money and time. 

Yes, there are warts. I’ve put up several sites now for friends and family. Squarespace is productive and I recommend it often, but there sure are some frustrations. I've made sure to mention some of them in this review. Many reviews I’ve seen feel like they are written through rose-colored glasses, and I want to cover the bad stuff too, to give my honest impression. 

But don't let my mention of some negatives cast too large a pall over the good stuff. Are you a creative professional? A small-business owner? A volunteer for a club or a congregation? Do you need a site up and running quickly that you and others (hopefully others!) can maintain with not much more effort than goes into writing a Word document? Then look to Squarespace.


Semantic versioning - Verifying full version ranges

The last function added was the cmp function, which we needed to reach the overall goal of the package: to create a "satisfies" function that can verify whether a specific version is satisfied within a range of versions.

The different ranges that can be specified are X-ranges (wildcards, i.e. 1.2.* or 2.*), tilde ranges (minimum requirements within a minor version, e.g. ~1.2.3, which allows versions from 1.2.3 up to 1.2.n but not 1.3.0), caret ranges and hyphen ranges. For a full description and detailed examples, you should read the following two articles:
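As a standalone illustration of the tilde-range rule just described (this is not the package's actual code, only a minimal sketch of the comparison it has to make, with a hypothetical function name):

create or replace function tilde_satisfies (
  p_version in varchar2,   -- e.g. '1.2.7'
  p_range   in varchar2    -- e.g. '~1.2.3'
) return varchar2
is
  l_base  varchar2(100) := ltrim(p_range, '~');
  l_v_maj pls_integer   := to_number(regexp_substr(p_version, '\d+', 1, 1));
  l_v_min pls_integer   := to_number(regexp_substr(p_version, '\d+', 1, 2));
  l_v_pat pls_integer   := to_number(regexp_substr(p_version, '\d+', 1, 3));
  l_r_maj pls_integer   := to_number(regexp_substr(l_base,    '\d+', 1, 1));
  l_r_min pls_integer   := to_number(regexp_substr(l_base,    '\d+', 1, 2));
  l_r_pat pls_integer   := to_number(regexp_substr(l_base,    '\d+', 1, 3));
begin
  -- ~1.2.3 allows >= 1.2.3 and < 1.3.0: same major and minor, patch at or above the minimum.
  if l_v_maj = l_r_maj and l_v_min = l_r_min and l_v_pat >= l_r_pat then
    return 'TRUE';
  end if;
  return 'FALSE';
end tilde_satisfies;
/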

My agenda

It’s getting to that time of year, so here are some of the sessions I’ve pencilled in for UKOUG Tech 15:

Sunday 6th:

Probably the whole of the Development stream.

Monday 7th:

9:00 – 9:50: Tom Dale – Fivium : An insight into Oracle Standard Edition, what’s included, what’s not, and how to plug some of those holes!

11:20 – 12:10: Me (et al.) : Cost Based Optimizer – Panel Session

12:20 – 13:10: Franck Pachot – DBi Services : All About Table Locks: DML, DDL, Foreign Key, Online Operations,…

14:10 – 15:00: Maria Colgan – Oracle : Oracle Database In-Memory By Example

15:10 – 16:00: Philip Brown – Red Stack Tech : How to Evaluate the 12c In-Memory Option

16:30 – 17:20: Tony Hasler – Anvil Computer Services : Optimizer Roundtable

17:30 – 18:20: Joel Goodman – Oracle : Oracle Standard Edition Roundtable

 Tuesday 8th

14:10 – 15:00: Luke Davies – Pythian : It’s Always the Network. Exonerating the Database

15:30 – 16:20: Me : Just Don’t Do It

16:30 – 17:20: Nelson Calero – Pythian : Evolution of Performance Management: Oracle 12c Adaptive Optimization

Wednesday 9th

10:00 – 10:50: Carl Dudley – University of Wolverhampton : Oracle 12c New Features – The Lesser Spotted Variety

A Different Type of Keynote & Jonathan Lewis Panel Session at UKOUG Tech15

Technical people tend not to enjoy Keynotes at conferences. We are allergic to content-light “there has never been a better time to invest in our products” fluffy, frou-frou, big picture talks. We want how-it-works meat on the bones of what is served up to us.

OK, it’s a very poor photo but the best I have of Dom presenting. Sorry Dom.

Well, at the UKOUG Tech15 conference this year (Birmingham ICC 7th-9th December) we have a treat for you – The Database stream keynote is technical AND you get to ask whatever questions you want – questions about Oracle RDBMS technology that is.

Dominic Giles, Maria Colgan and Penny Avril have agreed to be up on stage and, after spending the first half telling us about some of the things introduced at Oracle OpenWorld 2015, they will take questions. Questions they do not know are coming. Real questions. From people at the conference. Your questions. These are not questions that have been placed by them or checked with them beforehand.

Dom does this at smaller user groups; he stands up and asks for any questions from the audience and he just tells it the way he sees it. His incredible knowledge of the product is matched only by his humour (so no huge expectation for you to live up to there, Dom!). Maria and Penny are similarly endowed with knowledge and great presentation skills and are willing to give this a go for us. Brave people.

{I think in the photo Dominic has just been asked about why something in Oracle does not work – and he’s trying to decide whether or not to kill the person who asked…}

You will be able to ask questions on the day, at the session, but you will also be able to post questions at the UKOUG information desk on Monday, and I am happy for you to send me any questions you have (or leave a comment on this blog – I don’t think Maria, Dom or Penny drop by here very often so they won’t see them… :-) ). Of course, there is no point asking a question if you do not intend to be at the conference and at that session!

Having run similar sessions to this at smaller events, I know that you need some questions to get the ball rolling and then, with a little luck, the audience warms up and asks questions. The key thing is, no matter the source, the panel do not know the questions beforehand. I’ve seen sessions like that, with placed questions, and it just comes over as fake.

Why did I mention Jonathan Lewis? Well, on Monday at 11:20 he is doing another panel session taking questions, with Nigel Bayliss, Christian Antognini and Maria Colgan (again – we work them hard). This session is focused on the Cost Based Optimizer. We already have enough initial questions, but if you are curious about the optimizer and performance, maybe ask your own question from the floor; it’s a must-see session. Jonathan talks about this session in this blog post.

So at UKOUG Tech15 you have two panel sessions in the database stream where you can ask questions. We also have several “Roundtable” sessions across the whole agenda which are perfect for asking questions too. If you have never been to one, a Roundtable session is more of a discussion in a smaller group, with one or two experts “officially” there as well as, usually, some unofficial experts in the crowd. Panel sessions are “pose your question, get expert answers”; roundtables are more interactive, more like a conversation in the bar. They can get quite lively (but fights are rare) :-).

All in all, we are aiming for a good dose of interaction between presenters and delegates. And never forget, most of us presenters are more than happy to chat and answer questions throughout the conference. Just don’t ask hard questions if you meet us in the evenings, when we are half-drunk…

DOAG annual conference 2015 – great as always!

Another annual DOAG conference has passed, and I can only say the very best about it: perfectly organized, a large and modern location, an impressive list of well-known speakers and over 2100 attendees – wow!


My presentation Best of RMAN was scheduled at the first slot on the first day, so I was a bit concerned whether many people would attend that early. It turned out that the room got so full that I was asked by the organizers to deliver the same talk again next day – which I happily did, again with a packed room :-)


Apart from speaking myself, I very much enjoyed seeing friends and colleagues again, as well as people I had known from social media but never met in person before. Thank you all for your good company, guys – I may appear a bit quiet and distanced sometimes, but I really appreciate you and our little conversations. Personal highlight: the lady who approached me at the stairs, shaking hands and telling me that she knew me from an Oracle University class that she liked so much :-)

There were many interesting presentations, just to mention some of those I attended myself:

Frits Hoogland with Oracle Exadata and database memory


In his very detailed talk, he emphasized the importance of using Huge Pages on Exadata and that MEMORY_TARGET aka Automatic Memory Management is not an option here.

Jonathan Lewis presented Five Hints for Optimising SQL


I’m always amazed how much this man knows and how well he is able to explain it to us mere mortals :-)

Lothar Flatz was presenting Anatomie eines SQL Befehls (how to do Oracle Performance Tuning with a scientific approach)


During his very entertaining talk, he quoted no less than seven Oakies (including himself), so that was quite good advertising for the OakTable Network :-)

Frank Schneede delivered Eine Datenbank kommt selten allein (DB Cloning on Exadata using sparse diskgroups)


while Ulrike Schwinn presented Oracle DB Memory Techniken für mehr Performance (Overview about the meanwhile so many different ways to deal with memory in Oracle)


Couldn’t really catch her because she was running out of the picture all the time :-)

Martin Berger also did an interesting talk: Wie misst Oracle die Verwendung von Database Features? (How to make sense of DBA_FEATURE_USAGE STATISTICS)


I liked his presentation style with many live demonstrations very much!

My dear colleague Joel Goodman talked about Automatic Parallel Execution


Joel is what I call a ‘Standup Instructor’ – mention any database-related topic and he will be able to deliver an excellent 30-minute talk about it instantly :-)

A big  THANK YOU to the many people from DOAG who helped to make that event take place again in such an impressive way! Hope to be there again next year.

Tagged: #DOAG2015