Oakies Blog Aggregator

Returning to the Day Job.

Having the Summer off. It’s something that quite a few IT contractors and some consultants say they intend to do…one year. It’s incredibly appealing, of course, to take time off from your usual work to do some other things. Asking around, though, it is not something many of us self-employed types who theoretically could do, actually have done. I think it is because, although the theory is nice, the reality is a period of not earning a living – and the background worry of “if I take a break, will I be able to step straight back into gainful employment afterwards?”

Well, purely as a service to all those who wonder about it, I decided to sacrifice myself to the experiment and do it this year. I had the summer off. {I know, I picked what turned out to be a very average summer weather-wise, but you just never know in the UK}. I’d finished a job for a client that had turned from a small number of weeks into several months of full-on seven-days-a-week effort, I had a load of domestic things that desperately needed some attention, and my potential next job evaporated. I also have to acknowledge that I am in a somewhat lucky position. We do not have kids, so the financial worries and potential impact on the innocent do not exist, and my wife has been remarkably tolerant. How did it go? In a word, “great!”. I am a lot healthier and a lot more fired up to get on with work as a result of a few months away from the coalface – but I’ll save the second half of this post for some of the boring details of what I got up to {which you may well want to skip}.

This week the break ends. As I write this I do not have a job – but today I start looking around and I start locking myself in my office for the day. I’ll be doing work, even if this consists of me simply catching up on technical reading and testing some ideas I have about Oracle internals and design. This should hopefully increase the technical content of my Blogs too. If anyone is working on a project that could do with some Oracle performance or system design work, I’d be happy to hear from you.

As you can see, the one thing I am terrible at as a self-employed person is the selling of my services. I’m sure I will not get much response from a simple “give me a job” line in a blog that, let’s face it, is being read mostly by technical-doing-people and not budget-holding-hiring-people. However, you might have noticed a green text box on the right of this page proclaiming “available for consultancy”. It’s a free thing for me to do and I intend to keep it there, swapping it for a red “unavailable” when I am busy. This might of course reveal how poorly I do in finding work – but that could be interesting of itself.

So, what have I done? I’ve kept my hand in on the Oracle front but what about non-IT things? Well, one thing was building a clay pizza oven. That took a day on a course and about 200 hours digging a hole in the garden for clay, cleaning up some old bricks, mixing up endless quantities of cob with the clay, straw and sand, building it wrong several times and destroying a perfectly good pizza in it.

The final product works though. We fired it up this last weekend and it got up to 300C and the pizzas we cooked in it were very good – allowing for the odd bit of ash and ember. Hint: do not use wood that spits, like pine and off-cuts from building work. If you are wondering about the big bits of wood in the entrance, those are some green oak offcuts which I popped in there after the last firing. The morning after using the oven, you can put in wood for the next firing and the residual heat helps dry it nicely.

A lot has been done around the garden, especially as I now own a chainsaw. Again, I got myself on a course to learn how to maintain the machine and keep the chain sharp. I had to modify some protective clothing to suit my diminutive frame and then about two dozen trees came down and had to be chopped up. {Monty Python’s lumberjack sketch obviously had a big impact on my development as a child}. I then took the chainsaw for an excursion to Wales and gave my brother’s garden the same treatment. I wonder if anyone local wants a load of trees cut down? :-)

A major task has been to organise and then oversee the start of some building work on the house. As anyone who has had such work done knows, it takes a lot of time to organise the work and then once it starts there are a lot of initial issues to sort out. Much like designing and building a new computer system, upfront design and project initiation can make the whole build run so much more smoothly. At least, I am hoping so!

The building work resulted in an odd little bit of computer work. One of the guys was complaining that his machine had started running slowly and now it would start up only to shut down immediately. One of the other builders said “Well ask the governor, he does computers”. Of course, I utterly failed to explain that I do other types of computers and so this PC duly arrived. It was nice to do some IT that was not my normal area. It turns out this machine had a couple of nasty viruses which I had to sort out first, and then protect the machine with some decent AV software. This was made difficult by the “Rapport” security software his bank had encouraged him to download, which was somehow blocking a Windows update from working, as well as grinding the machine to a halt. The performance impact of the software was causing the Windows update to take 10 minutes; it would fail, take 20 minutes to roll back and then force a reboot… and cycle through the update again. I temporarily fixed this by booting off a Windows CD and I could then remove the dodgy security software, despite its attempts to stay in place. But the machine was still terribly slow. Dixons had sold him the machine and seen fit to put in only 512MB of memory, for a Windows Vista machine. £10 got us an extra 1GB of memory and a machine that now worked. Payment for all of this effort was a couple of nice bottles of wine and a very happy builder.

The building work is still in progress and, of course, part of why my Summer off has come to an end is that the builders have used up the spare money I had from the last job. *sigh*. Mind you, the Summer off was supposed to end in September but as the weather suddenly improved I got permission for a time extension from the lone worker in the house. Has anyone noticed this blog posting is slowly becoming nothing but an excuse for me to stick up some pictures?

There have been a lot of far more mundane things to do on the domestic front but there has also been some IT stuff I have been up to – but I’ll save that for another day and a more Oracle-centric posting, I think. Right, I’d better try to make the CV look more like I’m the best Oracle guy on the planet {OK, I’d have to change the name at the top}. It leans towards understated honesty rather than overstretched impressiveness, and I know that is what potential employers like to see (I think it is better for the interview to indicate more skills than the CV, rather than the other way around), but agents seem to want you to appear better than some bizarre child of Tom Kyte and Christian Antognini before they will put you in a pile other than “yet another bog-standard IT grunt”. If the agent won’t put you forward, you can’t make your case in the interview, can you?

AIOUG Webcast: Methodical Performance Tuning

A big thank you to all those who attended my session today. I sincerely hope you got something out of it. Here are the scripts I used in the demo. And, here is the slide deck, if you are interested.

Remember, this was just the beginner's session. We will have intermediate and advanced ones in the near future. Stay tuned through the AIOUG site.

Counting Triangles Faster

A few weeks back one of the Vertica developers put up a blog post on counting triangles in an undirected graph with reciprocal edges. The author was comparing the size of the data and the elapsed times to run this calculation on Hadoop and Vertica and put up the work on github and encouraged others: “do try this at home.” So I did.

Compression

Vertica draws attention to the fact that their compression brought the 86,220,856 tuples down to 560MB, from a flat file size of 1,263,234,543 bytes, resulting in around a 2.25X compression ratio. My first task was to load the data and see how Oracle’s Hybrid Columnar Compression would compare. Below is a graph of the sizes.

As you can see, Oracle’s default HCC query compression (query high) compresses the data over 2X more than Vertica and even HCC query low compression beats out Vertica’s compression number.
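
If you want to reproduce the size comparison, the compressed copies can be created with something as simple as the following (illustrative table names; the actual load scripts are linked at the end of this post):

create table edges_query_high compress for query high as select * from edges;
create table edges_query_low  compress for query low  as select * from edges;

-- compare the resulting segment sizes
select segment_name, round(bytes/1024/1024) as mb
from user_segments
where segment_name in ('EDGES', 'EDGES_QUERY_HIGH', 'EDGES_QUERY_LOW');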

Query Elapsed Times

The closest gear I had to Vertica’s hardware was an Exadata X2-2 system — both use 2 socket, 12 core Westmere-EP nodes. While one may try to argue that Exadata may somehow influence the execution times, I’ll point out that I was using In-Memory Parallel Execution so no table data was even read from spinning disk or Exadata Flash Cache — it’s all memory resident in the database nodes’ buffer cache. This seems to be in line with how Vertica executed their tests, though it is not explicitly stated (it’s a reasonable assertion).
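
For anyone trying to reproduce this: in-memory parallel execution in 11.2 is tied to the automatic degree of parallelism feature, so a setting along these lines needs to be in effect (I won't go into the full parameter setup here):

alter system set parallel_degree_policy = auto;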

After I loaded the data and gathered table stats, I fired off the exact same SQL query that Vertica used to count triangles to see how Oracle would compare. I ran the query on 1, 2 and 4 nodes just like Vertica. Below is a graph of the results.

As you can see, the elapsed times are reasonably close but overall in the favor of Oracle winning 2 of the 3 scale points as well as having a lower sum of the three executions: Vertica 519 seconds, Oracle 487 seconds — advantage Oracle of 32 seconds.

It Should Go Faster!

As a database performance engineer I was thinking to myself, “it really should go faster!” I took a few minutes to look over things to see what could make this perform better. You might think I was looking at parameters or something like that, but you would be wrong. After a few minutes of looking at the query and the execution plan it became obvious to me — it could go faster! I made a rather subtle change to the SQL query and reran my experiments. With the modified SQL query Oracle was now executing twice as fast on 1 node than Vertica was on 4 nodes. Also, on 4 nodes, the elapsed time came in at just 14 seconds, compared to the 97 seconds Vertica reported — a difference of almost 7X! Below are the combined results.

What’s The Go Fast Trick?

I was thinking a bit more about the problem at hand — we need to count each triangle exactly once, not multiple times, given that the edges are reciprocal. Since every edge exists in both directions, the query can be structured as Vertica wrote it, doing the filtering with a join predicate like e1.source < e2.source to eliminate the duplicates, or we can simply use a single-table filter predicate like source < dest before the join takes place. One of the first things they taught me in query planning and optimization class was to filter early! That notion pays off big here because the early filter cuts the rows going into the first join, as well as the output of the first join, by a factor of 2 — 1.8 billion rows output vs. 3.6 billion. That’s a huge savings not only in the first join, but also in the second join as well.
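
For contrast, here is the shape of the join-predicate version (reconstructed from the description above rather than copied verbatim from the Vertica post):

select count(*)
from edges e1
join edges e2 on (e1.dest = e2.source and e1.source < e2.source)
join edges e3 on (e2.dest = e3.source and e2.source < e3.source)
where e3.dest = e1.source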

Here is what my revised query looks like:

with
  e1 as (select * from edges where source < dest),
  e2 as (select * from edges where source < dest),
  e3 as (select * from edges where source > dest)
select count(*)
from e1
join e2 on (e1.dest = e2.source)
join e3 on (e2.dest = e3.source)
where e3.dest = e1.source

Summary

First, I’d like to thank the Vertica team for throwing the challenge out there and being kind enough to provide the data, code and their elapsed times. I always enjoy a challenge — especially one that I can improve upon. Now, I’m not going to throw any product marketing nonsense out there as that is certainly not my style (and there certainly is more than enough of that already), but rather I’ll just let the numbers do the talking. I’d also like to point out that this experiment was done without any structure other than the table. And in full disclosure, all of my SQL commands are available as well.

The other comment that I would make is that the new and improved execution times really make a mockery of the exercise when comparing to Hadoop MapReduce or Pig, but I would also mention that this test case is extremely favorable for parallel pipelined databases that can perform all in-memory operations and, given the data set is so small, this is obviously the case. Overall, in my opinion, it was a poor problem choice for comparing the three technologies as it (over) highlights the “right tool for the job” cliché.

Experiments performed on Oracle Database 11.2.0.2.

Github source code: https://gist.github.com/grahn/1289188

Volatile Data, Dynamic Sampling And Shared Cursors

For the next couple of weeks I'll be picking up various random notes I've made during the sessions that I've attended at OOW. This particular topic was also a problem discussed recently at one of my clients, so it's certainly worth publishing here.

In one of the optimizer related sessions it was mentioned that for highly volatile data - for example often found in Global Temporary Tables (GTT) - it's recommended to use Dynamic Sampling rather than attempting to gather statistics. In particular for GTTs gathering statistics is problematic because the statistics are used globally and shared across all sessions. But GTTs could have a completely different data volume and distribution per session so sharing the statistics doesn't make sense in such scenarios.

So using Dynamic Sampling sounds like reasonable advice, and it probably is in many such cases.

However, there is still a potential problem even when resorting to Dynamic Sampling. If the cursors based on Dynamic Sampling get shared between sessions then they won't be re-optimized, even if the GTT in one session is completely different from the one in the session that created the shared cursor previously.

So you can still end up with shared cursors and execution plans that are inappropriate to share across the different sessions. Using Dynamic Sampling doesn't address this issue. It only helps if the cursors do not get shared in the first place, for example if they use literals and these literals differ, so that different cursors will be generated based on the text matching.

Here is a simple test case that demonstrates the problem:

drop view v_gtt_dyn;

drop table gtt_dyn;

-- Create a Global Temporary Table with an index on it
create global temporary table gtt_dyn (
id number not null
, vc1 varchar2(100)
, filler varchar2(255)
)
on commit preserve rows
;

create index gtt_dyn_idx on gtt_dyn (id);

-- Create a simple view - it will become obvious later
-- why this has been used
create or replace view v_gtt_dyn as select * from gtt_dyn;

-- Run in Session 1
set echo on timing on

-- Unique value in ID column
insert into gtt_dyn
select
rownum as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

alter session set optimizer_dynamic_sampling = 2;

select * from v_gtt_dyn where id = 10 and rownum > 1;

set linesize 200 pagesize 0 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Run in Session 2
set echo on timing on

-- Single value in ID column
insert into gtt_dyn
select
10 as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

alter session set optimizer_dynamic_sampling = 2;

select * from v_gtt_dyn where id = 10 and rownum > 1;

set linesize 200 pagesize 0 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

Now this is probably an extreme case of data distribution differences, but if you run it you'll see it makes the point obvious: in the second session the data distribution of the GTT is completely different, and although there are no statistics gathered on the GTT and hence Dynamic Sampling gets used to arrive at an execution plan, the plan gets shared in the second session (there is only a child number 0). But the plan is completely inappropriate for the data distribution of the GTT in that session - you just need to look at the E-Rows and A-Rows columns of the runtime profile:

SQL> select * from v_gtt_dyn where id = 10 and rownum > 1;

no rows selected

Elapsed: 00:00:00.07
SQL>
SQL> set linesize 200 pagesize 0 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID bjax3mksw1uza, child number 0
-------------------------------------
select * from v_gtt_dyn where id = 10 and rownum > 1

Plan hash value: 471827990

-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.08 | 1117 |
| 1 | COUNT | | 1 | | 0 |00:00:00.08 | 1117 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.08 | 1117 |
| 3 | TABLE ACCESS BY INDEX ROWID| GTT_DYN | 1 | 1 | 10000 |00:00:00.06 | 1117 |
|* 4 | INDEX RANGE SCAN | GTT_DYN_IDX | 1 | 1 | 10000 |00:00:00.02 | 63 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(ROWNUM>1)
4 - access("ID"=10)

Note
-----
- dynamic sampling used for this statement (level=2)

Imagine a more complex plan with joins and a larger data volume and this is a recipe for disaster.

If this problem cannot be addressed from the application side by helping the database to generate different cursors for the different data distributions (for example by simply adding different predicates that don't change the result, like 1 = 1, 2 = 2 etc.), then you might be able to handle the issue by using Virtual Private Database (VPD, aka. Row Level Security / RLS, Fine Grained Access Control / FGAC). I've already demonstrated the general approach in the past here, but in this case a slightly more sophisticated approach could make sense.
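
Just to illustrate the application-side variant: the added predicates below don't change the result, they merely produce different SQL texts and therefore different, unshared cursors.

-- session with many distinct IDs
select * from v_gtt_dyn where id = 10 and rownum > 1 and 1 = 1;

-- session with a single repeated ID
select * from v_gtt_dyn where id = 10 and rownum > 1 and 2 = 2;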

By adding the following code and RLS policy I can drive Oracle to perform a re-optimization only in those cases where it is appropriate. This limits the damage that the general approach does to the Shared Pool by generating potentially numerous child cursors unconditionally.

create or replace package pkg_rls_force_hard_parse is
function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2;
end pkg_rls_force_hard_parse;
/

create or replace package body pkg_rls_force_hard_parse is
-- Cache the count in session state
g_cnt number;

function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2
is
s_predicate varchar2(100);
begin
-- Only execute query once in session
-- Change if re-evaluation is desired
if g_cnt is null then
select
count(*)
into
g_cnt
from
gtt_dyn
where
id = 10
and rownum <= 10;
end if;

-- We end up with exactly two child cursors
-- with the desired different plans
-- These child cursors will be shared accordingly
if g_cnt > 1 then
s_predicate := '1 = 1';
else
s_predicate := '2 = 2';
end if;

return s_predicate;
end force_hard_parse;
end pkg_rls_force_hard_parse;
/

-- CONTEXT_SENSITIVE avoids re-evaluation of policy function at execution time
-- Note however that it doesn't avoid re-evaluation at parse time
exec DBMS_RLS.ADD_POLICY (USER, 'v_gtt_dyn', 'hard_parse_policy', USER, 'pkg_rls_force_hard_parse.force_hard_parse', 'select', policy_type => DBMS_RLS.CONTEXT_SENSITIVE);

Now if you repeat the above exercise - ideally with SQL trace enabled to see the additional activity caused by the RLS policy - you'll notice that the different sessions will end up with different child cursors and execution plans based on the check made.

Now the reason why the view is in place might become obvious: an RLS policy on the base table would have led to infinite recursive execution of the RLS policy function due to the query performed within the function. There are other obvious options for dealing with that; for example, storing the RLS policy function in a separate schema with the EXEMPT ACCESS POLICY privilege should also work.

This is the result in the second session now:

SQL> select * from v_gtt_dyn where id = 10 and rownum > 1;

no rows selected

Elapsed: 00:00:00.12
SQL>
SQL> set linesize 200 pagesize 0 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID bjax3mksw1uza, child number 1
-------------------------------------
select * from v_gtt_dyn where id = 10 and rownum > 1

Plan hash value: 424976618

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.04 | 1003 |
| 1 | COUNT | | 1 | | 0 |00:00:00.04 | 1003 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.04 | 1003 |
|* 3 | TABLE ACCESS FULL| GTT_DYN | 1 | 9288 | 10000 |00:00:00.03 | 1003 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(ROWNUM>1)
3 - filter("ID"=10)

Note
-----
- dynamic sampling used for this statement (level=2)

Notice how a second child cursor has been generated and that the cardinality estimate is now much closer to the reality.
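
If you want to double-check, a simple query against V$SQL should show both child cursors for the statement (SQL_ID taken from the output above):

select sql_id, child_number, plan_hash_value, executions
from v$sql
where sql_id = 'bjax3mksw1uza';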

Adaptive Cursor Sharing / Cardinality Feedback

I was curious to see if recent features like Adaptive Cursor Sharing or Cardinality Feedback would be able to solve the issue when using the 11g releases.

Cardinality Feedback (introduced in 11.2) unfortunately doesn't get used in the scenario described here, because Dynamic Sampling disables Cardinality Feedback in the current implementation.

Note that the usage of bind variables also disables Cardinality Feedback for those parts of the plan affected by the bind variables - as described in the Optimizer blog post that can be found here.

So maybe Adaptive Cursor Sharing (ACS, introduced in 11.1) can come to the rescue if bind variables get used.

Of course the usage of bind variables increases the probability of cursor sharing in the above scenario. As already outlined in a previous note, ACS is a "reactive" and "non-persistent" feature, so it will only be able to correct things that have already gone wrong at least once. Furthermore, if the ACS information gets aged out of the Shared Pool, the mistakes will have to be repeated before they get recognized by ACS again.

However it is interesting to note that I wasn't able to get ACS working in a slightly modified scenario like this (without the RLS policy in place of course):

-- Session 1
set echo on timing on

set linesize 200 pagesize 0 trimspool on tab off

insert into gtt_dyn
select
rownum as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

variable n1 number

exec :n1 := 10

select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Session 2
set echo on timing on

set linesize 200 pagesize 0 trimspool on tab off

insert into gtt_dyn
select
10 as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

variable n1 number

exec :n1 := 10

select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Second execution to allow ACS potentially kicking in
select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

There are some interesting points to notice when running this example:

1. A cursor that uses non-equality operators like the above less-than-or-equal together with bind variables usually gets marked as "bind sensitive" and will be monitored by ACS. But in the above case the cursor was not marked as bind sensitive and hence ACS didn't even bother to monitor it

2. Consequently the two sessions share the single child cursor and the problem is not addressed by ACS even in subsequent executions

3. It looks like the usage of Dynamic Sampling again disables ACS

Looking at the way ACS manages the cursor sharing criteria (check V$SQL_CS_SELECTIVITY for example) I see the problem that ACS probably wouldn't support the fact that the same value for the bind variable results in a completely different selectivity range.
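
A quick way to see whether ACS considers a cursor at all is to look at the corresponding flags in V$SQL (replace the SQL_ID accordingly):

select child_number, is_bind_sensitive, is_bind_aware, is_shareable
from v$sql
where sql_id = '&sql_id';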

Maybe this is an explanation of why ACS is not activated for cursors that use Dynamic Sampling - ACS may only be able to cope with different bind value ranges that lead to different selectivities.

So even when using bind variables on 11g with ACS, it looks like only the RLS policy approach allows this issue to be addressed from the database side alone. Ideally the application should be "data-aware" in such cases and help the database to arrive at reasonable execution plans by actively unsharing the cursors.

Move the EM12c repository database

I have made a little mistake creating a RAC database for the OEM 12c repository - I now need a more lightweight solution, especially since I’m going to do some fancy failover testing with this cluster soon! An 11.2.0.3 single instance database without ASM, that’s what I’ll have!

Now how to move the repository database? I have to admit I haven’t done this before, so the plan I came up with is:

  1. Shut down the OMS
  2. Create a backup of the database
  3. Transfer the backup to the destination host
  4. Restore database
  5. Update OEM configuration
  6. Start OMS

Sounds simple enough, and it actually was! To add a little fun to it I decided to use an NFS volume to back up to. My new database host is called oem12db, and it’s running Oracle 11.2.0.3 64bit on Oracle Linux 6.1 with UEK. I created the NFS export using the following entry in /etc/exports:

/u01/oradata/oraback   192.168.99.0/255.255.255.0(rw,sync)

Simples! After an “exportfs -a” as root the directory was exported. The mount options on the client are a little stricter; I combined a number of MOS notes here to end up with these mount options:

  • rw,
  • bg,
  • hard,
  • nointr,
  • rsize=65536,
  • wsize=65536,
  • tcp,
  • vers=3,
  • timeo=600,
  • actimeo=0

Previous documentation suggested a {r,w}size of only 32k, but one MOS document I found for 11.2.0.2 on Linux x86-64 recommends 64k. The command to mount the file system then turns into this lengthy string:

[root@rac11203node1 ~]# mount -t nfs oem12db:/u01/oradata/oraback /u01/oradata/oraback \
> -o rw,bg,hard,nointr,rsize=65536,wsize=65536,tcp,vers=3,timeo=600,actimeo=0

Note that I mount the directory at the same place in the tree as it’s exported. That way I am saving myself the trouble of registering the backup on the destination host. Now take the backup. This is my lab: my OEM repository is in NOARCHIVELOG mode (something that you should NOT do in real life!) so I have to start the database in mount mode before backing it up:

[oracle@rac11203node1 ~]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Mon Oct 17 14:54:33 2011

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: EMREP (DBID=3987577832, not open)

RMAN> run {
allocate channel c1 device type disk format '/u01/oradata/oraback/%U';
backup as compressed backupset database;
backup current controlfile;
}
2> 3> 4> 5>
using target database control file instead of recovery catalog
allocated channel: c1
channel c1: SID=248 instance=emrep1 device type=DISK

Starting backup at 17-OCT-11
channel c1: starting compressed full datafile backup set
channel c1: specifying datafile(s) in backup set
input datafile file number=00007 name=+DATA/emrep/datafile/mgmt_tablespace.269.764589287
input datafile file number=00001 name=+DATA/emrep/datafile/system.256.764546085
input datafile file number=00002 name=+DATA/emrep/datafile/sysaux.257.764546087
input datafile file number=00005 name=+DATA/emrep/datafile/undotbs2.264.764546211
input datafile file number=00008 name=+DATA/emrep/datafile/mgmt_ad4j_ts.270.764589289
input datafile file number=00003 name=+DATA/emrep/datafile/undotbs1.258.764546089
input datafile file number=00006 name=+DATA/emrep/datafile/mgmt_ecm_depot_ts.268.764589285
input datafile file number=00004 name=+DATA/emrep/datafile/users.259.764546089
channel c1: starting piece 1 at 17-OCT-11
channel c1: finished piece 1 at 17-OCT-11
piece handle=/u01/oradata/oraback/02mpb7hn_1_1 tag=TAG20111017T145446 comment=NONE
channel c1: backup set complete, elapsed time: 00:01:35
Finished backup at 17-OCT-11

Starting backup at 17-OCT-11
channel c1: starting full datafile backup set
channel c1: specifying datafile(s) in backup set
including current control file in backup set
channel c1: starting piece 1 at 17-OCT-11
channel c1: finished piece 1 at 17-OCT-11
piece handle=/u01/oradata/oraback/03mpb7km_1_1 tag=TAG20111017T145622 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:03
Finished backup at 17-OCT-11

Starting Control File and SPFILE Autobackup at 17-OCT-11
piece handle=/u01/oradata/oraback/c-3987577832-20111017-00 comment=NONE
Finished Control File and SPFILE Autobackup at 17-OCT-11
released channel: c1

RMAN> exit

That takes care of almost everything. Since I’m lazy I also used SQL*Plus to create a backup of the controlfile as well as a textual representation of the spfile:

SQL> alter database backup controlfile to '/u01/oradata/oraback/controlfile';

Database altered.

SQL> create pfile='/u01/oradata/oraback/pfile' from spfile;

File created.

Now after all that NFS setup and backing up across the network I can simply restore the database without having to use scp or similar to transfer it. The initial work is to create a pfile and password file on oem12db. I’ll spare you the boring details; the important facts to remember are listed below, and a minimal sketch of the resulting pfile follows the list:

  • Remove references to emrep1 and emrep2 - these are RAC-specific!
  • Where applicable replace emrep1 with emrep, i.e. for undo_tablespace=UNDOTBS1
  • You don’t need to map online redo log threads to instances-it’s done for you automatically
  • You don’t need to reference the second undo tablespace
  • Change db_create_file_dest to a file system location (I’m not using ASM on the destination host!)
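
Purely as an illustration, the resulting single-instance pfile ends up looking something like this (values shown are indicative only, not a copy of the actual file):

*.db_name='emrep'
*.compatible='11.2.0.3.0'
*.db_create_file_dest='/u01/oradata'
*.undo_tablespace='UNDOTBS1'
# removed: *.cluster_database, all emrep1./emrep2. instance-specific entries
# (instance_number, thread, undo_tablespace) and any reference to UNDOTBS2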

With that done, and all required directories created, it’s time to start the new database instance in “nomount mode” and fire up an RMAN session. I exported my ORACLE_SID to emrep to denote it’s a single instance database.
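
The startup itself is nothing special - roughly the following, assuming the pfile ends up in $ORACLE_HOME/dbs as initemrep.ora:

[oracle@oem12db ~]$ export ORACLE_SID=emrep
[oracle@oem12db ~]$ sqlplus / as sysdba

SQL> startup nomount
SQL> exit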

[oracle@oem12db dbs]$ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Mon Oct 17 15:02:09 2011

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: EMREP (not mounted)

RMAN> restore controlfile from '/u01/oradata/oraback/controlfile';

Starting restore at 17-OCT-11
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=246 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=/u01/oradata/EMREP/controlfile/o1_mf_79rfcgh8_.ctl
Finished restore at 17-OCT-11

RMAN> alter database mount
2> ;

database mounted
released channel: ORA_DISK_1

RMAN> restore database
2> ;

Starting restore at 17-OCT-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=246 device type=DISK

channel ORA_DISK_1: starting datafile backup set restore
channel ORA_DISK_1: specifying datafile(s) to restore from backup set
channel ORA_DISK_1: restoring datafile 00001 to +DATA/emrep/datafile/system.256.764546085
channel ORA_DISK_1: restoring datafile 00002 to +DATA/emrep/datafile/sysaux.257.764546087
channel ORA_DISK_1: restoring datafile 00003 to +DATA/emrep/datafile/undotbs1.258.764546089
channel ORA_DISK_1: restoring datafile 00004 to +DATA/emrep/datafile/users.259.764546089
channel ORA_DISK_1: restoring datafile 00005 to +DATA/emrep/datafile/undotbs2.264.764546211
channel ORA_DISK_1: restoring datafile 00006 to +DATA/emrep/datafile/mgmt_ecm_depot_ts.268.764589285
channel ORA_DISK_1: restoring datafile 00007 to +DATA/emrep/datafile/mgmt_tablespace.269.764589287
channel ORA_DISK_1: restoring datafile 00008 to +DATA/emrep/datafile/mgmt_ad4j_ts.270.764589289
channel ORA_DISK_1: reading from backup piece /u01/oradata/oraback/02mpb7hn_1_1
channel ORA_DISK_1: piece handle=/u01/oradata/oraback/02mpb7hn_1_1 tag=TAG20111017T145446
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:01:25
Finished restore at 17-OCT-11

RMAN> alter database open resetlogs;

database opened

RMAN> exit

This is pretty straightforward and should be familiar to every database administrator:

  1. Restore the controlfile from the backup created earlier
  2. Mount the database
  3. Restore the database
  4. Open the database resetlogs

Did you notice the messages stating “restoring datafile xxx to +DATA”? Oracle is lying to you here: it’s actually restoring to the db_create_file_dest location. This is the beauty of OMF: you tell it in generic terms where you want the files to go, and don’t bother about the details. Nowhere did I specify db_file_name_convert or log_file_name_convert!

Now it’s time to update the OMS configuration. This operation is vi-less: the configuration is viewed and modified via emctl (since Grid Control 11.1):

[oracle@oem12oms ~]$ emctl config oms -list_repos_details
Oracle Enterprise Manager Cloud Control 12c Release 12.1.0.1.0
Copyright (c) 1996, 2011 Oracle Corporation.  All rights reserved.
Repository Connect Descriptor : (DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=rac11203scan)(PORT=1521)))(LOAD_BALANCE=ON)(CONNECT_DATA=(SERVICE_NAME=emrep)))
Repository User : SYSMAN

The next command updates the configuration. Keep the warning in mind: if you have multiple management servers you need to run this command on each.

[oracle@oem12oms ~]$ emctl config oms -store_repos_details -repos_port 1521 -repos_sid emrep -repos_host oem12db.localdomain -repos_user SYSMAN -repos_pwd secret
Oracle Enterprise Manager Cloud Control 12c Release 12.1.0.1.0
Copyright (c) 1996, 2011 Oracle Corporation.  All rights reserved.
Successfully updated datasources and stored repository details in Credential Store.
If there are multiple OMSs in this environment, run this store_repos_details command on all of them.
And finally, restart all the OMSs using 'emctl stop oms -all' and 'emctl start oms'.

[oracle@oem12oms ~]$ emctl start oms
Oracle Enterprise Manager Cloud Control 12c Release 12.1.0.1.0
Copyright (c) 1996, 2011 Oracle Corporation.  All rights reserved.
Starting WebTier...
WebTier Successfully Started
Starting Oracle Management Server...
Oracle Management Server Successfully Started
Oracle Management Server is Up
[oracle@oem12oms ~]$

Job done - the OMS has come up and I can free my RAC nodes for other work.

Real Steel…

If you remove the humans from Real Steel, you pretty much have Rocky.

As far as the humans are concerned, Hugh Jackman is ok. The kid who plays his son is a little annoying, but to be fair, so are most of the kids in films. There are quite a few cheesy moments, but they are spread out so they aren’t like fingernails down a chalkboard.

I think the biggest problem with the film is the robots have no personalities. It’s just a giant and very expensive version of Rock’em Sock’em Robots. It’s hard to engage with a chunk of metal when it has no outward signs of personality. They are nothing like Transformers, which are totally real. :)

Having said that, it’s an OK bit of mindless fun. I tried to listen to other people talking on the way out to gauge the general reaction. It seemed to vary from “Awesome!” to “What a complete pile of xxxx!”. I guess I stand somewhere in the middle.

Cheers

Tim…




OOW2011 A.P. (After Presentation)

Wednesday
What a joy to wake up without a presentation to think about and just a big hangover and lack of breakfast standing between me and "Oracle Optimizer: Best Practices for Managing Optimizer Statistics" with Maria Colgan. As I entered the room she questioned why I was there and in my bleary-eyed state, it took me a while to realise she'd be presenting this at UKOUG so I could have just had breakfast after all and caught her session another time! I'm glad I didn't though and, sitting alongside Randolf Geist again, we were both surprised by how technical some of the content was. Statistics is such a crucial part of my current work that it was good to hear some new perspectives and there was a pretty detailed explanation of how histograms work which I suspect would have helped a lot of people. It was also a very practical, realistic presentation that discussed the challenges inherent in gathering statistics on large partitioned objects and offered a variety of solutions that would suit different systems. Good stuff and it was fun to be focussing on learning again, rather than worrying ;-)

Next I finally made it into the Exhibition Halls before they closed at the end of the day, but there is simply so much going on in the massive exhibition at Openworld that I only managed a quick walk around Moscone South, picked up a FusionIO T-shirt and then popped round to the Demogrounds, where JB and Uri were showing people some of the new ASH and ADDM features in OEM 12c. I would have liked to hang out longer, but I planned to attend the ASH in 3D presentation on Thursday morning anyway and I had a hot date with the SQL Tuning Experts Roundtable.

This was a guided discussion with a panel consisting of some of the biggest of Oracle names - Graham Wood, Tom Kyte and Benoit Dageville. That third name might not be so familiar, but Benoit was one of the reasons I attended: I hoped I might be able to say hello to him afterwards because I believe he's largely responsible for Real Time SQL Monitoring and I wanted to express my thanks and admiration! All hail SQL Mon! ;-) With so much experience and knowledge on the panel, it's no surprise that the conversation was full of good sense and thought-provoking messages. One that stands out from memory is that object statistics collection options have changed so much over the past few versions and particularly 11g that clever custom stats collection code you might have developed for previous versions should really be revisited. For example, do we still need all those ESTIMATE_PERCENT values, or should we be thinking about default one-pass distinct sampling options?

As the panel drew to a close, I tried to say hello to Tom Kyte and fawn over Benoit (only slightly kidding) but, unsurprisingly, they were surrounded by keen questioners so I made my excuses and left for an important errand. Kellyn had offered her wrist-band so my friend could go to the big party that night. In the end, he didn't make it, but the band still went to a happy home and was much appreciated! Thanks, Kellyn :-)

The rest of the afternoon was spent in Mogens' Miracle Office (i.e. Chevys)  - there's a pattern emerging here - which has become a highlight of the conference for me as it allows me to catch up with friends I don't see too often (including those, like Marco, who I hadn't seen much of yet at this conference) but there's also some fascinating technical chat too. I could have attended Larry Ellison's keynote and I heard he was on form, but I'd heard most of what was going to be announced at the briefing earlier and I know I can always catch up with it later, as you can here. (Actually, as I watch it while typing this, I take back some of the things I said to people about how weird Larry looks these days. He actually looked better this year than he has in previous years. Maybe his face is, erm, settling in? "Next Face" ;-))

Sadly, my phone battery was getting lower and lower so I headed back to my hotel to get some charge and get changed before the Bloggers Meetup that Pythian and OTN organise and sponsor these days. As I was walking there, the news of Steve Jobs' death started to break and, to balance the ludicrous bandana wearing, Paul Vallee made a brief impromptu speech about Jobs. I couldn't catch him properly in the noisy bar but I believe Alex Gorbachev filmed it so maybe it will appear online at some point? As always, the Bloggers Meetup was a fun and friendly introduction to the insanity of the Appreciation Event at Treasure Island.

I followed my own advice and stuck with a couple of guys - Andy Klock and Marco Gralike - for the duration and I couldn't have asked for better company. I could have asked for better music than Sting ;-) but Tom Petty unexpectedly (for me) more than made up for it and we managed to stuff our faces with a little food and a few beers added to some good conversation (and one day, Andy, I'll take your advice ...). Before we could head home, we had some unfinished business to attend to as Andy and I were determined to win a new cuddly toy for the comfy seat at home and a new bear under orders from Pythian's Vanessa Simmons. Although we found the whole challenge pretty tough, the bear came along relatively easily but as soon as I saw a certain other cuddly toy, my heart was set.

Now at this point I could tell you that Andy's fairground game skills are top-notch or I could tell you that the man is a petty thief with no patience but I suppose the best strategy is to say nothing. The fact remains that when we got on the bus back to the W, Susan the Seahorse came with us to meet Alex Gorbachev!

Nice work, Andy!

Thursday
Sigh, I *knew* this would happen when I saw the "AWR and ASH in 3-D: Performance Analysis Tools No DBA Has Seen Before" presentation was scheduled at 9am on Thursday morning :-( Despite setting my alarm, I opened my eyes at 9 and, given the need to pack, check out and the rest, I was destined to miss it. I decided I would just have to play around with OEM 12c when I got home but was still very disappointed to miss the demo. I realised I could make the next session if I moved quickly, but I have rarely packed in such a haphazard way. I don't think the Cuddly Toys knew quite what hit them!

At least I only arrived a little late for the Real-World Performance group's Questions and Answers session with Andrew Holdsworth, Greg Rahn and Graham Wood which was the usual interesting stuff.

Which made Oracle Database Optimizer: Tips for Preventing Suboptimal Execution Plans with Maria Colgan and Mohammed Zait my final session of the week. Maria was on her usual top form, going through some initial slides before opening the floor to questions. Of course, she doesn't know just how much people were enjoying the session because she doesn't do Twitter, but I'm sure we can sort that out before the next conference.

The stand-out for me from the initial slides was how often drilling into the predicates section of a detailed execution plan shows that implicit data type conversion is disabling the possibility of the optimiser using an appropriate index. Watch for that INTERNAL_FUNCTION appearing in your plans! I've solved quite a few SQL performance issues like this over the years. The Q & A section was extremely popular to the extent that after hearing a gentleman use the microphone to declare his undying love for Maria (man, I wish I had a fan club like that), I was able to go outside, grab a smoke, come back in, listen to another 10 minutes of questions before we were kicked out of the hall and Maria was still surrounded by a crowd of questioners for a good 10-15 minutes after that! Eventually somebody had to rescue her so, being the gentleman I am, I stepped in and dragged her along to Moans Office in Chevys where we all had a final lunch, some margaritas were consumed and after a long chat I was ready to head back to the hotel to pick up my luggage and head out to SFO. Seriously, the conversations I had at OOW are just as important to me as the presentations, so thanks for those!
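
As a quick, made-up illustration of the implicit conversion point: compare a VARCHAR2 column to a number and Oracle wraps the column in a conversion, which stops the index being used.

create table t (id varchar2(10) not null, padding varchar2(100));
create index t_idx on t (id);

-- the number literal becomes filter(TO_NUMBER("ID")=42) in the predicate section,
-- so T_IDX cannot be used for a range scan on ID
select * from t where id = 42;

-- compare like-for-like and the predicate stays access("ID"='42'), leaving the index usable
select * from t where id = '42';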

After a fairly uneventful but slightly tiring trip back home this news was waiting for me.

Seahorses found in London? Who would have thought it?! ;-)

These posts have been almost devoid of technical content so might work as a diary for my friends, family and me, but what if you were hoping to actually learn something and weren't able to attend OOW personally? Well the good news is that the slides for almost all the presentations are available free and online for anyone to check out but, I warn you, there's a lot to choose from!

As always, I had a brilliant time at Openworld so thanks to all the friends who made it for me and the good people from Miracle, OTN and Pythian for keeping me in a state of perpetual merriment ;-) Cheers and roll on next year!

Disclosure - I attended the ACE Directors briefing and Openworld 2011 courtesy of the Oracle ACE Director program, which covered my travel and accommodation expenses. Cuddly Toy incidentals and sizeable bar bills were on Pythian, Miracle, OTN or me in most cases. Susan the Seahorse was courtesy of Andy Klock (@andyklock) of Pythian.

OEM 12c, cloud control

Was emailing with my esteemed colleague John Beresniewicz at Oracle in the OEM group. John and I worked together on OEM 10g and thank goodness he is still there, as he is generally behind any good quantitative visualizations you might see in the product. Here is one cool example he sent me:

The database load (AAS) can be time-selected and, from the selection, a load map is shown - in this case showing which objects are creating the most I/O load, grouped by type of I/O. Super cool. Congrats JB, and I look forward to exploring more in OEM 12c “cloud control”.

Friday Philosophy – The One Absolute Requirement for System Success

Alternative title “The lady from Patient Admin – she says YEEESSSS!!!!!!”

What must you always achieve for an IT system to be a success?

  • Bug free? Never happens.
  • Within budget/time frame? That would be nice.
  • Includes critical business functionality? Please define critical.
  • Secure? Well, it’s important for many systems but then it is often lacking (even when it is important).
  • That it is to specification? Well we all know that’s wrong.

There is only one thing that an IT system must always achieve to be a success.

User Acceptance.

For an individual system other considerations may well be very important, but the user acceptance is, I think, non-negotiable.

The user must get enough out of using the system for it to be worth their while, otherwise at best they will resent using it and at worst… Well, at worst they will use it but will put in any old rubbish to fulfill the dictate that it be used. You would be better off if they did not use the system at all. Here are a couple of examples from my working past.

In the first one, I was involved in extending a hospital management system so that it kept track of the expected departure times for patients, allowing a prediction of when beds would become available and calculation of expected occupancy rates. Yes, this was a while ago (maybe 1990) and on a system that was old then. The information was needed by someone with the title “bed nurse” {or something similar} so that they could better prepare for bringing patients in and keeping a higher bed usage ratio. Was this to make the hospital more efficient? No, it was to satisfy a politically demanded report to the NHS executive. Oh, the overall intention was to increase efficiency but the report soon became more important than the idea. So, we added columns in tables and fields on screens and prompts for the ward staff to fill in the information. And they didn’t. The nurses were busy, they were pretty demoralized due to having recently been used by the government as a way to control public sector pay and they had nursing duties to do. They were not going to waste a couple of minutes trying to check when Mrs Jenkins was going to be sent home when Mrs Leonard needed a bed pan. The nursing staff were given a hospital-wide telling off: this information had to be entered. They put in the data – but guessed wildly. The design was fine, the report was logically accurate, only the correct staff could run it, but No User Acceptance and thus a failure.

So I added something else. It was a very crude screen that showed a “diagram” of the ward – Down the left and right side of a VT220 screen you saw little oblong boxes with a bed number, name in it, a consultant’s initials, a medical speciality code and the arrival and departure datetime. This was some information we already had plus the new information we wanted and something quite basic, limited and slow to draw. But it was useful to the ward staff. They could find any patient, they knew who to call if there was an emergency {not the actual consultant of course, but their secretary}, they could check when they were leaving, they could see what time someone was expected. From anywhere where there was a terminal, not just the entrance to the ward, they could see all this information. They used it.  They put in the expected departure time {sobering thought, this might not be expected leaving alive} and the bed nurse could plan and the report could be run.

Second example, different hospital. We were putting together a system to schedule outpatient clinics. We knew what we were doing, it’s pretty simple. You have some people (a consultant and probably a senior house officer), a period for the clinic (3 or 4 hours) and a set of people to see, say 40.  Give some flexibility in slot lengths (some people need 5 minutes, some 15) and allow the patients to be booked in. Check for and stop double booking. We did not go and ask the patient admin staff, we knocked up the design and the screens and asked them to test. After all, I was very experienced now, I’d been doing these systems for 3 years… They very quickly came back to us and said it was rubbish. Oh dear.

We went and saw them. I think it was a couple of us programmers, our development manager, the hospital liaison for the project and the patient admin staff. “What’s the problem?” There were a few but the main one was that you could not double book a slot. Why would you want to? Do two patients really want to be consulted at the same time with the same doctor?
“Err, maybe, it might happen, can we just be able to double book?” OK, we could maybe alter things to allow two patients to be seen at the same time… The patient admin staff are not looking happy. The hospital liaison is looking confused – “You can’t do that! Patient confidentiality can’t be broken!” he says. It got worse. “We need to book all the patients into the first slot, with the consultant, so the letters go out to them saying come to see Mr Winders at 1pm”. The admin staff are now looking very shifty.

If any of you have worked in the health service you are probably way ahead of me. The admin staff needed to book all the patients in at this first slot so that they would all turn up, the consultant would see the two or three he was interested in - and then go and play golf. The SHO would then plough through the rest of the patients for the following three or four hours. If you have ever had to turn up at the start of a consultancy session and sat there for three hours, now you know why. You see, back then, the consultant was only a very small step away from deity level (and I leave it to you to decide if it was a step up or down). What they said went and if they wanted to go and play golf or store 200 medical records in the boot of their car or refuse to speak to “that stupid idiot in renal medicine” then you worked around it. {I’m assured that things are a lot better now, but I’d love to know how it really is}.

We had designed a sensible system, the users needed a non-sensible {to our mind} system. Even the NHS liaison chap had never appreciated exactly how much the consultants abused the system, he thought they just booked the people s(he) wanted at the start of the session, but no. The consultant decided that day who was interesting and as a result every patient had to be there at the start.

I count myself lucky that I learnt from direct experience so soon in my working life that (a) you have to deliver what the user will accept and (b) the only way to know what they want is to show them the system and talk with them.

{For those of you who do not understand the alternative title at the top, it is all about an old Del Monte fruit juice advert which became a bit of a catchphrase at the time}

{And are you happy now Dom? :-) }

A History of (my) Certifications…

I was chatting with the lady doing OCP Lounge registrations at OOW11. During this chat I mentioned I hadn’t received a certificate for the SQL Expert certification. It never crossed my mind to re-request it, since my certifications are visible on certview.oracle.com anyway. Yesterday, a DHL man delivered the missing certificate, which prompted me to look through my certifications and scan this image.

First, check out the card on the bottom right. I was unaware the “Expert” certifications had a different colour card.

Second, notice anything funny about the 9i DBA OCP certification?

It’s hard to believe it’s over 12 years since I first completed one of these certifications… :)

Cheers

Tim…