January 12, 2012 I thought that I would start this slightly off topic blog article with a bit of humor. Seven months ago I wrote a blog article that refuses to move from the first position in the most visited articles on this blog. In the process of trying to understand why a mathematics focused article [...]
You may have noticed that I’m having a little trouble keeping up to date on the blog at the moment – I know I’ve got several comments on Oracle Core to respond to but haven’t had time to look at them yet. Very briefly, though, I thought I’d point to a note that Doug Burns has just posted about John Beresniewicz’s presentation one “outliers” at the UKOUG conference.
Key feature: it supplies a query that could be very useful for capturing short, but nasty, events; and has links to a couple of documents explaining what it’s trying to do and why. It’s a query that could do with more exercise on production systems so that Doug can get some feedback to JB about how effect it is, and how it could be improved.
On February 14-16, I’ll be at the Colorado Convention Center in Denver, Colorado for RMOUG’s Training Days Conference. This is the largest regional Oracle User Conference in North America and attracts presenters from all around the country and the globe. I’ll be presenting:
Presentation Name: Troubleshooting RAC Background Process
Abstract: RAC background process performance is critical to keep the application performance. This session will demo techniques to review the performance of RAC background processes such as LMS, LMD, LMON, etc. using various statistics and UNIX tools. The presentation will also discuss why certain background processes must run in higher priority to maintain the application performance in RAC.
Presentation Name: A Kind and Gentle Introduction to RAC
Abstract: This session will introduce basic concepts such as cache fusion, conversion to RAC, protocols for interconnect, general architectural overview, GES layer locks, clusterware, etc. The session will also discuss the srvctl command and demo a few of these commands to improve the understanding.
Presentation Name: Parallel Execution in RAC
Abstract: This presentation will start to discuss and demo parallel server allocation, intra, and inter node parallelism aspects. The session will discuss the new parallelism features such as parallel statement queuing, parallel auto dop, and discuss the interaction of those features with RAC. The session will probe a few critical parameters to improve PQ performance in RAC.
Click here for more information or to register for RMOUG’s Training Days.
This post is about a potential pitfall when migrating from 11.2.0.x to the next point release. I stumbled over problem this one on a two node cluster.
The operating system is Oracle Linux 5.5 running 22.214.171.124.3 and I wanted to go to 126.96.36.199.0. As you know, Grid Infrastructure upgrades are out-of-place, in other words require a separate Oracle home. This is also one of the reasons I wouldn’t want less than 20G on a non-lab like environment for the Grid Infrastructure mount points …
Now when you are upgrading from 11.2.0.x to 188.8.131.52 you need to apply a one-off patch, but the correct one! Search for patch number 12539000 (11203:ASM UPGRADE FAILED ON FIRST NODE WITH ORA-03113) and apply the one that matches your version-and pay attention to these PSUs! There is the obvious required opatch update to be performed before again as well.
So much for the prerequisites. Oracle 184.108.40.206 is available as patch 10404530, and part 3 is for Grid Infrastructure which has to be done first. This post only covers the GI upgrade, the database part is usually quite uneventful in comparison…
Upgrading Grid Infrastructure
After unzipping the third patch file you start runInstaller. But not before having carefully unset all pointers to the current 220.127.116.11 GRID_HOME (ORACLE_HOME, ORACLE_SID, LD_LIBRARY_PATH, ORA_CRS_HOME, etc)!
Clicking through OUI is mostly a matter of “next”, “next”, “next”, the action starts with the rootupgrade.sh script. Here’s the output from node1:
[root@node1 ~]# /u01/crs/18.104.22.168/rootupgrade.sh Performing root user operation for Oracle 11g The following environment variables are set as: ORACLE_OWNER= oracle ORACLE_HOME= /u01/crs/22.214.171.124 Enter the full pathname of the local bin directory: [/usr/local/bin]: The contents of "dbhome" have not changed. No need to overwrite. The file "oraenv" already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y Copying oraenv to /usr/local/bin ... The file "coraenv" already exists in /usr/local/bin. Overwrite it? (y/n) [n]: y Copying coraenv to /usr/local/bin ... Entries will be added to the /etc/oratab file as needed by Database Configuration Assistant when a database is created Finished running generic part of root script. Now product-specific root actions will be performed. Using configuration parameter file: /u01/crs/126.96.36.199/crs/install/crsconfig_params Creating trace directory User ignored Prerequisites during installation ASM upgrade has started on first node. CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'node1' ... CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'node1' has completed CRS-2673: Attempting to stop 'ora.gpnpd' on 'node1' CRS-2677: Stop of 'ora.gpnpd' on 'node1' succeeded CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'node1' has completed CRS-4133: Oracle High Availability Services has been stopped. OLR initialization - successful Replacing Clusterware entries in inittab clscfg: EXISTING configuration version 5 detected. clscfg: version 5 is 11g Release 2. Successfully accumulated necessary OCR keys. Creating OCR keys for user 'root', privgrp 'root'.. Operation successful. PRCA-1056 : Unable to upgrade ACFS from version 188.8.131.52.0 to version 184.108.40.206.0 PRCT-1011 : Failed to run "advmutil". Detailed error: advmutil: CLSU-00100: Operating System function: open64 failed with error data: 2advmutil: CLSU-00101: Operating System error message: No such file or directory|advmutil: CLSU-00103: error location: OOF_1|advmutil: CLSU-00104: additional error information: open64 (/dev/asm/orahomevol-315)|advmutil: ADVM-09006: Error opening volume /dev/asm/orahomevol-315 srvctl upgrade model -first ... failed Failed to perform first node tasks for cluster modeling upgrade at /u01/crs/220.127.116.11/crs/install/crsconfig_lib.pm line 9088. /u01/crs/18.104.22.168/perl/bin/perl -I/u01/crs/22.214.171.124/perl/lib -I/u01/crs/126.96.36.199/crs/install /u01/crs/188.8.131.52/crs/install/rootcrs.pl execution failed
So that was not too great indeed-my update failed halfway through. Two facts make this bearable:
Now advmutil was correct-there were no volumes in /dev/asm/*
An analysis of the rootcrs_node1.log file showed that the command that failed was this one
2012-01-06 10:09:10: Executing cmd: /u01/crs/184.108.40.206/bin/srvctl upgrade model -s 220.127.116.11.0 -d 18.104.22.168.0 -p first 2012-01-06 10:09:12: Command output: > PRCA-1056 : Unable to upgrade ACFS from version 22.214.171.124.0 to version 126.96.36.199.0 > PRCT-1011 : Failed to run "advmutil". Detailed error: advmutil: CLSU-00100: Operating System function: open64 failed with error data: 2|advmutil: CLSU-00101: Operating System error message: No such file or directory|advmutil: CLSU-00103: error location: OOF_1|advmutil: CLSU-00104: additional error information: open64 (/dev/asm/orahomevol-315)|advmutil: ADVM-09006: Error opening volume /dev/asm/orahomevol-315 >End Command output 2012-01-06 10:09:12: "/u01/crs/188.8.131.52/bin/srvctl upgrade model -s 184.108.40.206.0 -d 220.127.116.11.0 -p first" failed with status 1. 2012-01-06 10:09:12: srvctl upgrade model -first ... failed
Thinking Clearly is an idea I thought I had adopted from Cary Millsap, but sadly I didn’t apply it here! Lesson learned: don’t assume, check!
I however assumed that because of the shutdown of the clusterware stack there wasn’t any Oracle software running on the node, hence there wouldn’t be an ADVM volume BY DEFINITION. Cluster down-ADVM down too.
Upon checking the log file again, I realised how wrong I was. Most of the lower stack Clusterware daemons were actually running by the time the srvctl command failed to upgrade ACFS to 18.104.22.168. So the reason for this failure had to be a different one. It quickly turned out that ALL the ACFS volumes were disabled. A quick check with asmcmd verified this:
$ asmcmd volinfo -a Volume Name: ORAHOMEVOL Volume Device: /dev/asm/orahomevol-315 State: DISABLED Size (MB): 15120 Resize Unit (MB): 256 Redundancy: UNPROT Stripe Columns: 4 Stripe Width (K): 128 Usage: ACFS Mountpath: /u01/app/oracle/product/22.214.171.124
OK, that explains it all-disabled volumes are obviously NOT presented in /dev/asm/. A call to “asmcmd volenable -a” sorted that problem.
Back to point 1 – rootupgrade.sh is restartable. I then switched back to the root session and started another attempt at running the script and: (drums please) it worked. Now all that was left to do was to run rootupgrade.sh on the second (and last) node. This completed successfully as well. The required patch for the ASM rolling upgrade by the way is needed there and then-the rootcrs_lastnode.log file has these lines:
2012-01-10 09:44:10: Command output: > Started to upgrade the Oracle Clusterware. This operation may take a few minutes. > Started to upgrade the CSS. > Started to upgrade the CRS. > The CRS was successfully upgraded. > Oracle Clusterware operating version was successfully set to 126.96.36.199.0 >End Command output 2012-01-10 09:44:10: /u01/crs/188.8.131.52/bin/crsctl set crs activeversion ... passed 2012-01-10 09:45:10: Rolling upgrade is set to 1 2012-01-10 09:45:10: End ASM rolling upgrade 2012-01-10 09:45:10: Executing as oracle: /u01/crs/184.108.40.206/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/220.127.116.11 2012-01-10 09:45:10: Running as user oracle: /u01/crs/18.104.22.168/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/22.214.171.124 2012-01-10 09:45:10: Invoking "/u01/crs/126.96.36.199/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/188.8.131.52" as user "oracle" 2012-01-10 09:45:10: Executing /bin/su oracle -c "/u01/crs/184.108.40.206/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/220.127.116.11" 2012-01-10 09:45:10: Executing cmd: /bin/su oracle -c "/u01/crs/18.104.22.168/bin/asmca -silent -upgradeLocalASM -lastNode /u01/crs/22.214.171.124" 2012-01-10 09:45:51: Command output: > > ASM upgrade has finished on last node. > >End Command output 2012-01-10 09:45:51: end rolling ASM upgrade in last
Note the ROLLING UPGRADE!
If your rootupgrade.sh script bails out with ADVMUTIL, check if your ACFS volumes are enabled-they most likely are not.
Several years ago I met Arjen Visser and Bertie Plaatsman from Dbvisit and they told me about their standby database product, which is a replacement for Data Guard. Now I don’t spend much time on non-Oracle products, but this one was interesting to me as it works on Standard Edition, unlike Data Guard which is an Enterprise Edition option. From that point onward I kept seeing Arjen and conferences and telling myself I really should take a look at the product.
Over last year I bumped into Arjen at a few conferences, along with some other members of the company (Eric, Mike and Vit). They are a cool group of people, so my interest in their products was ignited again. Finally, after several years of showing interest I tried out the standby product towards the end of last year, which resulted in the following article.
I held the article back until now because I was waiting for version 6.0.16 to be released so I could check out the revised web interface.
It’s a really nice product. Simple to install. Easy to use. Does exactly what it says it does. Most importantly, it’s backed by a cool group of people. When I tried the previous releases I had a few comments about the documentation and those were taken on board and changes were made. This is why I like dealing with smaller companies. There aren’t endless layers of bureaucracy involved in changing a few sentences in an install document.
I’ve said I’ll give their replication product (kinda like Golden Gate) a go, but based on previous experience it will probably take me about 4 years to get round to that.
I’m really pleased to see that Richard Foote has started a series on Index Organized Tables. You can see his introductory post on the topic here. As ever with Richard, he puts in lots of detail and explanation and I’ve been a fan of his blogging style for a long time.
I’ve got a few posts on the topic left to do myself, maybe this competition will spur me to get on and write them!
What I will also be very interested to see is the different way we will cover the same topic. Richard has already put in a block dump and dug into the details a little at a low level about how the data is stored, which I have not done. He has also shown how using an IOT instead of a fully overloaded index (where you create an index covering all the columns of the table, to avoid visiting the table for other columns) results in not only saving the space taken up by the redundant heap table but that the IOT index is smaller than the fully overloaded index. This is due to the lack of a rowid.
I put in occasional pictures and maybe write more about how the example matches real world situations. If you want, you can look back at my own introduction to the topic.
I’m sure this is going to be an excellent series and I’ll be following it myself.
Thought it was high time that I covered in a little detail the subject of Index Organized Tables (IOTs). When used appropriately, they can be an extremely useful method of storing and accessing data. Hopefully by the end of this series, you’ll have a better understanding of IOTs, their respective strengths and weaknesses and so perhaps [...]
Chris Date is one of the founding fathers of relational databases. Having worked with Ted Codd at IBM during the time when relational databases were being defined gives Chris a perspective that most of us just don’t have. I’ve had the good fortune to hear him speak in the past (at the Hotsos Symposium) and thought I would do a quick post to highlight the fact that he is scheduled to speak in Dallas the week of Jan 30. Method-R is hosting the event in the Enkitec training facilities in Dallas. So maybe I’ll get to hang around with Chris and Cary that week – that would be cool! Anyway, there are actually 2 classes:
And here’s a link to the registration page: C. J. Date Seminar Registration
By the way, I think every developer and every database architect should have a clear understanding of how the SQL language is designed to work and how relational databases were intended to be laid out. Chris obviously has a unique insight into those topics. One of the tenants of Chris’s teaching is that SQL is a complicated language and since comprehensive testing is almost never really feasible, it is important to write SQL using a disciplined approach based on the underlying relational theory. As a side note, I was talking to a few cohorts around the coffee pot today and was shocked to hear that one of the guys had a CS degree but was not required to take a relational theory class. Back when I got started that was the first class that people took, probably because there were almost no real implementations of the theory at that point. Oracle was just getting started and DB2 was still a distant gleam in Mr. Codd’s eye. But I digress.
It does seem to me that we have an awful lot of systems running on Oracle these days that were designed and written by people without a strong background in relational database fundamentals. I can’t even begin to count the number of times I’ve worked on systems that performed poorly due to poor SQL coding techniques and/or poor database design. Chris’s courses are designed to help you avoid these issues. So this is your chance to learn how to know for sure that your SQL is correct.
Hope to you see you there!