So you don’t see any disks when trying to install ASM?

This is a post that highlights the difference between operating systems, but also the fact that sometimes it is hard to break out of a habit once you have got used to it. My background is that of a Linux enthusiast; I equally like Solaris and AIX, but I have a little less exposure to those.

Background

I have recently been asked to look at RAC on SPARC, which I gladly did. The system I was given had the usual software stack for RAC at this customer’s site. It comprised:

  • Solaris 10 Update 9 64bit on SPARC
  • EMC Power Path 5.1
  • EMC VMAX storage – 10x10G LUNs for a specific performance test

The Power Path configuration had already been in place when I got the machine, and I was allocated /dev/emcpower2[0-9] for my ASM testing. For the experienced Linux user who relies on device-mapper-multipath, the Power Path naming convention can be a bit confusing at first. For reference, the pseudo devices we are interested in for ASM are created under /dev/rdsk/ – the “raw” device directory for “character” based access – rather than the block devices in /dev/dsk/. By default, the Power Path devices are called “emcpower”, followed by a number and a letter (on SPARC). An example would be /dev/rdsk/emcpower20c.

The number (20) is just a sequence number and doesn’t have any other meaning as far as I know. It also doesn’t seem to be consistent across the cluster nodes by default. The suffix letter (“c”) does have a meaning though: it indicates the slice of the device. For example, the letter “a” indicates slice 0, xxxc is the whole disk, and xxxg is slice 6.

You can use the format command to have a look at the slices currently defined. Simply type “partition” and then “print” to display the current partition table.

Now all the disks I was given had a slice 6, from cylinder 274 to the end of the disk:

# format
[disk selection not shown here]
format> partition
partition> print
Current partition table (original):
Total disk cylinders available: 10238 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders         Size            Blocks
  0       root    wm       0 -   136      128.44MB    (137/0/0)     263040
  1       swap    wu     137 -   273      128.44MB    (137/0/0)     263040
  2     backup    wu       0 - 10237        9.37GB    (10238/0/0) 19656960
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5 unassigned    wm       0                0         (0/0/0)            0
  6        usr    wm     274 - 10237        9.12GB    (9964/0/0)  19130880
  7 unassigned    wm       0                0         (0/0/0)            0

partition> quit

If you’d like to see how the disks are connected to the array, or if there are any faults, use powermt display dev=all, or alternatively look at a specific device:

# powermt display dev=emcpower20
Pseudo name=emcpower20a
Symmetrix ID=00x2x4x0x6x4
Logical device ID=0x0x
state=alive; policy=SymmOpt; priority=0; queued-IOs=0;
==============================================================================
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
==============================================================================
3072 pci@400/pci@0/pci@c/fibre-channel@0/fp@0,0 c2t50000974C00A6118d17s0 FA  7eA   active  alive       0      0
3077 pci@500/pci@0/pci@9/fibre-channel@0/fp@0,0 c3t50000974C00A611Cd17s0 FA  8eA   active  alive       0      0

If you care to learn more about how your system administrator configured the paths (I have 2 HBAs as you can see), use “powermt display”. Also note that the disks c2t5….s0 and c3t5…s0 are the native devices aggregated into emcpower20a.

I have changed the permissions on the pseudo devices to 664 and made them owned by oracle:oinstall before launching OUI. In the disk discovery screen I changed the disk discovery string to /dev/rdsk/emcpower2* but didn’t see a single disk in the OUI window. That was odd.
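
For reference, the permission change was nothing more than a chown/chmod on the pseudo devices as root – a minimal sketch, to be adjusted to whatever device range you have been allocated:

# run as root; hand the pseudo devices over to the oracle user
for dev in /dev/rdsk/emcpower2[0-9]*; do
  chown oracle:oinstall "$dev"
  chmod 664 "$dev"
done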

Troubleshooting disk discovery

This error led me to look at /tmp/OraInstall/installActions.log which referenced a call to kfod, the disk discovery tool. The entry in the log file showed this content:

WARNING: SRVCTL not found as there is no Grid Infrastructure home on the box.
 INFO: Executing [/tmp/OraInstall2011-07-25_11-33-50AM/ext/bin/kfod, nohdr=true, verbose=true, disks=all, status=true\
 op=disks, asm_diskstring='/dev/rdsk/emcpower2*']
 with environment variables {XFILESEARCHPATH=/usr/dt/app-defaults/%L/Dt, DISPLAY=localhost:12.0,\
 PWD=/u01/app/oracle/stage/grid, LC_CTYPE=en_GB.ISO8859-1, _=./runInstaller, USER=oracle,\
 NLSPATH=/usr/dt/lib/nls/msg/%L/%N.cat, LC_COLLATE=en_GB.ISO8859-1, CLASSPATH=, HOME=/export/home/oracle,\
 LC_NUMERIC=en_GB.ISO8859-1, SSH_CONNECTION=10.128.46.56 52883 10.129.48.214 22,\
 LD_LIBRARY_PATH=/tmp/OraInstall2011-07-25_11-33-50AM/ext/bin/../lib,\
 ORACLE_HOME=/tmp/OraInstall2011-07-25_11-33-50AM/ext/bin, SHELL=/usr/bin/bash, LOGNAME=oracle,\
 SSH_TTY=/dev/pts/3, SHLVL=1, LC_MONETARY=en_GB.ISO8859-1, MANPATH=/usr/share/man:/usr/openwin/share/man:\
 /db/pub/man:/db/pub/infra/samba/current/man:/db/pub/infra/rsync/current/man:/db/pub/infra/rcs/current/man:\
 /db/pub/infra/tcpdump/current/man:/usr/dt/share/man:/usr/java1.2/man:/usr/apache/man:/usr/perl5/man:/usr/j2se/man:\
 /usr/local/man:/opt/SUNWconn/ge/man:/opt/SUNWconn/man:/opt/SUNWsan/man:/opt/VRTS/man:/opt/DBpam/man:\
 /opt/SUNWexplo/man:/opt/SUNWcstu/man:/opt/VRTSvlic/man, SSH_CLIENT=10.128.46.56 52883 22, \
 MAIL=/var/mail//oracle, TZ=GB-Eire, LC_TIME=en_GB.ISO8859-1, JAVA_HOME=, LC_MESSAGES=C, PS1=[\u@\h \W]> , \
 OLDPWD=/u01/app/oracle/stage, LC_ALL=, TERM=xterm, TMOUT=0, PATH=/usr/bin:/usr/ccs/bin:/usr/bin:/opt/EMCpower/bin:\
 /etc/emc/bin:/etc:/zones/app/oracle/stage/grid/install}
 INFO: Starting Output Reader Threads for process /tmp/OraInstall2011-07-25_11-33-50AM/ext/bin/kfod
 INFO: Parsing KFOD-00311: Error scanning device /dev/rdsk/c3t50000974C00A611Cd16s7
[more errors skipped]
..

(Re-) Discovery of KFOD

So therein lies the problem! As a side effect I rediscovered the command line options used with kfod! Maybe I can use this for some experiments… All you need to do is to set environment variables for ORACLE_HOME and LD_LIBRARY_PATH as per the above output.
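
In my case that boiled down to the following, executed as “oracle” (the temporary OraInstall directory is of course specific to my OUI session):

export ORACLE_HOME=/tmp/OraInstall2011-07-25_11-33-50AM/ext/bin
export LD_LIBRARY_PATH=/tmp/OraInstall2011-07-25_11-33-50AM/ext/bin/../lib
$ORACLE_HOME/kfod nohdr=true verbose=true disks=all status=true op=disks \
  asm_diskstring='/dev/rdsk/emcpower2*'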

After a little fiddling around I remembered that, unlike the tests with ZFS I did before, I can’t pass the whole disk to ASM (the VTOC is in the first part of the disk!), but rather have to use a slice. So by changing the disk string to /dev/rdsk/emcpower2*g I got a successful report from kfod, executed as “oracle”:

$ /tmp/OraInstall2011-07-25_11-33-50AM/ext/bin/kfod verbose=true, disks=all status=true op=disks \
> asm_diskstring='/dev/rdsk/emcpower2*g'
--------------------------------------------------------------------------------
 Disk          Size Header    Path                                     User     Group
================================================================================
   1:       9341 Mb CANDIDATE /dev/rdsk/emcpower20g                    oracle   oinstall
   2:       9341 Mb CANDIDATE /dev/rdsk/emcpower21g                    oracle   oinstall
   3:       9341 Mb CANDIDATE /dev/rdsk/emcpower22g                    oracle   oinstall
   4:       9341 Mb CANDIDATE /dev/rdsk/emcpower23g                    oracle   oinstall
   5:       9341 Mb CANDIDATE /dev/rdsk/emcpower24g                    oracle   oinstall
   6:       9341 Mb CANDIDATE /dev/rdsk/emcpower25g                    oracle   oinstall
   7:       9341 Mb CANDIDATE /dev/rdsk/emcpower26g                    oracle   oinstall
   8:       9341 Mb CANDIDATE /dev/rdsk/emcpower27g                    oracle   oinstall
   9:       9341 Mb CANDIDATE /dev/rdsk/emcpower28g                    oracle   oinstall
  10:       9341 Mb CANDIDATE /dev/rdsk/emcpower29g                    oracle   oinstall
KFOD-00311: Error scanning device /dev/rdsk/emcpower2g
ORA-15025: could not open disk "/dev/rdsk/emcpower2g"
SVR4 Error: 13: Permission denied
Additional information: 42
Additional information: 272256
Additional information: 12699536

I haven’t worked out which wildcard characters are valid in the ASM_DISKSTRING (‘/dev/rdsk/emcpower2[0-9]g’ didn’t work), so I left it there. The emcpower2g device was not meant to be used in the test anyway.

With this setting I returned to the OUI window and continued the installation.

Lesson Learned

Oracle doesn’t always have useful logs, but in this case it was very useful to look at the install-actions file in /tmp. Also remember that preparing ASM disks on Solaris and other platforms is different from Linux.

Using Connection Manager to protect a database

I have known about Oracle’s connection manager for quite a while but never managed to use it in anger. In short there was no need to do so. Now however I have been asked to help in finding a solution to an interesting problem.

In summary, my customer is running DR and UAT in the same data centre. For technical reasons they rely on RMAN to refresh UAT; array mirror splits on the SAN (which would be way faster!) are not possible. The requirement is to prevent an RMAN session with target=UAT and auxiliary=DR from overwriting the DR databases, all of which are RAC databases on 11.2. The architecture included 2 separate networks for the DR hosts and the UAT hosts. DR was on 192.168.99.0/24 whereas UAT was on 192.168.100.0/24. A gateway host with two NICs connects the two. Initially there was a firewall on the gateway host to prevent traffic from UAT to DR, but because of the way Oracle connections work this proved impossible (the firewall was a simple set of iptables rules). After initial discussions I decided to look at Connection Manager more closely as it is hailed as a solution to Oracle connectivity and firewalls.

Thinking more closely about the problem I realised that the firewall plus static routes approach might not be the best one. So I decided to perform a test which didn’t involve iptables or custom routes, and to rely on CMAN only to protect the database. A UAT refresh isn’t something that’s going to happen very frequently, which allows me to shut down CMAN and thus prevent any communication between the two networks. This is easier than maintaining a firewall – remember, the easiest way to do something is not to do it at all.

Overview of Connection Manager

For those of you who aren’t familiar with CMAN, here’s a short summary (based on the official Oracle documentation).

Configuring Oracle Connection Manager (CMAN) allows clients to connect through a firewall [I haven’t verified this yet, ed]. CMAN is an executable that allows clients to connect despite a firewall being in place between the client and the server. CMAN is similar to the Listener in that it reads a configuration file [called cman.ora, ed], which contains an address on which Oracle Connection Manager listens for incoming connections. CMAN starts similarly to the Listener and will enter a LISTEN state.

This solution [to the firewall issue with TCP redirects, ed] makes the REDIRECT happen inside the firewall so the client will not see it; CMAN acts as a proxy service between the client and the real database listener.

Interestingly, Connection Manager is fully integrated into the FAN/FCF framework and equally suitable for UCP connection pools.

Technically speaking, Oracle database instances require the initialisation parameters local_listener and remote_listener to be set. In RAC databases this is usually the case out of the box; however, in addition to the SCAN, the remote_listener must include the CMAN listener as well – an example is provided later in this post.
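
As a minimal sketch – assuming an spfile and the LOCAL_CMANDB and REMOTE_CMANDB tnsnames.ora aliases defined later in this post – the parameters could be set like this:

sqlplus / as sysdba <<EOF
alter system set local_listener='LOCAL_CMANDB' scope=both sid='*';
alter system set remote_listener='REMOTE_CMANDB' scope=both sid='*';
EOF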

Installing Connection Manager

Connection Manager is now part of the Oracle client, and you can install it by choosing the “custom” option. From the list of selectable options, pick “Oracle Net Listener” and “Oracle Connection Manager”.

From there on it’s exactly the same as any other client installation.

Testing

A quick test with 2 separate networks reveals that the concept actually works. The following hosts are used:

  • cman      192.168.99.224
  • cmandb    192.168.100.225
  • client    192.168.99.31

The networks in use are:

  • Public network:  192.168.99.0/24
  • Private network: 192.168.100.0/24

As you can see CMANDB is on a different network than the other hosts.

Connection manager has been installed on host “cman”, with IP 192.168.99.224. The listener process has been configured to listen on port 1821. The corresponding cman.ora file has been configured as follows in $CLIENT_HOME/network/admin:

cman1 =
  (configuration=
    (address=(protocol=tcp)(host=192.168.99.224)(port=1821))
    (rule_list=
      (rule=(src=*)(dst=127.0.0.1)(srv=cmon)(act=accept))
      (rule=(src=192.168.99.0/24)(dst=192.168.100.225)(srv=*)(act=accept))
    )
  )

The file has been left in this minimalistic state deliberately. The only connection possible is to the database host. The cmon service must be allowed, otherwise the startup of the Connection Manager processes will fail.
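
Starting and stopping the Connection Manager processes is done with cmctl – along these lines, with cman1 being the instance name from the cman.ora above:

cmctl startup -c cman1          # start the CMAN listener and gateway processes
cmctl show services -c cman1    # verify the gateway (cmgw) and cmon services are up
cmctl shutdown -c cman1         # e.g. during a UAT refresh, to cut the path between the networks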

The gateway host has 2 network interfaces, one for each network:

[root@cman ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:F2:34:56
inet addr:192.168.99.224  Bcast:192.168.99.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:1209 errors:0 dropped:0 overruns:0 frame:0
TX packets:854 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:105895 (103.4 KiB)  TX bytes:147462 (144.0 KiB)
Interrupt:17

eth1      Link encap:Ethernet  HWaddr 00:16:3E:52:4A:56
inet addr:192.168.100.224  Bcast:192.168.100.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:334 errors:0 dropped:0 overruns:0 frame:0
TX packets:151 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:36425 (35.5 KiB)  TX bytes:22141 (21.6 KiB)
Interrupt:16

Its routing table is defined as follows:

[root@cman ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.100.0   0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.101.0   0.0.0.0         255.255.255.0   U     0      0        0 eth2
192.168.99.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth2
[root@cman ~]

Configuration on host “CMANDB”

Note that host 192.168.100.225 is on the private network! The database CMANDB has its local and remote listener configured using the below entries in tnsnames.ora:

[oracle@cmandb admin]$ cat tnsnames.ora
LOCAL_CMANDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(PORT = 1521)(HOST = 192.168.100.225))
    )
  )

REMOTE_CMANDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(PORT = 1821)(HOST = 192.168.99.224))
      (ADDRESS = (PROTOCOL = TCP)(PORT = 1521)(HOST = 192.168.100.225))
    )
  )

CMAN_LSNR =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(PORT = 1521)(HOST = 192.168.99.224))
    )
  )

  • local listener is set to local_cmandb
  • remote listener is set to remote_cmandb
  • CMAN_LSNR is used for a test to verify that a connection to the CMAN listener is possible from this host

The host had only one network interface, connecting to the CMAN host:

[root@cmandb ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:16:3E:12:14:51
inet addr:192.168.100.225  Bcast:192.168.100.255  Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:627422 errors:0 dropped:0 overruns:0 frame:0
TX packets:456584 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2241758430 (2.0 GiB)  TX bytes:32751153 (31.2 MiB)

The only change to the system was the addition of a default gateway. Unfortunately the CMAN process cannot listen on more than one IP address:

# route add default gw 192.168.100.224 eth0

The following routing table was in use during the testing:

[root@cmandb ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.100.0   0.0.0.0         255.255.255.0   U     0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     0      0        0 eth0
0.0.0.0         192.168.100.224 0.0.0.0         UG    0      0        0 eth0

After the database listener parameters have been changed to LOCAL_CMANDB and REMOTE_CMANDB, the alert log on the CMAN host recorded a number of service registrations. This is important as it allows Connection Manager to hand the connection request off to the database:

[oracle@cman trace]$ grep cmandb *
08-JUL-2011 10:19:55 * service_register * cmandb * 0
08-JUL-2011 10:25:28 * service_update * cmandb * 0
08-JUL-2011 10:35:28 * service_update * cmandb * 0
[...]
08-JUL-2011 11:15:10 * service_update * cmandb * 0
08-JUL-2011 11:15:28 * service_update * cmandb * 0
[oracle@cman trace]$

Additionally, the CMAN processes now know about the database service CMANDB:

CMCTL:cman1> show services
Services Summary...
Proxy service "cmgw" has 1 instance(s).
Instance "cman", status READY, has 2 handler(s) for this service...
Handler(s):
"cmgw001" established:0 refused:0 current:0 max:256 state:ready

(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=23589))
"cmgw000" established:0 refused:0 current:0 max:256 state:ready

(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=31911))
Service "cmandb" has 1 instance(s).
Instance "cmandb", status READY, has 1 handler(s) for this service...
Handler(s):
"DEDICATED" established:0 refused:0 state:ready
REMOTE SERVER
(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(PORT=1521)(HOST=192.168.100.225)))
Service "cmandbXDB" has 1 instance(s).
Instance "cmandb", status READY, has 1 handler(s) for this service...
Handler(s):
"D000" established:0 refused:0 current:0 max:1022 state:ready
DISPATCHER 
(ADDRESS=(PROTOCOL=tcp)(HOST=cmandb.localdomain)(PORT=50347))
Service "cmon" has 1 instance(s).
Instance "cman", status READY, has 1 handler(s) for this service...
Handler(s):
"cmon" established:1 refused:0 current:1 max:4 state:ready

(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=59541))
The command completed successfully.
CMCTL:cman1>

Connectivity Test

With the setup completed it was time to perform a test from a third host on the (public) network. Its IP address is 192.168.99.31. The below tnsnames.ora entries were created:

[oracle@client admin]$ cat tnsnames.ora
CMAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PORT=1821)(HOST=192.168.99.224)(PROTOCOL=TCP))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = cmandb)
      (SERVER = DEDICATED)
    )
  )

DIRECT =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PORT=1521)(HOST=192.168.100.225)(PROTOCOL=TCP))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = cmandb)
      (SERVER = DEDICATED)
    )
  )

The CMAN entry uses the Connection Manager gateway host to connect to database CMANDB, whereas the DIRECT entry tries to bypass the latter. A tnsping should show whether or not this is possible.

[oracle@client admin]$ tnsping direct
TNS Ping Utility for Linux: Version 11.2.0.1.0 - Production on 08-JUL-2011 11:19:47

Copyright (c) 1997, 2009, Oracle.  All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PORT=1521)(HOST=192.168.100.225)(PROTOCOL=TCP))) (CONNECT_DATA = (SERVICE_NAME = cmandb) (SERVER = DEDICATED)))

TNS-12543: TNS:destination host unreachable

[oracle@client admin]$ tnsping cman

TNS Ping Utility for Linux: Version 11.2.0.1.0 - Production on 08-JUL-2011 10:53:27

Copyright (c) 1997, 2009, Oracle.  All rights reserved.

Used parameter files:

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PORT=1821)(HOST=192.168.99.224)(PROTOCOL=TCP))) (CONNECT_DATA = (SERVICE_NAME = cmandb) (SERVER = DEDICATED)))

OK (0 msec)

[oracle@client admin]$

Now that proves that a direct connection is impossible, and also that the Connection Manager’s listener is working. A successful tnsping doesn’t imply that a connection is possible though; this requires an end-to-end test with SQL*Plus:

[oracle@client admin]$ sqlplus system/xxx@cman

SQL*Plus: Release 11.2.0.1.0 Production on Fri Jul 8 10:55:39 2011

Copyright (c) 1982, 2009, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select instance_name,host_name from v$instance;

INSTANCE_NAME
----------------
HOST_NAME
----------------------------------------------------------------
cmandb
cmandb.localdomain

The successful connection is also recorded in the CMAN log file:

Fri Jul 08 10:55:39 2011

08-JUL-2011 10:55:39 * (CONNECT_DATA=(SERVICE_NAME=cmandb)(SERVER=DEDICATED)(CID=(PROGRAM=sqlplus@client)(HOST=client)(USER=oracle))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.99.31)(PORT=16794)) * establish * cmandb * 0

(LOG_RECORD=(TIMESTAMP=08-JUL-2011 10:55:39)(EVENT=Ready)(CONN NO=0))

Additional date for my Grid Infrastructure seminar in the UK

I was very pleasantly surprised that Oracle University are offering another day for my “Grid Infrastructure and Database High Availability Deep Dive” seminar. In addition to the imminent seminars in June (I blogged about them earlier), this one is in London, England. For anyone interested, here is the link:

http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getCourseDesc?dc=D70365_1439069&p_org_id=28&lang=US

The date has been set to October 10th, so there is plenty of time still, but nevertheless I hope to see you there!


Tuning Log and Trace levels for Clusterware 11.2

With the introduction of Clusterware 11.2 a great number of command line tools have either been deprecated ($ORA_CRS_HOME/bin/crs_* and others) or merged into other tools. This is especially true for crsctl, which is now the tool to access and manipulate low level resources in Clusterware.

This also implies that some of the notes on Metalink are no longer applicable to Clusterware 11.2, such as the one detailing how to get more detailed information in the logs. Not that the log information wasn’t already rather comprehensive if you asked me…

And here comes a warning: don’t change the log levels unless you have a valid reason, or under the instructions of support. Higher log levels than the defaults tend to generate too much data, filling up the GRID_HOME and potentially killing the node.

Log File Location

The location for logs in Clusterware hasn’t changed much since the unified log structure was introduced in 10.2 and documented in “CRS and 10g/11.1 Real Application Clusters (Doc ID 259301.1)”. It has been extended though, and quite dramatically so in 11.2, which is documented as well in one of the better notes from support: “11gR2 Clusterware and Grid Home – What You Need to Know (Doc ID 1053147.1)”

The techniques for getting debug and trace information as described, for example, in “Diagnosability for CRS / EVM / RACG (Doc ID 357808.1)” don’t really apply any more as the syntax has changed.

Getting Log Levels in 11.2

If you are interested in the log level at which a specific Clusterware resource operates, you can use the crsctl get log res resourceName call, as in this example:

# crsctl get log res ora.asm
Get Resource ora.asm Log Level: 1

Don’t forget to apply the “-init” flag when you want to query resources which are part of the lower stack:

[root@lonengbl312 ~]# crsctl get log res ora.cssd
CRS-4655: Resource ‘ora.cssd’ could not be found.
CRS-4000: Command Get failed, or completed with errors.
[root@lonengbl312 ~]# crsctl get log res ora.cssd -init
Get Resource ora.cssd Log Level: 1

Interestingly, most Clusterware daemons write a detailed message to their log file at startup, stating which module uses which logging level. Take CSSD for example:

2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = CSSD, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCNM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCGM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GIPCCM, LogLevel = 2, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = CLSF, LogLevel = 0, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = SKGFD, LogLevel = 0, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = GPNP, LogLevel = 1, TraceLevel = 0
2011-06-02 09:06:14.847: [    CSSD][1531968800]clsu_load_ENV_levels: Module = OLR, LogLevel = 0, TraceLevel = 0

This doesn’t match the output of the previous command. Now how do you check this on the command line? crsctl to the rescue again – it now has a “log” option:

# crsctl get log -h
Usage:
  crsctl get {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>,..."
 where
   mdns          multicast Domain Name Server
   gpnp          Grid Plug-n-Play Service
   css           Cluster Synchronization Services
   crf           Cluster Health Monitor
   crs           Cluster Ready Services
   ctss          Cluster Time Synchronization Service
   evm           EventManager
   gipc          Grid Interprocess Communications
   <name1>, ...  Module names ("all" for all names)

  crsctl get log res <resname>
 where
   <resname>     Resource name

That’s good information, but the crucial bit about the <name1> and other parameters is missing. The next question I asked myself was “how do I find out which sub-modules are part of, say, crsd – apart from looking at the trace file?”. That information used to be documented in the OTN reference, but it’s now something you can query from Clusterware as well:

[root@lonengbl312 ~]# crsctl lsmodules -h
Usage:
  crsctl lsmodules {mdns|gpnp|css|crf|crs|ctss|evm|gipc}
 where
   mdns  multicast Domain Name Server
   gpnp  Grid Plug-n-Play Service
   css   Cluster Synchronization Services
   crf   Cluster Health Monitor
   crs   Cluster Ready Services
   ctss  Cluster Time Synchronization Service
   evm   EventManager
   gipc  Grid Interprocess Communications

Back to the question as to which modules make up CSSD, I got this answer from Clusterware, matching the log file output:

# crsctl lsmodules css
List CSSD Debug Module: CLSF
List CSSD Debug Module: CSSD
List CSSD Debug Module: GIPCCM
List CSSD Debug Module: GIPCGM
List CSSD Debug Module: GIPCNM
List CSSD Debug Module: GPNP
List CSSD Debug Module: OLR
List CSSD Debug Module: SKGFD

To get detailed information about the log level of a specific CSSD module, I can now finally use crsctl get {log|trace} with the daemon and module name. My CSSD logfile stated that CSSD:OLR had log level 0 and trace level 0, confirmed by Clusterware:

# crsctl get log css "OLR"
Get CSSD Module: OLR  Log Level: 0

Note that the name of the module has to be in upper case. If you are lazy like me you could use the “all” keyword instead of the module name to list all the information in one command:

# crsctl get log css all
Get CSSD Module: CLSF  Log Level: 0
Get CSSD Module: CSSD  Log Level: 2
Get CSSD Module: GIPCCM  Log Level: 2
Get CSSD Module: GIPCGM  Log Level: 2
Get CSSD Module: GIPCNM  Log Level: 2
Get CSSD Module: GPNP  Log Level: 1
Get CSSD Module: OLR  Log Level: 0
Get CSSD Module: SKGFD  Log Level: 0

Setting Log Levels

Let’s have a look at how to increase log levels for the critical components. I am picking CSSD here, simply because I want to know if Oracle records the chicken-and-egg problem of having voting files in ASM in the ocssd.log file. (CSSD needs the voting files to start, and ASM needs CSSD to run, but if the voting files are in ASM, then we have gone full circle.) With all the information provided in the section above, that’s actually quite simple to do.

# crsctl set log -h
Usage:
  crsctl set {log|trace} {mdns|gpnp|css|crf|crs|ctss|evm|gipc} "<name1>=<lev1>,..."
 where
   mdns          multicast Domain Name Server
   gpnp          Grid Plug-n-Play Service
   css           Cluster Synchronization Services
   crf           Cluster Health Monitor
   crs           Cluster Ready Services
   ctss          Cluster Time Synchronization Service
   evm           EventManager
   gipc          Grid Interprocess Communications
   <name1>, ...  Module names ("all" for all names)
   <lev1>, ...   Module log/trace levels

  crsctl set log res <resname>=<lev>
 where
   <resname>     Resource name
   <lev>         Agent log levels

The object I would like to research is the voting file discovery. I know that the ASM disks have an entry in the header indicating whether or not a disk contains a voting file, in the kfdhdb.vfstart and kfdhdb.vfend fields as shown here:

# kfed read /dev/oracleasm/disks/OCR1  | grep kfdhdb.vf
kfdhdb.vfstart:                     256 ; 0x0ec: 0x00000100
kfdhdb.vfend:                       288 ; 0x0f0: 0x00000120

With the default log information, I can see the voting disk discovery happening as shown in this cut down version of the ocssd.log:

2011-06-02 09:06:19.941: [    CSSD][1078901056]clssnmvDDiscThread: using discovery string  for initial discovery
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Discovery with str::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]UFS discovery with ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]OSS discovery with ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Discovery with asmlib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: str ::
2011-06-02 09:06:19.941: [   SKGFD][1078901056]Fetching asmlib disk :ORCL:OCR1:

2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d2a60 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR1:
2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d3290 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR2:
2011-06-02 09:06:19.942: [   SKGFD][1078901056]Handle 0x58d3ac0 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk :ORCL:OCR3:

2011-06-02 09:06:19.953: [    CSSD][1078901056]clssnmvDiskVerify: discovered a potential voting file
2011-06-02 09:06:19.953: [   SKGFD][1078901056]Handle 0x5c22670 from lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: for disk  :ORCL:OCR1:
2011-06-02 09:06:19.954: [    CSSD][1078901056]clssnmvDiskVerify: Successful discovery for disk ORCL:OCR1, UID 4700d9a1-93094ff5-bf8b96e1-7efdb883, Pending CIN 0:1301401766:0, Committed CIN 0:1301401766:0
2011-06-02 09:06:19.954: [   SKGFD][1078901056]Lib :ASM:/opt/oracle/extapi/64/asm/orcl/1/libasm.so: closing handle 0x5c22670 for disk :ORCL:OCR1:

2011-06-02 09:06:19.956: [    CSSD][1078901056]clssnmvDiskVerify: Successful discovery of 3 disks
2011-06-02 09:06:19.956: [    CSSD][1078901056]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2011-06-02 09:06:19.957: [    CSSD][1078901056]clssnmCompleteVFDiscovery: Completing voting file discovery

Now I’m setting the new log level for SKGFD, as this seems to be responsible for the disk discovery.

# crsctl set log css "SKGFD:5"
Set CSSD Module: SKGFD  Log Level: 5

Does it make a difference? Restarting Clusterware should give more clues. Looking at the logfile I could tell there was no difference, so I tried setting the trace level for the same module instead:

# crsctl get trace css SKGFD
Get CSSD Module: SKGFD  Trace Level: 5

Interestingly that wasn’t picked up!

$ grep "Module = SKGFD" ocssd.log | tail -n1
2011-06-02 11:31:06.707: [    CSSD][427846944]clsu_load_ENV_levels: Module = SKGFD, LogLevel = 2, TraceLevel = 0

So I decided to leave it at this point as I was running out of time for this test. I’d be interested to hear if anyone has successfully raised the log level for this component. On the other hand, I wonder if it is ever necessary to do so. I can confirm however that raising the log level for the CSSD module created a lot more output – actually so much that it seems to equal a system-wide SQL trace – so don’t change log levels unless support directs you to.


A more user friendly multipath.conf

During some recent work I did involving a stretched 11.2.0.2 RAC for a SAP implementation at a customer site, I researched TimeFinder/Clone backups. As part of this exercise I have been able to experiment with RHEL (OEL) 5.6 and the new device-mapper-multipath package on the mount host. I have been very pleasantly surprised by this new feature, which I’d like to share.

Background of this article

Device Mapper Multipath is the “native” Linux multipathing software, as opposed to vendor-supplied multipathing such as EMC’s Power Path or Hitachi’s HDLM.

My customer’s setup is rather unique for a SAP environment as it uses Oracle Enterprise Linux and not Solaris/SPARC or AIX on the Power platform with an active/passive solution. Well if that doesn’t make it sound unique, the fact that there is a plan to run Oracle 11.2.0.2 RAC potentially across sites using ASM and ACFS certainly makes this deployment stand out from the rest.

The reason I mention this is simple – it requires a lot of engineering effort to certify components for this combination. For example, it was not trivial to get vendor support for the Solutions Enabler software the storage engineering team uses for connectivity with the VMAX arrays, and so on. After all, Oracle Enterprise Linux is a fairly new platform and Red Hat certainly has an advantage when it comes to vendor certification.

What hasn’t been achieved was a certification of the EMC Power Path software for use with SAP on Linux, for reasons unknown to me. The result was simple: the setup will use the device-mapper multipath package that comes with the Linux distribution.

Configuring multipath

Now with this said I started looking at MOS to get relevant support notes about Linux’s native multipathing and found some; the summary of this research is available in earlier articles on this blog.

What I didn’t know until now was that the multipath.conf file allows you to define the ownership and mode of a device. As an ASM user this is very important to me. Experience taught me that incorrect permissions are one of the main reasons for ASM failing to start. Remember that root owns block devices by default.

Consider this example:

multipaths {
  ...
  multipath {
    wwid   360a98000486e58526c34515944703277
    alias  ocr01
    mode   660
    uid    501
    gid    502
  }
  ...
}

The above entry in the multipaths section translates into English as follows:

  • If you encounter a device with the WWID 360a…277
  • Give it an alias name of “ocr01” in /dev/mapper
  • Set the mode to 660 (i.e. rw-rw----)
  • Assign device ownership to the user with UID 501 (maps to “grid” on my system)
  • Assign the group of the device to GID 502 (maps to “asmdba” on my system)

The path settings are defined globally and don’t need to be mentioned explicitly for each device unless you prefer to override them. I like to use an alias, although it isn’t really necessary since ASM relies on a 4k header in the block device to store its identity. If you don’t choose to alias a device, I recommend you use the user-friendly name instead, mainly for aesthetic reasons.
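
A quick way to check that the alias, ownership and mode have been applied is sketched below – the device name is taken from the example above, and the exact reload command may vary with your distribution:

/etc/init.d/multipathd reload    # or restart: re-read /etc/multipath.conf (RHEL/OEL 5)
multipath -ll ocr01              # list the paths behind the aliased device
ls -lL /dev/mapper/ocr01         # should show grid:asmdba (uid 501/gid 502) and mode 660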

Why is this really cool? For me it implies two things:

  • I can now be independent of ASMLib which provided device name stability and set the permissions on the block devices correctly for ASM
  • I don’t need to create udev rules to set the permissions (or /etc/rc.local or whichever other way you chose before to set permissions)

Nice! One less headache, as I have to say that I didn’t really like udev…

How to use the RAC FAN Java API

I researched an interesting new feature available with Oracle 11g Release 2, the so-called RAC FAN API, when writing the workload management chapter for the RAC book. The RAC FAN API is documented in the Oracle® Database JDBC Developer’s Guide, 11g Release 2 (11.2), available online, but the initial documentation following the 11.2.0.1 release on Linux was pretty useless. The good news is that it has improved!

The RAC FAN Java API

The aim of this API is to allow a Java application to listen to FAN events by creating a subscription to the RAC nodes’ ONS processes. The application then registers a FANListener, based on the subscription, which can pick up instances of the following events:

  • ServiceDownEvent
  • NodeDownEvent
  • LoadAdvisoryEvent

All of these are in the oracle.simplefan namespace, the javadoc reference of which you can find in the official documentation.

When it was initially released in 11.2.0.1 I tried to get the FAN subscription to work without any luck; the registration with the ONS didn’t work. Schematically, this is how it should work:

The application (shown on the top) requires an Oracle client for the ONS configuration and libraries, and the setup is very similar to how you’d set up FCF. With the ons.config file pointing to the RAC nodes’ ONS processes you should be able to read FAN events as they occur on the source system. The ons.config in $CLIENT_HOME/opmn/conf/ons.config has to contain the local and remote ONS port, as well as all the RAC nodes with their remote ONS port. When subscribing to the events, you specify the service name you’d like to receive events for.

My 4-node RAC cluster comprises the nodes rac11gr2drnode{1,2,3,4}, with the local ONS listening on port 6200 for external and port 6100 for internal requests. This can be checked using the onsctl debug command on any cluster node. The client’s ons.config contains these lines:

localport:6100
remoteport:6200
nodes=rac11gr2drnode1:6200, rac11gr2drnode2:6200,rac11gr2drnode3:6200, rac11gr2drnode4:6200

Unfortunately I couldn’t subscribe to the remote ONS – I know that my setup is valid because of the UCP/FCF testing I did previously (check my earlier blog postings about how to get started with UCP and Tomcat 6). Or maybe it doesn’t work on Windows, who knows? Trying to subscribe to the remote ONS, my application bails out with the following error:

Exception in thread “main” oracle.ons.SubscriptionException: Subscription request timed out after 30000 millseconds.  Possible causes: OPMN may not be running, you may have an OPMN running in an alternate ORACLE_HOME using duplicate port values, or OPMN may be misconfigured.
at oracle.ons.SubscriptionNotification.waitForReply(SubscriptionNotification.java:83)
at oracle.ons.ONS.addSubscriber(ONS.java:956)
at oracle.ons.Subscriber.realStartup(Subscriber.java:103)
at oracle.ons.Subscriber.<init>(Subscriber.java:79)
at oracle.simplefan.impl.FanSubscription.subscribe(FanSubscription.java:228)
at oracle.simplefan.impl.FanSubscription.createFanSubscription(FanSubscription.java:46)
at oracle.simplefan.impl.FanManager.subscribe(FanManager.java:120)
at de.mbh.TestClass.<init>(TestClass.java:21)
at de.mbh.TestClass.main(TestClass.java:53)

I tried my favourite search engine on the Exception but couldn’t find any useful information. This is not a big deal for me, I can start a local ONS on the client. This is shown here:

C:\oracle\product\11.2.0\client_1\opmn\bin>onsctl ping
ons is not running ...

C:\oracle\product\11.2.0\client_1\opmn\bin>onsctl start
onsctl start: ons started

C:\oracle\product\11.2.0\client_1\opmn\bin>onsctl debug
HTTP/1.1 200 OK
Content-Length: 3627
Content-Type: text/html
Response:

== client:6200 5872 11/05/26 11:47:16 ==
Home: c:\oracle\product\11.2.0\client_1

======== ONS ========
IP ADDRESS                   PORT    TIME   SEQUENCE  FLAGS
--------------------------------------- ----- -------- -------- --------
10.xxx.xx.98  6200 4dde2fb0 00000006 00000008

Listener:

TYPE                BIND ADDRESS               PORT  SOCKET
-------- --------------------------------------- ----- ------
Local                                  127.0.0.1  6100    344
Remote                                       any  6200    352

Servers: (3)

INSTANCE NAME                  TIME   SEQUENCE  FLAGS     DEFER
---------------------------------------- -------- -------- -------- ----------
dbInstance_rac11gr2drnode1_6200          4d53d6a4 00029496 00000002          0
10.xxx.xx.155 6200
dbInstance_rac11gr2drnode2_6200          4d3eb9e2 0006cbeb 00000002          0
10.xxx.xx157 6200
dbInstance_rac11gr2drnode3_6200 4d5937d6 0001990f 00000002          0
10.xxx.xx.158 6200

Connection Topology: (4)

IP                      PORT   VERS  TIME
--------------------------------------- ----- ----- --------
10.xxx.xx.158  6200     4 4dde2fb2
**                           10.xxx.xx.155 6200
**                           10.xxx.xx.157 6200
**                            10.xxx.xx.98 6200
10.xxx.xx.157  6200     4 4dde2fb2
**                           10.xxx.xx.155 6200
**                           10.xxx.xx.158 6200
**                            10.xxx.xx.98 6200
10.xxx.xx.155  6200     4 4dde2fb2
**                            10.xxx.xx.98 6200
**                           10.xxx.xx.157 6200
**                           10.xxx.xx.158 6200
10.xxx.xx.98  6200     4 4dde2fb2=
**                           10.xxx.xx.155 6200
**                           10.xxx.xx.158 6200
**                           10.xxx.xx.157 6200

Server connections:

ID            CONNECTION ADDRESS              PORT  FLAGS  SENDQ REF WSAQ
-------- --------------------------------------- ----- ------ ----- --- ----
0                           10.xxx.xx.155  6200 010405 00000 001  ---
1                           10.xxx.xx.156  6200 002405 00000 001
2                           10.xxx.xx.157  6200 010405 00000 001  --
3                           10.xxx.xx.158  6200 010405 00000 001  ---

Client connections:

ID            CONNECTION ADDRESS              PORT  FLAGS  SENDQ REF SUB W
-------- --------------------------------------- ----- ------ ----- --- --- -
4                                internal     0 01008a 00000 001 002
request                               127.0.0.1  6100 03201a 00000 001 000

Worker Ticket: 28/28, Last: 11/05/26 11:47:15

THREAD   FLAGS
-------- --------
120 00000012
124 00000012
128 00000012

Resources:

Notifications:
Received: Total 12 (Internal 6), in Receive Q: 0
Processed: Total 12, in Process Q: 0

Pool Counts:
Message: 1, Link: 1, Ack: 1, Match: 1

C:\oracle\product\11.2.0\client_1\opmn\bin>

With a local ONS started on my client, I can actually subscribe to the ONS and make use of the events. The easiest way is to simply decode the load balancing events, as I did in my code, shown below (modified version of the code in the Oracle documentation to make it work):

package de.mbh;

import oracle.simplefan.FanSubscription;
import oracle.simplefan.FanEventListener;
import oracle.simplefan.FanManager;
import oracle.simplefan.LoadAdvisoryEvent;
import oracle.simplefan.NodeDownEvent;
import oracle.simplefan.ServiceDownEvent;
import java.util.Properties;

public class TestClass {

    TestClass() {
        System.out.println("Hello");

        Properties p = new Properties();
        p.put("serviceName", "OEMSRV");

        System.setProperty("oracle.ons.oraclehome", "c:\\oracle\\product\\11.2.0\\client_1");
        System.out.println(System.getProperty("oracle.ons.oraclehome"));

        FanSubscription sub = FanManager.getInstance().subscribe(p);

        System.out.println("I'm subscribed!");

        sub.addListener(new FanEventListener() {

            public void handleEvent(ServiceDownEvent arg0) {
                System.out.println("Service Down registered!");
            }

            public void handleEvent(NodeDownEvent arg0) {
                System.out.println("Node Down Event Registered");
            }

            public void handleEvent(LoadAdvisoryEvent arg0) {
                System.out.println("Just got a Load Advisory event");

                System.out.println("originating database: " + arg0.getDatabaseUniqueName());
                System.out.println("originating instance: " + arg0.getInstanceName());
                System.out.println("Service Quality     : " + arg0.getServiceQuality());
                System.out.println("Percent             : " + arg0.getPercent());
                System.out.println("Service Name        : " + arg0.getServiceName());
                System.out.println("Service Quality     : " + arg0.getServiceQuality());
                System.out.println("Observed at         : " + arg0.getTimestamp() + "\n\n");
            }
        });
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        TestClass tc = new TestClass();

        int i = 0;
        while (i < 100000) {
            try {
                Thread.sleep(100);
                i++;
            } catch (Exception e) {
                System.out.println(e);
            }
        }

        System.out.println("execution ended");
    }

}

When compiling or executing the code, you need the simplefan.jar and ons.jar files in your classpath, and chances are that you need ojdbc6.jar as well.
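
On my Windows client that translated into something like the commands below – note that the jar locations are an assumption and depend on where your client installation keeps ons.jar, simplefan.jar and ojdbc6.jar, so adjust the paths accordingly:

set CP=c:\oracle\product\11.2.0\client_1\opmn\lib\ons.jar;c:\oracle\product\11.2.0\client_1\jdbc\lib\simplefan.jar;c:\oracle\product\11.2.0\client_1\jdbc\lib\ojdbc6.jar
javac -cp %CP% de\mbh\TestClass.java
java -cp %CP%;. de.mbh.TestClass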

Starting the application reveals Load Balancing events being read:

Hello
c:\oracle\product\11.2.0\client_1
I'm subscribed!
Just got a Load Advisory event
originating database: LNGDS1RD
originating instance: dbInstance_rac11gr2drnode3_6200
Service Quality     : -1
Percent             : 51
Service Name        : OEMSRV
Service Quality     : -1
Observed at         : Thu May 26 11:56:03 BST 2011

Just got a Load Advisory event
originating database: LNGDS1RD
originating instance: dbInstance_rac11gr2drnode3_6200
Service Quality     : -1
Percent             : 49
Service Name        : OEMSRV
Service Quality     : -1
Observed at         : Thu May 26 11:56:03 BST 2011

This proves that you can write your own connection cache, which allows you to react to *down events and rebalance your session differently.

Offloading production backups to a different storage array

For quite a while Oracle DBAs have performed split mirror backups using special devices called “Business Continuance Volumes” or BCVs for short. A BCV is a special mirror copy of a LUN on the same storage array as the primary copy.

In a BCV backup scenario, the storage administrator (usually) “splits” the mirror after putting the database into hot backup mode. After the mirror is split, the database is taken out of hot backup mode and resumes normal operation. A new Oracle instance on a different host can be mounted using the split mirror copy of the database for backups. The use of this technology for refreshing a test environment is out of scope of this article. The below figure demonstrates the idea:

The advantage of such an approach is that the backup operation, initiated from the mount host, should not impact the performance of the production database. Once the backup is complete, the BCV for the ARCH disk group should be re-synchronised with the source LUN, whereas the DATA disk group should not. This allows us to quickly recover from problems with the primary LUNs – more on that in a later post.

One Step Further

On my current site this process has been refined. One of the requirements was that SRDF should be used as a means for disaster recovery. I should say that SRDF, or Symmetrix Remote Data Facility, is my customer’s preferred method for DR, and I by no means want to advertise for EMC here; it just so happened that I was working on EMC storage for this project.

I should also note that Data Guard cannot be used due to an application constraint (an ACFS file system is integral part of the application and must be backed up together with the database).

All storage is managed by ASM, which in many ways makes life easier for me. The ASM LUNs or “disks” will have all the required information in the “disk” header. So after the cloned LUNs have been presented to the mount host, all I need to do is make them available to the OS, and optionally run an “oracleasm scandisks” as root to detect them. From then on I should be able to simply mount the disk groups (either via SQL*Plus or srvctl in 11.2). The actual backup requires a few more steps; these are shown below.

Before going further into detail let’s have a look at the architecture first:

Two data centres are in use: the local one is used for production in normal operations, including backups. One of the design requirements was that backups can be taken in either data centre, with respect to the DR situation.

Split mirror backups as shown in the above figure are taken in the local data centre in normal operations. In case of DR, the remote data centre will be configured to take backups. For this to happen, it is necessary to clone the replicated LUNs (which would be activated in the DR event), much as is done for the local data centre’s split mirror backups. As an added advantage, the clones can be used to create pre- and post-batch “backups” that would be activated in case of a horrible failure of the batch/end-of-year processing.

Taking the Backups

To be able to take a backup of the split mirror, a few things are necessary. Most of these are documented in MOS note “RMAN and Split Mirror Disk Backups [ID 302615.1]”. You certainly require a recovery catalogue database in the first place. As a first step you register the database in the catalog. You perform this step connected to the production database and the recovery catalog, as shown in the below example:

$ rman target / catalog=rman/rmanPwd@rccat

Recovery Manager: Release 11.2.0.2.0 - Production on Fri May 20 10:39:00 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1796248120)
connected to recovery catalog database

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN> exit

If you like, you can configure defaults at this stage, I opted for the following (still connected to the primary database and recovery catalog):

RMAN> configure retention policy to redundancy 3;

new RMAN configuration parameters:
CONFIGURE RETENTION POLICY TO REDUNDANCY 3;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO SBT_TAPE;

new RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
new RMAN configuration parameters are successfully stored

RMAN> configure controlfile autobackup off;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE sbt_tape to '%F';

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F';
new RMAN configuration parameters are successfully stored

RMAN> configure device type sbt_tape parallelism 4 backup type to backupset;

new RMAN configuration parameters:
CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 4 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters are successfully stored

RMAN> exit

Now switch over to the mount host for some real work.

The high level steps for taking a backup of the “split mirror” are:

  1. Ensure that the cloned LUNs are presented to the mount host’s operating system, including appropriate zoning on the fabric
  2. Ensure that the multi-pathing solution of choice is correctly configured for all paths to the cloned LUNs
  3. Ensure that the ASM disks are known to the local ASM instance. That may include running an /etc/init.d/oracleasm scandisks as root, or putting the relevant rules into /etc/udev/rules.d/

These steps have been quite generic and will depend on your OS and storage stack. I won’t go into detail, but you might find some relevant bits and pieces on my blog if you are interested.
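
Just to give an idea, on a Linux mount host using ASMLib and device-mapper-multipath the checks might boil down to something like the following sketch (to be adapted to your storage stack):

multipath -ll                      # are all cloned LUNs visible with the expected paths?
/etc/init.d/oracleasm scandisks    # as root: pick up the ASMLib labels on the cloned LUNs
/etc/init.d/oracleasm listdisks    # the disks backing DATA and ARCH should now be listed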

The next step is to make the ASM disks known to ASM. Initially that has to be done via the command line; from 11.1 onwards you have to connect as SYSASM:

SQL> alter diskgroup DGName mount;

For Oracle 11.2 this automatically creates a resource for diskgroup DGName in the OCR which is very convenient (especially in RAC environments). Next time, all you need to do is to execute “srvctl start diskgroup -g DGName” as the grid software owner.

For pre 11.2 environments you might want to consider updating the “asm_diskstring” initialisation parameter accordingly.

Once the ASM disk groups your database requires are mounted in ASM, it’s time to mount the database. If not done so already, register the database in the OCR and ensure that you add the spfile option as well, as in this example (Oracle RAC users would add instances additionally):

$ srvctl add database -d PROD -o $ORACLE_HOME -c SINGLE -p '+data/PROD/spfilePROD.ora' \
> -s mount -a DATA,ARCH

As per said MOS document, it is required to start the database using a BACKUP controlfile on the mount host. Otherwise you’d end up with these RMAN errors after the first backup:

RMAN-3014: Implicit resync of recovery catalog failed
RMAN-6038: Recovery catalog package detected an error
RMAN-20035: Invalid high RECID error

To do so, create a backup controlfile on the primary database before splitting the mirror. The process is very much the same as the one you would use for a physical standby database:

SQL> alter database backup controlfile to '/tmp/backup.ctl';

Database altered.

This controlfile now needs to be made available to the mount host – an elegant way would be to use DBMS_FILE_TRANSFER to perform this task, or asmcmd’s copy command.
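
Failing that, a plain copy over the network works just as well – for example (the host name mounthost is made up for illustration):

$ scp /tmp/backup.ctl oracle@mounthost:/tmp/PROD/backup.ctl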

Once it’s on the mount host, say in /tmp/PROD/backup.ctl, it needs to be made available to Oracle. The easiest way is to use RMAN for this:

Connect to the mirror instance as SYSDBA; do not connect to the recovery catalog.

$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Fri May 20 10:53:40 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1796248120, not open)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database dismounted
Oracle instance shut down

RMAN> startup nomount
connected to target database (not started)
Oracle instance started

Total System Global Area    9219969024 bytes

Fixed Size                     2234056 bytes
Variable Size               4630513976 bytes
Database Buffers            4563402752 bytes
Redo Buffers                  23818240 bytes

RMAN> restore controlfile from '/tmp/PROD/backup.ctl';

Starting restore at 20-MAY-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=200 instance=PROD device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+ARCH/prod/controlfile/current.256.737725307
output file name=+DATA/prod/controlfile/current.256.737725307
Finished restore at 20-MAY-11

If you were really desperate to back up the CURRENT controlfile (which is part of the clone), you could have done this prior to the restore of the backup controlfile. You must not connect to the recovery catalog in this case – see MOS note 1111369.1 for more information.

Querying V$DATABASE.CONTROLFILE_TYPE should now return “BACKUP”. With the setup completed, you are ready to back the database up from the cloned LUNs. Connect to RMAN again, using the backup instance and the recovery catalog and initiate the backup. For example:

run {
allocate channel t1 type sbt_tape parms 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
allocate channel t2 type sbt_tape parms 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
backup database;
}

This completes the backup of the database. One caveat exists: the control file is not backed up as part of this process – RMAN doesn’t back up a “backup” controlfile:

$ rman target / catalog rman/rmanPwd@rccat

Recovery Manager: Release 11.2.0.2.0 - Production on Tue May 24 12:51:22 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: GB1 (DBID=1796248120, not open)
connected to recovery catalog database

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> run {
2> allocate channel t1 type sbt_tape parms
3> 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
3> backup tablespace USERS;
4> }

allocated channel: t1
channel t1: SID=199 instance=PROD device type=SBT_TAPE
channel t1: Data Protection for Oracle: version 5.5.1.0

Starting backup at 24-MAY-11
channel t1: starting full datafile backup set
channel t1: specifying datafile(s) in backup set
input datafile file number=00006 name=+DATA/prod/datafile/users.264.737727697
channel t1: starting piece 1 at 24-MAY-11
channel t1: finished piece 1 at 24-MAY-11
piece handle=1cmd4ott_1_1 tag=TAG20110524T125316
comment=API Version 2.0,MMS Version 5.5.1.0
channel t1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-MAY-11

RMAN-06497: WARNING: control file is not current, control file AUTOBACKUP skipped
released channel: t1

RMAN>  exit

It’s not difficult at all to get around this problem. As part of the regular archive log backups you are performing on the production database anyway, you add a “backup current controlfile” and a command to resync the catalog (“resync catalog”).
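
A minimal sketch of such a job on the production side might look like this – channel allocation as in the earlier examples, to be adapted to your media manager and archivelog deletion policy:

rman target / catalog rman/rmanPwd@rccat <<EOF
run {
  allocate channel t1 type sbt_tape parms 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
  backup archivelog all;
  backup current controlfile;
}
resync catalog;
exit
EOF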

At this stage I should note that the backup process wouldn’t be different in principle if additional backups were to be taken at the DR site (licensing questions aside). Instead of cloning the primary LUNs in the local data centre, the storage admins would clone the replicated LUNs (the “R2s” in EMC-talk) and bring these clones up on the DR mount host.

Summary

The process described above was interesting from my personal point of view as I hadn’t used this concept before. I’m an Oracle DBA, and I feel most comfortable when I have things in hand; relying on another team for BAU database tasks is a new experience.

The process description deliberately left product names out unless they were part of the command output. The concept is quite universal and is by no means tied down to a specific vendor.

One of the requirements is that your tape solution can talk to the primary host(s) and the mount host in both the local and remote data centres. I have seen sites (including rather large ones) where the DR site was not configured to take backups. Although you could argue that you shouldn’t operate from your DR site for long, that’s by no means an excuse for not having backups. Suppose you run into block corruption: how would you recover from that without a backup? But then the storage team usually argues that block corruption doesn’t happen on their high-end arrays.

References

  • RMAN and Split Mirror Disk Backups [ID 302615.1]
  • RMAN Backup of Controlfile fails from Split Mirror / BCV Copy. Is there any way to take controlfile backup from Split Mirror / BCV so that the Production DB is not used for Controlfile backup? [ID 1111369.1]
  • Supported Backup, Restore and Recovery Operations using Third Party Snapshot Technologies [ID 604683.1]

A quick word of warning about database PSU 11.2.0.2.2

I have been playing around with Grid Infrastructure 11.2.0.2 PSU 2 and found an interesting note on My Oracle Support regarding the Patch Set Update. This reminds me that it’s always a good idea to search for the patch number on Metalink before applying a PSU. It also seems to be a good idea to wait a few days before trying a PSU (or maybe CPU) on your DEV environment for the first time (and don’t even think about applying a PSU to production without thorough testing!)

OK, back to the story: there is a known issue with the patch set, which has to do with the change in mutex wait behaviour introduced by the PSU. To quote MOS note “Oracle Database Patch Set Update 11.2.0.2.2 Known Issues (Doc ID 1291879.1)”, Patch 12431716 Is a Recommended Patch for 11.2.0.2.2. In fact, Oracle strongly recommends applying the patch for Bug 12431716 – Unexpected change in mutex wait behavior in 11.2.0.2.2 PSU (higher CPU possible).

In a nutshell, not applying the patch can cause your system to suffer from excessive CPU usage and more mutex contention than expected. More information can be found in the description of Bug 12431716 (Mutex waits may cause higher CPU usage in 11.2.0.2.2 PSU / GI PSU), which is worth reading.
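If you want to check whether the one-off is already present in a given home, opatch lsinventory will tell you; a quick sketch, run against each home:

$ $ORACLE_HOME/OPatch/opatch lsinventory | grep 12431716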

Besides this, the PSU was applied without any problems to my four-node cluster. I just wish there was a way to roll out a new version of OPatch to all cluster nodes’ $GRID_HOME and $ORACLE_HOME in one command. The overall process for the PSU is the same as already described in my previous post about Bundle Patch 3:

  1. Get the latest version of OPatch
  2. Deploy OPatch to $GRID_HOME and $ORACLE_HOME (ensure permissions are set correctly for the OPatch in $GRID_HOME!)
  3. Unzip the PSU (Bug 11724916 – 11.2.0.2.2 Patch Set Update (PSU) (Doc ID 11724916.8)), for example to /tmp/PSU
  4. Change directory to where you unzipped (/tmp/PSU) and become root
  5. Ensure that $GRID_HOME/OPatch is part of the path
  6. Read the readme
  7. Create an OCM response file and save it to say, /tmp/ocm.rsp
  8. Start the patch as root: run opatch auto and supply the full path to the OCM response file (/tmp/ocm.rsp); see the short sketch after this list
  9. Apply the aforementioned one-off patch
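A minimal sketch of steps 7 and 8, with the file locations taken from the list above (emocmrsp ships in the OPatch/ocm/bin directory). As the software owner:

$ $GRID_HOME/OPatch/ocm/bin/emocmrsp -no_banner -output /tmp/ocm.rsp

Then as root, with $GRID_HOME/OPatch in the PATH:

# cd /tmp/PSU
# opatch auto /tmp/PSU -ocmrf /tmp/ocm.rsp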

Then wait; after a little while spent tailing the logfile in $GRID_HOME/cfgtoollogs/ and having a coffee, the process eventually finishes. Repeat on each node and you’re done. I’m really happy those long readme files with 8 steps to be performed, partially as root and partially as CRS owner/RDBMS owner, are gone. It reduces the time it takes to apply a PSU significantly.

Happy patching!

Pro Oracle Database 11g RAC on Linux is available for Kindle

Admittedly I haven’t checked for a little while, but an email from my co-author Steve Shaw prompted me to go to the Amazon website and look it up.

And yes, it’s reality! Our book is now finally available as a Kindle version; how great is that?!

There isn’t really a lot more to say about this subject. I wonder how many techies are interested in the Kindle version after the PDF has been out for quite a while. If you read this and decide to get the Kindle version, could you please let me know how you liked it? Personally I think the book, being mostly text, is well suited to the device.

Happy reading!

Slight 11.2.0.2 caveat with TTS and missing XMLDB

A small post about a problem I encountered a little while ago, quite interesting nevertheless.

The background story is that, for security reasons, my customer’s databases are all created with “Oracle Database Catalog Views” and “Oracle Database Packages and Types” only. Anything else has to be requested by the application team. I like the idea in principle, and it is certainly a lot better than all those databases with UltraSearch, Spatial and Text installed, which open the door to vulnerabilities and exploits. Have a look at the problems fixed by Critical Patch Updates and you will know what I am talking about.

As part of a project I have been asked to transport a tablespace from such a database to another host. Everything is fine until you try to export the tablespace metadata with expdp. As you know, anyone using traditional exp is a bit backward ;)
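For completeness: before the metadata export, the tablespace set has to be self-contained and read only. A minimal preparation sketch (using the tablespace name from the example below):

SQL> exec dbms_tts.transport_set_check('TTS_TEST', true);
SQL> select * from transport_set_violations;
SQL> alter tablespace tts_test read only;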

[oracle@node1 ~] $ expdp / dumpfile=tts_metadata.dmp directory=data_pump_dir logfile=tts_metadata.log \
> transport_tablespaces='TTS_TEST'
Export: Release 11.2.0.2.0 - Production on Tue May 10 16:05:49 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Starting "ORACLE"."SYS_EXPORT_TRANSPORTABLE_01":  /******** dumpfile=tts_metadata.dmp logfile=tts_metadata.log
  directory=data_pump_dir transport_tablespaces=TTS_TEST
Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.UNLOAD_METADATA [PLUGTS_BLK]
ORA-00904: "Q"."TABOBJ_NUM": invalid identifier
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPW$WORKER", line 8358
----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
0x15e53a900     19208  package body SYS.KUPW$WORKER
0x15e53a900      8385  package body SYS.KUPW$WORKER
0x15e53a900      2935  package body SYS.KUPW$WORKER
0x15e53a900      9054  package body SYS.KUPW$WORKER
0x15e53a900      1688  package body SYS.KUPW$WORKER
0x155a4e638         2  anonymous block
Job "ORACLE"."SYS_EXPORT_TRANSPORTABLE_01" stopped due to fatal error at 16:05:57
[oracle@node1 ~]

This is due to bug 10185688, which is fixed in 12.1, in 11.2.0.3.0, or by one-off patch 10185688. Oracle 12.1 and 11.2.0.3 are not available at the time of writing.
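If you want to check up front which components a database actually has registered, a quick query against the component registry helps; a minimal sketch:

SQL> select comp_name, version, status from dba_registry order by comp_name;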

You will only encounter this problem if your database doesn’t have XML DB and its dependent options installed. I wonder how many databases out there have options in DBA_SERVER_REGISTRY which aren’t used. When I was working at IGEFI, we only introduced XML DB to Multifonds after customers requested the ability to load XML files, prior to the move from 9i to 10g (that was in 2006). I still don’t understand why you’d prefer XML input files over CSV files for external tables, but that wasn’t my call either.

By the way, the old backward exp command doesn’t have this restriction!

[oracle@node1 ~] $ exp file=tts.dmp log=tts.log TRANSPORT_TABLESPACE=y TABLESPACES=TTS_TEST
Export: Release 11.2.0.2.0 - Production on Tue May 10 16:20:56 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Username: / as sysdba
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Export done in US7ASCII character set and AL16UTF16 NCHAR character set
server uses AL32UTF8 character set (possible charset conversion)
Note: table data (rows) will not be exported
About to export transportable tablespace metadata...
For tablespace TTS_TEST ...
. exporting cluster definitions
. exporting table definitions
. . exporting table                           TEST
. exporting referential integrity constraints
. exporting triggers
. end transportable tablespace metadata export
Export terminated successfully without warnings.

Not a problem then, but quite interesting.