Little things worth knowing: auto-DOP or hint-which one takes precedence?

This is just another short post about one of the little things worth knowing. Assume you are on 12.1.0.2, for example, and you want to know if parallel_degree_limit reliably keeps the degree of parallelism at bay for all sessions and queries across the board. After all, every machine has finite CPU (and other) resources that shouldn’t be exhausted. Setting parallel_degree_limit of course only has an effect if you set parallel_degree_policy to limited, auto or adaptive. The latter parameter is the Big Knob to enable Automatic Degree Of Parallelism (DOP), and this post assumes that Auto DOP is enabled for the session. In this configuration, what happens to hinted statements?
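To make the question concrete, here is a minimal sketch of the kind of test it boils down to; the table name T1, the limit of 8 and the hinted DOP of 16 are made up for illustration:

SQL> alter session set parallel_degree_policy = auto;
SQL> alter session set parallel_degree_limit = 8;
SQL> select /*+ parallel(16) */ count(*) from t1;

The note section of the execution plan then tells you whether the computed or the hinted degree of parallelism was used.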

Documentation

The documentation (12c Reference Guide, section on parallel_degree_limit) states that you can provide an integer in addition to “CPU” and “IO”:

WARNING: FAILED to load library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so

This post is long overdue, and it’s as much a reminder to myself as it is a post for you to find when scratching your head.

Here’s the situation: Oracle Restart on an Oracle Enterprise Linux 5.5 system with ASMLib in use. I have installed 11.2.0.2 plus the latest PSU to bring database and Grid Infrastructure to 11.2.0.2.5, so far so good. I have used the non-GUI approach and everything seemed OK until I tried to create a database for some benchmarking. Here’s the surprising output:

A more user friendly multipath.conf

During some recent work involving a stretched 11.2.0.2 RAC for a SAP implementation at a customer site I researched TimeFinder/Clone backups. As part of this exercise I have been able to experiment with RHEL (OEL) 5.6 and the new device-mapper-multipath package on the mount host. I have been very pleasantly surprised by this new feature, which I’d like to share.

Background of this article

Device Mapper Multipath is the “native” Linux multipathing software, as opposed to vendor-supplied multipathing such as EMC’s PowerPath or Hitachi’s HDLM.

My customer’s setup is rather unique for a SAP environment as it uses Oracle Enterprise Linux rather than Solaris/SPARC or AIX on the Power platform with an active/passive solution. If that doesn’t make it sound unique, the fact that there is a plan to run Oracle 11.2.0.2 RAC potentially across sites using ASM and ACFS certainly makes this deployment stand out from the rest.

The reason I mention this is simple: it requires a lot of engineering effort to certify components for this combination. For example, it was not trivial to get vendor support for Solutions Enabler, which the storage engineers use for connectivity with the VMAX arrays, and so on. After all, Oracle Enterprise Linux is a fairly new platform and Red Hat certainly has an advantage when it comes to vendor certification.

What hasn’t been achieved is certification of the EMC PowerPath software for use with SAP on Linux, for reasons unknown to me. The result was simple: the setup will use the device-mapper-multipath package that comes with the Linux distribution.

Configuring multipath

With this said, I started looking at MOS for relevant support notes about Linux’s native multipathing and found some. The summary of this research is available on this blog; I have written about it in these articles:

What I didn’t know until now was the fact that the multipath.conf file allows you to define the ownership and mode of a device. As an ASM user this is very important to me: experience taught me that incorrect permissions are one of the main reasons for ASM failing to start. Remember that root owns block devices by default.

Consider this example:

multipaths {
    ...
    multipath {
        wwid    360a98000486e58526c34515944703277
        alias   ocr01
        mode    660
        uid     501
        gid     502
    }
    ...
}

The above entry in the multipaths section translates into English as follows:

  • If you encounter a device with the WWID 360a…277
  • Give it an alias name of “ocr01” in /dev/mapper
  • Set the mode to 660 (i.e. rw-rw----)
  • Assign device ownership to the user with UID 501 (maps to “grid” on my system)
  • Assign the group of the device to 502 (maps to “asmdba” on my system)

The path settings are defined globally and don’t need to be mentioned explicitly for each device unless you prefer to override them. I like to use an alias although it isn’t strictly necessary, since ASM relies on a 4k header in a block device to store its identity. If you don’t choose to alias a device I recommend you use the user-friendly name instead, mainly for aesthetic reasons.
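The global section I am referring to is the defaults section of multipath.conf. A minimal sketch, with illustrative values only (follow your array vendor’s recommendations):

defaults {
    user_friendly_names yes
    path_grouping_policy failover
}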

Why is this really cool? For me it implies two things:

  • I can now be independent of ASMLib, which provided device name stability and set the permissions on the block devices correctly for ASM
  • I don’t need to create udev rules to set the permissions (or /etc/rc.local or whichever other method you chose before to set permissions); a sketch of the rc.local variant this replaces is shown below
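For comparison, the /etc/rc.local approach would have been something along these lines, reusing the names from the example above (a sketch only):

# excerpt from /etc/rc.local: re-apply ASM device permissions after each boot
chown grid:asmdba /dev/mapper/ocr01
chmod 660 /dev/mapper/ocr01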

Nice! One less headache, as I have to say that I didn’t really like udev…

ASM normal redundancy and failure groups spanning SANs

Julian Dyke has started an interesting thread on the Oak Table mailing list after the latest UKOUG RAC and HA SIG. Unfortunately I couldn’t attend that event; I wish I had, as I knew it would be great.

Anyway, the question revolved around an ASM disk group created with normal redundancy spanning two storage arrays. This should in theory protect against the failure of an array, although at a high price: all ASM disks exported from one array would form one failure group. Remember that the disks in a failure group all fail if the supporting infrastructure (network, HBA, controller etc.) fails. So what would happen with such a setup if you followed these steps:

  • Shut down the array for failure group 2
  • Stop the database
  • Shut down the second array – failure group 1
  • Do some more maintenance…
  • Start up the failure group B SAN
  • Start the database
  • Start up the failure group A SAN

ASM can tolerate the failure of one failgroup (capacity permitting), so the failure of failgroup 2 should not bring the disk group down, which would result in immediate loss of service. But what happens if it comes up after the data in the other failure group has been modified? Will there be data corruption?

Replaying

To simulate two storage arrays, my distinguished filer01 and filer02 OpenFiler appliances have been used, each exporting two approx. 4G “LUNs” to my database host. At this time I only had access to my 11.1.0.7 two-node RAC system; if time permits I’ll repeat this with 10.2.0.5 and 11.2.0.2. The RAC cluster in the SIG presentation was 10.2. I am skipping the bit about the LUN creation and presentation to the hosts, and assume the following setup:

[root@rac11gr1node1 ~]# iscsiadm -m session
tcp: [1] 192.168.99.51:3260,1 iqn.2006-01.com.openfiler:filer02DiskB
tcp: [2] 192.168.99.50:3260,1 iqn.2006-01.com.openfiler:filer01DiskA
tcp: [3] 192.168.99.51:3260,1 iqn.2006-01.com.openfiler:filer02DiskA
tcp: [4] 192.168.99.50:3260,1 iqn.2006-01.com.openfiler:filer01DiskB

192.168.99.50 is my first OpenFiler instance, 192.168.99.51 the second. As you can see, each exports DISKA and DISKB. Mapped to the hosts, this is the target mapping (use iscsiadm --mode session --print 3 to find out):

  • filer02DiskB: /dev/sda
  • filer01DiskA: /dev/sdb
  • filer02DiskA: /dev/sdc
  • filer01DiskB: /dev/sdd

I am using ASMLib (as always in the lab) to label these disks:

[root@rac11gr1node1 ~]# oracleasm listdisks
DATA1
DATA2
FILER01DISKA
FILER01DISKB
FILER02DISKA
FILER02DISKB

DATA1 and DATA2 will not play a role in this article; I’m interested in the other disks. Assuming that the scandisks command completed on all nodes, I can add the disks to the new disk group:

SQL> select path from v$asm_disk

PATH
--------------------------------------------------------------------------------
ORCL:FILER01DISKA
ORCL:FILER01DISKB
ORCL:FILER02DISKA
ORCL:FILER02DISKB
ORCL:DATA1
ORCL:DATA2

Let’s create the disk group. The important part is to create one failure group per storage array. By the way, this is no different from extended distance RAC!

SQL> create diskgroup fgtest normal redundancy
 2  failgroup filer01 disk 'ORCL:FILER01DISKA', 'ORCL:FILER01DISKB'
 3  failgroup filer02 disk 'ORCL:FILER02DISKA', 'ORCL:FILER02DISKB'
 4  attribute 'compatible.asm'='11.1';

Diskgroup created.

With that done, let’s have a look at the ASM disk information:

SQL> select MOUNT_STATUS,HEADER_STATUS,STATE,REDUNDANCY,FAILGROUP,PATH from v$asm_disk where group_number=2;

MOUNT_S HEADER_STATU STATE    REDUNDA FAILGROUP                      PATH
------- ------------ -------- ------- ------------------------------ --------------------
CACHED  MEMBER       NORMAL   UNKNOWN FILER01                        ORCL:FILER01DISKA
CACHED  MEMBER       NORMAL   UNKNOWN FILER01                        ORCL:FILER01DISKB
CACHED  MEMBER       NORMAL   UNKNOWN FILER02                        ORCL:FILER02DISKA
CACHED  MEMBER       NORMAL   UNKNOWN FILER02                        ORCL:FILER02DISKB

I have set the disk repair time to 24 hours and raised compatible parameters for RDBMS and ASM to 11.1, resulting in these attributes:

SQL> select * from v$asm_attribute

NAME                           VALUE                GROUP_NUMBER ATTRIBUTE_INDEX ATTRIBUTE_INCARNATION READ_ON SYSTEM_
------------------------------ -------------------- ------------ --------------- --------------------- ------- -------
disk_repair_time               3.6h                            2               0                     1 N       Y
au_size                        1048576                         2               8                     1 Y       Y
compatible.asm                 11.1.0.0.0                      2              20                     1 N       Y
compatible.rdbms               11.1.0.0.0                      2              21                     1 N       Y

Unlike 11.2, where disk groups are managed as Clusterware resources, 11.1 requires you to either mount them manually or append the new disk group to ASM_DISKGROUPS. You should query gv$asm_diskgroup.state to ensure the new disk group is mounted on all cluster nodes.
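A sketch of what that looks like; the list of existing disk groups in ASM_DISKGROUPS is an assumption, adjust it to your environment:

SQL> alter diskgroup fgtest mount;
SQL> alter system set asm_diskgroups='DATA','FGTEST' sid='*';
SQL> select inst_id, name, state from gv$asm_diskgroup where name = 'FGTEST';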

I need some data! A small demo database can be restored into the new disk group to provide some experimental playground. This is quite easily done by using an RMAN duplicate with the correct {db|log}_file_name_convert parameters set.
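A rough sketch of the approach rather than the exact commands; the source disk group name +DATA and the connection details are assumptions:

# in the auxiliary instance's parameter file
*.db_file_name_convert='+DATA','+FGTEST'
*.log_file_name_convert='+DATA','+FGTEST'

# then, connected to the target and the auxiliary instance
RMAN> duplicate target database to rac11g nofilenamecheck;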

Mirror

The disk group is created with normal redundancy, which means that ASM will create a mirror for every primary extent, taking failure groups into consideration. I wanted to ensure that the data is actually mirrored on the new disk group, which has group number 2. I need to get this information from V$ASM_FILE and V$ASM_ALIAS:

SQL> select * from v$asm_file where group_number = 2

GROUP_NUMBER FILE_NUMBER COMPOUND_INDEX INCARNATION BLOCK_SIZE     BLOCKS      BYTES      SPACE TYPE                 REDUND STRIPE CREATION_ MODIFICAT R
------------ ----------- -------------- ----------- ---------- ---------- ---------- ---------- -------------------- ------ ------ --------- --------- -
 2         256       33554688   747669775      16384       1129   18497536   78643200 CONTROLFILE          HIGH   FINE   05-APR-11 05-APR-11 U
 2         257       33554689   747669829       8192      69769  571547648 1148190720 DATAFILE             MIRROR COARSE 05-APR-11 05-APR-11 U
 2         258       33554690   747669829       8192      60161  492838912  990904320 DATAFILE             MIRROR COARSE 05-APR-11 05-APR-11 U
 2         259       33554691   747669829       8192      44801  367009792  739246080 DATAFILE             MIRROR COARSE 05-APR-11 05-APR-11 U
 2         260       33554692   747669831       8192      25601  209723392  424673280 DATAFILE             MIRROR COARSE 05-APR-11 05-APR-11 U
 2         261       33554693   747669831       8192        641    5251072   12582912 DATAFILE             MIRROR COARSE 05-APR-11 05-APR-11 U
 2         262       33554694   747670409        512     102401   52429312  120586240 ONLINELOG            MIRROR FINE   05-APR-11 05-APR-11 U
 2         263       33554695   747670409        512     102401   52429312  120586240 ONLINELOG            MIRROR FINE   05-APR-11 05-APR-11 U
 2         264       33554696   747670417        512     102401   52429312  120586240 ONLINELOG            MIRROR FINE   05-APR-11 05-APR-11 U
 2         265       33554697   747670417        512     102401   52429312  120586240 ONLINELOG            MIRROR FINE   05-APR-11 05-APR-11 U
 2         266       33554698   747670419       8192       2561   20979712   44040192 TEMPFILE             MIRROR COARSE 05-APR-11 05-APR-11 U

11 rows selected.

SQL> select * from v$asm_alias where group_NUMBER=2

NAME                           GROUP_NUMBER FILE_NUMBER FILE_INCARNATION ALIAS_INDEX ALIAS_INCARNATION PARENT_INDEX REFERENCE_INDEX A S
------------------------------ ------------ ----------- ---------------- ----------- ----------------- ------------ --------------- - -
RAC11G                                    2  4294967295       4294967295           0                 3     33554432        33554485 Y Y
CONTROLFILE                               2  4294967295       4294967295          53                 3     33554485        33554538 Y Y
current.256.747669775                     2         256        747669775         106                 3     33554538        50331647 N Y
DATAFILE                                  2  4294967295       4294967295          54                 1     33554485        33554591 Y Y
SYSAUX.257.747669829                      2         257        747669829         159                 1     33554591        50331647 N Y
SYSTEM.258.747669829                      2         258        747669829         160                 1     33554591        50331647 N Y
UNDOTBS1.259.747669829                    2         259        747669829         161                 1     33554591        50331647 N Y
UNDOTBS2.260.747669831                    2         260        747669831         162                 1     33554591        50331647 N Y
USERS.261.747669831                       2         261        747669831         163                 1     33554591        50331647 N Y
ONLINELOG                                 2  4294967295       4294967295          55                 1     33554485        33554644 Y Y
group_1.262.747670409                     2         262        747670409         212                 1     33554644        50331647 N Y
group_2.263.747670409                     2         263        747670409         213                 1     33554644        50331647 N Y
group_3.264.747670417                     2         264        747670417         214                 1     33554644        50331647 N Y
group_4.265.747670417                     2         265        747670417         215                 1     33554644        50331647 N Y
TEMPFILE                                  2  4294967295       4294967295          56                 1     33554485        33554697 Y Y
TEMP.266.747670419                        2         266        747670419         265                 1     33554697        50331647 N Y

My USERS tablespace, which I am interested in most, has file number 261. I chose it for this example as it’s only 5M in size; taking my 1 MB allocation unit into account, it means I don’t have to trawl through thousands of lines of output when getting the extent map.

Credit where credit is due: the next queries are partly based on the excellent work by Luca Canali from CERN, who has looked at ASM internals for a while. Make sure you have a look at the excellent reference available here: https://twiki.cern.ch/twiki/bin/view/PDBService/ASM_Internals. So, to answer the question of where the extents making up my USERS tablespace are, we need to have a look at X$KFFXP, i.e. the file extent pointers view:

SQL> select GROUP_KFFXP,DISK_KFFXP,AU_KFFXP from x$kffxp where number_kffxp=261 and group_kffxp=2 order by disk_kffxp;

GROUP_KFFXP DISK_KFFXP   AU_KFFXP
----------- ---------- ----------
 2          0        864
 2          0        865
 2          0        866
 2          1        832
 2          1        831
 2          1        833
 2          2        864
 2          2        866
 2          2        865
 2          3        832
 2          3        833
 2          3        831

12 rows selected.

As you can see, I have a number of extents, all evenly spread over my disks. I can verify that this information is correct by querying the X$KFDAT view as well, which contains similar information but is more related to the disk-to-AU mapping:

SQL> select GROUP_KFDAT,NUMBER_KFDAT,AUNUM_KFDAT from x$kfdat where fnum_kfdat = 261 and group_kfdat=2

GROUP_KFDAT NUMBER_KFDAT AUNUM_KFDAT
----------- ------------ -----------
 2            0         864
 2            0         865
 2            0         866
 2            1         831
 2            1         832
 2            1         833
 2            2         864
 2            2         865
 2            2         866
 2            3         831
 2            3         832
 2            3         833

12 rows selected.

OK, so I am confident that my data is actually mirrored, otherwise the following test would not make any sense. I have double-checked that the disks in failgroup FILER01 actually belong to my OpenFiler “filer01”, and the same for filer02. Going back to the original scenario:

Shut down Filer02

This will take down all the disks of failure group FILER02. Two minutes after taking the filer down I checked that it was indeed shut down:

martin@dom0:~> sudo xm list | grep filer
filer01                                    183   512     1     -b----   1159.6
filer02                                          512     1              1179.6
filer03                                    185   512     1     -b----   1044.4

Yes, no doubt about it: it’s down. What would the effect be? Surely I/O errors, but I wanted to force a check. Connected to +ASM2 I issued the “select * from v$asm_disk” command. This caused quite significant logging in the instance’s alert.log:

NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
Wed Apr 06 17:17:39 2011
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKA disk:0x0.0x97954459 au:0
 iop:0x2b2997b61330 bufp:0x2b29977b3e00 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKB disk:0x1.0x9795445a au:0
 iop:0x2b2997b61220 bufp:0x2b29977b0200 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
Wed Apr 06 17:17:58 2011
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKB disk:0x1.0x9795445a au:0
 iop:0x2b2997b61440 bufp:0x2b29977a6400 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKA disk:0x0.0x97954459 au:0
 iop:0x2b2997b61550 bufp:0x2b29977b1600 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
Wed Apr 06 17:18:03 2011
WARNING: Disk (FILER02DISKA) will be dropped in: (86400) secs on ASM inst: (2)
WARNING: Disk (FILER02DISKB) will be dropped in: (86400) secs on ASM inst: (2)
GMON SlaveB: Deferred DG Ops completed.
Wed Apr 06 17:19:26 2011
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKB disk:0x1.0x9795445a au:0
 iop:0x2b2997b61550 bufp:0x2b29977b1600 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKA disk:0x0.0x97954459 au:0
 iop:0x2b2997b61440 bufp:0x2b29977b0200 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
Wed Apr 06 17:20:10 2011
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKA disk:0x0.0x97954459 au:0
 iop:0x2b2997b61000 bufp:0x2b29977a6400 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:0, diskname:ORCL:FILER02DISKB disk:0x1.0x9795445a au:0
 iop:0x2b2997b61550 bufp:0x2b29977b1600 offset(bytes):0 iosz:4096 operation:1(Read) synchronous:0
 result: 4 osderr:0x3 osderr1:0x2e pid:6690
Wed Apr 06 17:21:07 2011
WARNING: Disk (FILER02DISKA) will be dropped in: (86217) secs on ASM inst: (2)
WARNING: Disk (FILER02DISKB) will be dropped in: (86217) secs on ASM inst: (2)
GMON SlaveB: Deferred DG Ops completed.
Wed Apr 06 17:27:15 2011
WARNING: Disk (FILER02DISKA) will be dropped in: (85849) secs on ASM inst: (2)
WARNING: Disk (FILER02DISKB) will be dropped in: (85849) secs on ASM inst: (2)
GMON SlaveB: Deferred DG Ops completed.

The interesting lines are the ones reading “all mirror sides found readable, no repair required”; they appear in the second instance’s alert.log shown below. So taking down the failgroup didn’t cause an outage. The other ASM instance complained as well a little later:

2011-04-06 17:16:58.393000 +01:00
NOTE: initiating PST update: grp = 2, dsk = 2, mode = 0x15
NOTE: initiating PST update: grp = 2, dsk = 3, mode = 0x15
kfdp_updateDsk(): 24
kfdp_updateDskBg(): 24
WARNING: IO Failed. subsys:/opt/oracle/extapi/64/asm/orcl/1/libasm.so dg:2, diskname:ORCL:FILER02DISKA disk:0x2.0x97a6d9f7 au:1
 iop:0x2b9ea4855e70 bufp:0x2b9ea4850a00 offset(bytes):1052672 iosz:4096 operation:2(Write) synchronous:1
 result: 4 osderr:0x3 osderr1:0x2e pid:870
NOTE: group FGTEST: updated PST location: disk 0000 (PST copy 0)
2011-04-06 17:17:03.508000 +01:00
NOTE: ASMB process exiting due to lack of ASM file activity for 5 seconds
NOTE: PST update grp = 2 completed successfully
NOTE: initiating PST update: grp = 2, dsk = 2, mode = 0x1
NOTE: initiating PST update: grp = 2, dsk = 3, mode = 0x1
kfdp_updateDsk(): 25
kfdp_updateDskBg(): 25
2011-04-06 17:17:07.454000 +01:00
NOTE: group FGTEST: updated PST location: disk 0000 (PST copy 0)
NOTE: PST update grp = 2 completed successfully
NOTE: cache closing disk 2 of grp 2: FILER02DISKA
NOTE: cache closing disk 3 of grp 2: FILER02DISKB
SUCCESS: extent 0 of file 267 group 2 repaired by offlining the disk
NOTE: repairing group 2 file 267 extent 0
SUCCESS: extent 0 of file 267 group 2 repaired - all mirror sides found readable, no repair required
2011-04-06 17:19:04.526000 +01:00
GMON SlaveB: Deferred DG Ops completed.
2011-04-06 17:22:07.487000 +01:00
GMON SlaveB: Deferred DG Ops completed.

There was no interruption of service though, which is good: the GV$ASM_CLIENT view reported all database instances still connected.

SQL> select * from gv$asm_client;

 INST_ID GROUP_NUMBER INSTANCE_NAME                                                    DB_NAME  STATUS       SOFTWARE_VERSIO COMPATIBLE_VERS
---------- ------------ ---------------------------------------------------------------- -------- ------------ --------------- ---------------
 2            2 rac11g2                                                          rac11g   CONNECTED    11.1.0.7.0      11.1.0.0.0
 1            2 rac11g1                                                          rac11g   CONNECTED    11.1.0.7.0      11.1.0.0.0

The result in the V$ASM_DISK view was as follows:

SQL> select name,state,header_status,path,failgroup from v$asm_disk;

NAME                           STATE    HEADER_STATU PATH                                               FAILGROUP
------------------------------ -------- ------------ -------------------------------------------------- ------------------------------
 NORMAL   UNKNOWN      ORCL:FILER02DISKA
 NORMAL   UNKNOWN      ORCL:FILER02DISKB
DATA1                          NORMAL   MEMBER       ORCL:DATA1                                         DATA1
DATA2                          NORMAL   MEMBER       ORCL:DATA2                                         DATA2
FILER01DISKA                   NORMAL   MEMBER       ORCL:FILER01DISKA                                  FILER01
FILER01DISKB                   NORMAL   MEMBER       ORCL:FILER01DISKB                                  FILER01
FILER02DISKA                   NORMAL   UNKNOWN                                                         FILER02
FILER02DISKB                   NORMAL   UNKNOWN                                                         FILER02

8 rows selected.

As I expected, the disks for failgroup filer02 are gone, and so is the information about the failure group. My disk repair time should be high enough to protect me from having to rebuild the whole disk group. Now I’m really curious whether my database can become corrupted, so I’ll increase the SCN.

[oracle@rac11gr1node1 ~]$ . setsid.sh rac11g
[oracle@rac11gr1node1 ~]$ sq

SQL*Plus: Release 11.1.0.7.0 - Production on Wed Apr 6 17:24:18 2011

Copyright (c) 1982, 2008, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
 5999304

SQL> begin
 2   for i in 1..5 loop
 3    execute immediate 'alter system switch logfile';
 4   end loop;
 5  end;
 6  /

PL/SQL procedure successfully completed.

SQL> select current_scn from v$database;

CURRENT_SCN
-----------
 5999378

SQL>

Back to the test case.

Stop the Database

[oracle@rac11gr1node1 ~]$ srvctl stop database -d rac11g
[oracle@rac11gr1node1 ~]$ srvctl status database -d rac11g
Instance rac11g2 is not running on node rac11gr1node1
Instance rac11g1 is not running on node rac11gr1node2

Done, this part was simple. Next in the scenario the other array is shut down, filer01 in my case. To prevent bad things from happening I’ll shut down ASM on all nodes first. I hope that doesn’t invalidate the test, but I can’t see how ASM would not run into problems if the other failgroup went down as well.

Shut down Filer01 and start Filer02

Also quite simple. Shutting down this filer will allow me to follow the story. After filer01 was down I started filer02. I’m curious as to how ASM will react. I have deliberately NOT put disk group FGTEST into ASM_DISKGROUPS; I want to mount it manually to get a better understanding of what happens.

After having started ASM on both nodes, I queried V$ASM_DISK and tried to mount the disk group:

SQL> select disk_number,name,state,header_status,path,failgroup from v$asm_disk;

DISK_NUMBER NAME                           STATE    HEADER_STATU PATH                                               FAILGROUP
----------- ------------------------------ -------- ------------ -------------------------------------------------- ------------------------------
 0                                NORMAL   MEMBER       ORCL:FILER02DISKA
 1                                NORMAL   MEMBER       ORCL:FILER02DISKB
 2                                NORMAL   UNKNOWN      ORCL:FILER01DISKA
 3                                NORMAL   UNKNOWN      ORCL:FILER01DISKB
 0 DATA1                          NORMAL   MEMBER       ORCL:DATA1                                         DATA1
 1 DATA2                          NORMAL   MEMBER       ORCL:DATA2                                         DATA2

6 rows selected.

Oops, now they are both gone…

SQL> alter diskgroup fgtest mount;
alter diskgroup fgtest mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "1" is missing
ORA-15042: ASM disk "0" is missing
ORA-15080: synchronous I/O operation to a disk failed
ORA-15080: synchronous I/O operation to a disk failed

OK, I have a problem here. Both ASM instances report I/O errors with the FGTEST disk group, and I can’t mount it. That means I can’t mount the database either; in a way it proves I won’t have corruption. But neither will I have a database, so which is worse?

Can I get around this problem?

I think I’ll have to start filer01 and see if that makes a difference. Hopefully I can recover my system with the information in failgroup filer01. Soon after filer01 came online I tried the query against v$asm_disk again and tried to mount the disk group.

SQL> select disk_number,name,state,header_status,path,failgroup from v$asm_disk;

DISK_NUMBER NAME                           STATE    HEADER_STATU PATH                                               FAILGROUP
----------- ------------------------------ -------- ------------ -------------------------------------------------- ------------------------------
 0                                NORMAL   MEMBER       ORCL:FILER02DISKA
 1                                NORMAL   MEMBER       ORCL:FILER02DISKB
 2                                NORMAL   MEMBER       ORCL:FILER01DISKA
 3                                NORMAL   MEMBER       ORCL:FILER01DISKB
 0 DATA1                          NORMAL   MEMBER       ORCL:DATA1                                         DATA1
 1 DATA2                          NORMAL   MEMBER       ORCL:DATA2                                         DATA2

6 rows selected.

That worked!

Wed Apr 06 17:45:32 2011
SQL> alter diskgroup fgtest mount
NOTE: cache registered group FGTEST number=2 incarn=0x72c150d7
NOTE: cache began mount (first) of group FGTEST number=2 incarn=0x72c150d7
NOTE: Assigning number (2,0) to disk (ORCL:FILER01DISKA)
NOTE: Assigning number (2,1) to disk (ORCL:FILER01DISKB)
NOTE: Assigning number (2,2) to disk (ORCL:FILER02DISKA)
NOTE: Assigning number (2,3) to disk (ORCL:FILER02DISKB)
Wed Apr 06 17:45:33 2011
NOTE: start heartbeating (grp 2)
kfdp_query(): 12
kfdp_queryBg(): 12
NOTE: cache opening disk 0 of grp 2: FILER01DISKA label:FILER01DISKA
NOTE: F1X0 found on disk 0 fcn 0.0
NOTE: cache opening disk 1 of grp 2: FILER01DISKB label:FILER01DISKB
NOTE: cache opening disk 2 of grp 2: FILER02DISKA label:FILER02DISKA
NOTE: F1X0 found on disk 2 fcn 0.0
NOTE: cache opening disk 3 of grp 2: FILER02DISKB label:FILER02DISKB
NOTE: cache mounting (first) group 2/0x72C150D7 (FGTEST)
Wed Apr 06 17:45:33 2011
* allocate domain 2, invalid = TRUE
kjbdomatt send to node 0
Wed Apr 06 17:45:33 2011
NOTE: attached to recovery domain 2
NOTE: cache recovered group 2 to fcn 0.7252
Wed Apr 06 17:45:33 2011
NOTE: LGWR attempting to mount thread 1 for diskgroup 2
NOTE: LGWR mounted thread 1 for disk group 2
NOTE: opening chunk 1 at fcn 0.7252 ABA
NOTE: seq=3 blk=337
NOTE: cache mounting group 2/0x72C150D7 (FGTEST) succeeded
NOTE: cache ending mount (success) of group FGTEST number=2 incarn=0x72c150d7
Wed Apr 06 17:45:33 2011
kfdp_query(): 13
kfdp_queryBg(): 13
NOTE: Instance updated compatible.asm to 11.1.0.0.0 for grp 2
SUCCESS: diskgroup FGTEST was mounted
SUCCESS: alter diskgroup fgtest mount

The V$ASM_DISK view is nicely updated and everything seems to be green:

SQL> select disk_number,name,state,header_status,path,failgroup from v$asm_disk;

DISK_NUMBER NAME                           STATE    HEADER_STATU PATH                                               FAILGROUP
----------- ------------------------------ -------- ------------ -------------------------------------------------- ------------------------------
 0 DATA1                          NORMAL   MEMBER       ORCL:DATA1                                         DATA1
 1 DATA2                          NORMAL   MEMBER       ORCL:DATA2                                         DATA2
 0 FILER01DISKA                   NORMAL   MEMBER       ORCL:FILER01DISKA                                  FILER01
 1 FILER01DISKB                   NORMAL   MEMBER       ORCL:FILER01DISKB                                  FILER01
 2 FILER02DISKA                   NORMAL   MEMBER       ORCL:FILER02DISKA                                  FILER02
 3 FILER02DISKB                   NORMAL   MEMBER       ORCL:FILER02DISKB                                  FILER02

6 rows selected.

Brilliant, but will it have an effect on the database?

Starting the Database

Even though things looked ok, they weren’t! I didn’t expect this to happen:

[oracle@rac11gr1node1 ~]$ srvctl start database -d rac11g
PRKP-1001 : Error starting instance rac11g2 on node rac11gr1node1
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:SQL*Plus: Release 11.1.0.7.0 - Production on Wed Apr 6 17:48:58 2011
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:Copyright (c) 1982, 2008, Oracle.  All rights reserved.
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:Enter user-name: Connected to an idle instance.
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:SQL> ORACLE instance started.
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:Total System Global Area 1720328192 bytes
rac11gr1node1:ora.rac11g.rac11g2.inst:Fixed Size                    2160392 bytes
rac11gr1node1:ora.rac11g.rac11g2.inst:Variable Size              1291847928 bytes
rac11gr1node1:ora.rac11g.rac11g2.inst:Database Buffers    419430400 bytes
rac11gr1node1:ora.rac11g.rac11g2.inst:Redo Buffers                  6889472 bytes
rac11gr1node1:ora.rac11g.rac11g2.inst:ORA-00600: internal error code, arguments: [kccpb_sanity_check_2], [9572],
rac11gr1node1:ora.rac11g.rac11g2.inst:[9533], [0x000000000], [], [], [], [], [], [], [], []
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:
rac11gr1node1:ora.rac11g.rac11g2.inst:SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
rac11gr1node1:ora.rac11g.rac11g2.inst:With the Partitioning, Real Application Clusters, OLAP, Data Mining
rac11gr1node1:ora.rac11g.rac11g2.inst:and Real Application Testing options
rac11gr1node1:ora.rac11g.rac11g2.inst:
CRS-0215: Could not start resource 'ora.rac11g.rac11g2.inst'.
PRKP-1001 : Error starting instance rac11g1 on node rac11gr1node2
CRS-0215: Could not start resource 'ora.rac11g.rac11g1.inst'.

Oops. A quick search on Metalink revealed note “ORA-00600: [kccpb_sanity_check_2], [3621501], [3621462] On Startup” (Doc ID 435436.1). The explanation for the ORA-600 is that “the seq# of the last read block is higher than the seq# of the control file header block.” Oracle Support explains it with a lost write, but here the situation is quite different. Interesting! I’ll have to leave that for another blog post.

11.1 GC agent refuses to start

This is a follow-up post to my previous tale of how not to move networks for RAC. After having successfully restarted the cluster as described in a previous post, I went on to install a Grid Control 11.1 system. This was to be on Solaris 10 SPARC. Why SPARC? Not my platform of choice when it comes to Oracle software, but my customer has a huge SPARC estate and wants to make the most of it.

After the OMS had been built (hopefully I’ll find time to document this, as it can be quite tricky on SPARC) I wanted to secure the agents on my cluster against it. That worked OK for the first node:

  • emctl clearstate agent
  • emctl secure agent
  • emctl start agent

Five minutes later the agent appeared in my list of agents in the Grid Control console. With this success behind me I went on to do the same on the next cluster node.

Here things were different. Here’s the sequence of commands I used:

$ emctl stop agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved
$

I didn’t pay too much attention to the fact that there was no acknowledgement of the completion of the stop command. I noticed something wasn’t quite right when I tried to get the agent’s status:

$ emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
emctl stop agent
Error connecting to https://node2.example.com:3872/emd/main

Now that should have reported that the agent was down. Strange. I tried a few more commands, such as the following one to start the agent.

[agent]oracle@node2.example.com $ emctl start agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Agent is already running

Which wasn’t the case: there was no agent process whatsoever in the process table. I also checked the emd.properties file. Note that the emd.properties file is in $AGENT_HOME/hostname/sysman/config/ now, instead of $AGENT_HOME/sysman/config as it was in 10g.
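The entries I compare between nodes are essentially the agent and upload URLs; the values here are illustrative only:

$ grep -E "^(EMD_URL|REPOSITORY_URL)" $AGENT_HOME/node2.example.com/sysman/config/emd.properties
EMD_URL=https://node2.example.com:3872/emd/main
REPOSITORY_URL=https://oms.example.com:1159/em/upload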

Everything looked correct, and even a comparison with the first node didn’t reveal any discrepancy. So I scratched my head a little more until I found a MOS note on the subject stating that the agent cannot listen on multiple addresses. The note is for 10g only and has the rather clumsy title “Grid Control Agent Startup: “emctl start agent” Command Returns “Agent is already running” Although the Agent is Stopped” (Doc ID 1079424.1).

Although it states it’s for 10g and multiple NICs, it got me thinking. And indeed, the /etc/hosts file had not been updated, leaving the old cluster address in /etc/hosts while the new one was in DNS.

# grep node2 /etc/hosts
10.x.x4.42            node2.example.com node2
172.x.x.1x8          node2-priv.example.com node2-priv
# host node2.example.com
node2.example.com has address 10.x5.x8.3
[root@node2 ~]# grep ^hosts /etc/nsswitch.conf
hosts:      files dns

This also explained why the agent started on the first node: it had an updated /etc/hosts file. Why the other nodes didn’t have their hosts file updated will forever remain a mystery.
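Conceptually the fix is trivial, align /etc/hosts with DNS, along these lines (addresses masked as in the output above):

# /etc/hosts on node2, before
10.x.x4.42            node2.example.com node2
# after the update, matching what "host node2.example.com" returns
10.x5.x8.3            node2.example.com node2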

Things then changed dramatically after the hosts file had been updated:

$ emctl status agent

Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent is Not Running

Note how emctl acknowledges that the agent is down now. I successfully secured and started the agent:

$ emctl secure agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Agent is already stopped...   Done.
Securing agent...   Started.
Enter Agent Registration Password :
Securing agent...   Successful.

$ emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent is Not Running
$ emctl start agent

Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Starting agent .............. started.

One smaller problem remained:

$ emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /u01/app/oracle/product/agent11g/node8.example.com
Agent binaries    : /u01/app/oracle/product/agent11g
Agent Process ID  : 14045
Parent Process ID : 14014
Agent URL         : https://node8.example.com:3872/emd/main
Repository URL    : https://oms.example.com:1159/em/upload
Started at        : 2011-02-14 09:59:03
Started by user   : oracle
Last Reload       : 2011-02-14 10:00:13
Last successful upload                       : 2011-02-14 10:00:19
Total Megabytes of XML files uploaded so far :    11.56
Number of XML files pending upload           :      188
Size of XML files pending upload(MB)         :    65.89
Available disk space on upload filesystem    :    60.11%
Collection Status                            : Disabled by Upload Manager
Last successful heartbeat to OMS             : 2011-02-14 10:00:17
---------------------------------------------------------------
Agent is Running and Ready

Note the Collection Status line reading “Disabled by Upload Manager”. That’s because a lot of stuff hasn’t been transferred yet. Let’s force an upload; I know the communication between agent and OMS is working, so that should resolve the issue.

$ emctl upload
$ emctl status agent
Oracle Enterprise Manager 11g Release 1 Grid Control 11.1.0.1.0
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 11.1.0.1.0
OMS Version       : 11.1.0.1.0
Protocol Version  : 11.1.0.0.0
Agent Home        : /u01/app/oracle/product/agent11g/node8.example.com
Agent binaries    : /u01/app/oracle/product/agent11g
Agent Process ID  : 14045
Parent Process ID : 14014
Agent URL         : https://node8.example.com:3872/emd/main
Repository URL    : https://oms.example.com:1159/em/upload
Started at        : 2011-02-14 09:59:03
Started by user   : oracle
Last Reload       : 2011-02-14 10:02:12
Last successful upload                       : 2011-02-14 10:02:53
Total Megabytes of XML files uploaded so far :    91.12
Number of XML files pending upload           :       22
Size of XML files pending upload(MB)         :     1.50
Available disk space on upload filesystem    :    60.30%
Last successful heartbeat to OMS             : 2011-02-14 10:02:19
---------------------------------------------------------------
Agent is Running and Ready

That’s about it. A few minutes later the agent was visible in the console. Now that only had to be repeated for the remaining 6 nodes…

NB: for the reasons shown in this article I don’t endorse duplicating host information in /etc/hosts and DNS; a resilient DNS infrastructure should always be used to store this kind of information.

Patch 10270073 – 11.1.0.1.2 Patch Set Update for Oracle Management Service

Today is patch day. On my current site we are having quite a few problems with our management server (OMS from now on). It occasionally simply “hangs” and doesn’t respond when connecting to the SSH port. A few minutes later it’s back to normal, but this behaviour is not reproducible.

So, in an effort to please Oracle Support, who couldn’t find a reason for this, I decided to apply Patch Set Update 2 to the OMS to get it to 11.1.0.1.2. I would also like to filter the corresponding agent patch through to our 11.1 agents as well. The PSU was released 2 weeks ago so it’s reasonably fresh. The patches for OMS and agent are generic, which is nice as it implies they are available for every platform. Our OMS had had problems before, and one-off patches had been applied. So the first step, as always with PSUs, is to check if there are conflicts. The readme file has the required instructions. I unzipped p10270073_111010_Generic.zip in /tmp/ and then executed the prerequisite checker in /tmp as shown in this example:

[oms]oracle@oms $ $ORACLE_HOME/OPatch/opatch prereq CheckConflictAgainstOHWithDetail -phBaseDir ./10270073
Invoking OPatch 11.1.0.8.0

Oracle Interim Patch Installer version 11.1.0.8.0
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

PREREQ session

Oracle Home       : /u01/app/oracle/product/middleware/oms11g
Central Inventory : /u01/app/oracle/product/oraInventory
 from           : /var/opt/oracle/oraInst.loc
OPatch version    : 11.1.0.8.0
OUI version       : 11.1.0.8.0
OUI location      : /u01/app/oracle/product/middleware/oms11g/oui
Log file location : /u01/app/oracle/product/middleware/oms11g/cfgtoollogs/opatch/opatch2011-02-01_08-48-14AM.log

Patch history file: /u01/app/oracle/product/middleware/oms11g/cfgtoollogs/opatch/opatch_history.txt

OPatch detects the Middleware Home as "/u01/app/oracle/product/middleware"

Invoking prereq "checkconflictagainstohwithdetail"

ZOP-40: The patch(es) has conflicts/supersets with other patches installed in the Oracle Home (or) among themselves.

Prereq "checkConflictAgainstOHWithDetail" failed.

Summary of Conflict Analysis:

Patches that can be applied now without any conflicts are :
10270073

Following patches are not required, as they are subset of the patches in Oracle Home or subset of the patches in the given list :
9563902, 9537948, 9489355

Following patches will be rolled back from Oracle Home on application of the patches in the given list :
9563902, 9537948, 9489355

Conflicts/Supersets for each patch are:

Patch : 10270073

 Bug Superset of 9563902
 Super set bugs are:
 9491872,  9476313,  9544428

 Bug Superset of 9537948
 Super set bugs are:
 9537948

 Bug Superset of 9489355
 Super set bugs are:
 9489355

OPatch succeeded.
[oms]oracle@oms $

OK, so a few one-offs will be rolled back. Let’s get started. First of all we have to stop the OMS as shown here:

[oms]oracle@oms $ emctl stop oms
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Stopping WebTier...
WebTier Successfully Stopped
Stopping Oracle Management Server...
Oracle Management Server Successfully Stopped
Oracle Management Server is Down

The next step is to apply the patch. Here’s the sample session:

[oms]oracle@oms $ cd 10270073

[oms]oracle@oms $ $ORACLE_HOME/OPatch/opatch apply
Invoking OPatch 11.1.0.8.0

Oracle Interim Patch Installer version 11.1.0.8.0
Copyright (c) 2009, Oracle Corporation.  All rights reserved.

Oracle Home       : /u01/app/oracle/product/middleware/oms11g
Central Inventory : /u01/app/oracle/product/oraInventory
 from           : /var/opt/oracle/oraInst.loc
OPatch version    : 11.1.0.8.0
OUI version       : 11.1.0.8.0
OUI location      : /u01/app/oracle/product/middleware/oms11g/oui
Log file location : /u01/app/oracle/product/middleware/oms11g/cfgtoollogs/opatch/opatch2011-02-01_09-00-00AM.log

Patch history file: /u01/app/oracle/product/middleware/oms11g/cfgtoollogs/opatch/opatch_history.txt

OPatch detects the Middleware Home as "/u01/app/oracle/product/middleware"

ApplySession applying interim patch '10270073' to OH '/u01/app/oracle/product/middleware/oms11g'
Interim patch 10270073 is a superset of the patch(es) [  9563902 9537948 9489355 ] in the Oracle Home
OPatch will rollback the subset patches and apply the given patch.
Execution of 'sh /tmp/10270073/custom/scripts/init -apply 10270073 ':

Return Code = 0

Running prerequisite checks...

OPatch detected non-cluster Oracle Home from the inventory and will patch the local system only.

Backing up files and inventory (not for auto-rollback) for the Oracle Home
Backing up files affected by the patch '10270073' for restore. This might take a while...
Backing up files affected by the patch '9563902' for restore. This might take a while...
Backing up files affected by the patch '9537948' for restore. This might take a while...
Backing up files affected by the patch '9489355' for restore. This might take a while...
ApplySession rolling back interim patch '9563902' from OH '/u01/app/oracle/product/middleware/oms11g'

Patching component oracle.sysman.oms.core, 11.1.0.1.0...
Copying file to "/u01/app/oracle/product/middleware/oms11g/sysman/omsca/scripts/wls/create_domain.py"
RollbackSession removing interim patch '9563902' from inventory
ApplySession rolling back interim patch '9537948' from OH '/u01/app/oracle/product/middleware/oms11g'

Patching component oracle.sysman.oms.core, 11.1.0.1.0...
Updating jar file "/u01/app/oracle/product/middleware/oms11g/sysman/jlib/emInstall.jar" with "/u01/app/oracle/product/middleware/oms11g/.patch_storage/9537948_Apr_12_2010_03_37_52/files//sysman/jlib/emInstall.jar/oracle/sysman/configassistant/addon/AddOnConfigAssistantDriver.class"
Updating jar file "/u01/app/oracle/product/middleware/oms11g/sysman/jlib/emInstall.jar" with "/u01/app/oracle/product/middleware/oms11g/.patch_storage/9537948_Apr_12_2010_03_37_52/files//sysman/jlib/emInstall.jar/oracle/sysman/configassistant/addon/AddOnConfigAssistantDriver$1.class"
Updating jar file "/u01/app/oracle/product/middleware/oms11g/sysman/jlib/emInstall.jar" with "/u01/app/oracle/product/middleware/oms11g/.patch_storage/9537948_Apr_12_2010_03_37_52/files//sysman/jlib/emInstall.jar/oracle/sysman/configassistant/addon/AddOnConfigAssistantDriver$AddOnFileFilter.class"
RollbackSession removing interim patch '9537948' from inventory
ApplySession rolling back interim patch '9489355' from OH '/u01/app/oracle/product/middleware/oms11g'

Patching component oracle.sysman.oms.core, 11.1.0.1.0...
Copying file to "/u01/app/oracle/product/middleware/oms11g/bin/HAConfigCmds.pm"
RollbackSession removing interim patch '9489355' from inventory

OPatch back to application of the patch '10270073' after auto-rollback.

Backing up files affected by the patch '10270073' for rollback. This might take a while...

Patching component oracle.sysman.oms.core, 11.1.0.1.0...
Updating jar file "/u01/app/oracle/product/middleware/oms11g/sysman/jlib/emCORE.jar" with "/sysman/jlib/emCORE.jar/oracle/sysman/eml/ecm/policy/PolicyViolationsController.class"
[...]
Copying file to "/u01/app/oracle/product/middleware/oms11g/bin/SecureOMSCmds.pm"
Copying file to "/u01/app/oracle/product/middleware/oms11g/sysman/emdrep/scripts/SecureAgent_oms.pl"
ApplySession adding interim patch '10270073' to inventory

Verifying the update...
Inventory check OK: Patch ID 10270073 is registered in Oracle Home inventory with proper meta-data.
Files check OK: Files from Patch ID 10270073 are present in Oracle Home.
Execution of 'sh /tmp/10270073/custom/scripts/post -apply 10270073 ':

Return Code = 0
--------------------------------------------------------------------------------
The following warnings have occurred during OPatch execution:
1) OUI-67620:Interim patch 10270073 is a superset of the patch(es) [  9563902 9537948 9489355 ] in the Oracle Home
--------------------------------------------------------------------------------
OPatch Session completed with warnings.

OPatch completed with warnings.

This process took 1 hour on my SPARC zone; not too impressed actually, but I’m blaming it on the overloaded box rather than the patch process. The outcome of the patching looks OK to me: I already knew I applied PSU 2 as a superset of 3 other patches. The next step is to run the post-install script. This is done by calling a JDBC application as shown here:

[oms]oracle@oms $ $ORACLE_HOME/bin/rcuJDBCEngine sys/secretpassword@repositoryHost:1821:REPOSDB JDBC_SCRIPT post_install_script.sql
Extracting Statement from File Name: 'post_install_script.sql' Line Number: 1
Extracted SQL Statement: [alter session set current_schema=SYSMAN]
 Statement Type: 'DDL Statement'
Executing SQL statement: alter session set current_schema=SYSMAN
Extracting Statement from File Name: 'post_install_script.sql' Line Number: 1
Extracting Statement from File Name: 'post_install_script.sql' Line Number: 1
Extracted SQL Statement: [SET serveroutput on size 1000000]
Skipping Unsupported Statement
 Statement Type: 'Oracle RCU NotSupported SQLPlus Statement'
Extracting Statement from File Name: 'post_install_script.sql' Line Number: 2
Extracted SQL Statement: [BEGIN
 EXECUTE IMMEDIATE 'drop table bundle_component_files';
 EXCEPTION WHEN OTHERS THEN
 IF sqlcode = -942 THEN
 NULL;
 ELSE
 RAISE;
 END IF;
END;
]
....
 END;
 END IF;
 END IF;
 RAISE;
END;
]
 Statement Type: 'BEGIN/END Anonymous Block'
Completed SQL script execution normally.
1 scripts were processed
[oms]oracle@oms $

Now finally it’s time to start the OMS and cross our fingers to see if it worked:

[oms]oracle@oms $ emctl start oms
Oracle Enterprise Manager 11g Release 1 Grid Control
Copyright (c) 1996, 2010 Oracle Corporation.  All rights reserved.
Starting WebTier...
WebTier Successfully Started
Starting Oracle Management Server...
Oracle Management Server Successfully Started
Oracle Management Server is Up
[oms]oracle@oms $

I’d call this a success: I managed to log on to the OMS and performed some basic testing which implied the patch was successfully applied.

Adding storage dynamically to ASM on Linux

Note: this discussion is potentially relevant only to OEL 5.x and RHEL 5.x; I haven’t been able to verify that it works the same way on other Linux distributions, although I would assume so. Before starting with the article, here are some facts:

  • OEL/RHEL 5.5 64bit
  • Oracle 11.2.0.2
  • native multipathing: device-mapper-multipath

The question I have asked myself many times is: how can I dynamically add a LUN to ASM without having to stop any component of the stack? Having mocked “reboot-me” OSs like Windows, I soon fell quiet when it came to discussing the addition of a LUN to ASM on Linux. Today I learned how to do this, by piecing together information I got from Angus Thomas, a great Red Hat system administrator I had the pleasure to work with in 2009 and 2010. And since I have a short-lived memory I decided to write it down.

I’ll describe the process from top to bottom, from the addition of the LUN to the server all the way up to the addition of the ASM disk to the disk group.

Adding the storage to the cluster nodes

The first step is obviously to get the LUN assigned to the server(s). This is the easy part, and outside the control of the Linux/Oracle admin: the storage team will provision a new LUN to the hosts in question. At this stage, Linux has no idea about the new storage; to make it available, the system administrator has to rescan the SCSI bus. A proven and tested way in RHEL 5 is to issue this command:

[root@node1 ~]# for i in `ls -1 /sys/class/scsi_host`; do
> echo "- - -" > /sys/class/scsi_host/${i}/scan
> done

The new, unpartitioned LUN will appear in /proc/partitions. If it doesn’t, then there’s probably something wrong on the SAN side: check /var/log/messages and talk to your storage administrator. If it’s not a misconfiguration then you may have no option but to reboot the node.

Configure Multipathing

So far so good; the next step is to add it to the multipath configuration. First of all, you need to find out what the WWID of the new device is. In my case that’s simple: the last new line in /proc/partitions is usually a giveaway. If you are unsure, ask someone with console access to the array to check the WWID. It’s important to get this right at this stage :)
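You can also read the WWID straight from the host; on RHEL 5 something along these lines should do (sdX is a placeholder for the newly discovered device, and the WWID shown is the one used in the multipath.conf example below):

[root@node1 ~]# /sbin/scsi_id -g -u -s /block/sdX
360000970000294900664533030344239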

To add the new disk to the multipath.conf file, all you need to do is to add a new section, as in the following example:


multipaths {
    ..
    multipath {
        wwid                  360000970000294900664533030344239
        alias                 ACFS0001
        path_grouping_policy  failover
    }
    ..
}

By the way, I have covered configuring multipathing in more detail in a previous blog post. Don’t forget to replicate the changes to the other cluster nodes!

Now you reload multipathd using /etc/init.d/multipathd reload on each node and, voila, you should see the device in /dev/mapper; my ACFS disk appeared as /dev/mapper/ACFS0001.

Now the tricky bit is to partition it, if you need to: it’s no longer mandatory with 11.1 and newer, but some software, like EMC’s Replication Manager, requires you to. I succeeded in doing so by checking the device in /dev/disk/by-id and then using fdisk against it, as in this example:

...
# fdisk /dev/disk/by-id/scsi-360000970000294900664533030344239
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.

The number of cylinders for this disk is set to 23251.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
 (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): u
Changing display/entry units to sectors

Command (m for help): n
Command action
 e   extended
 p   primary partition (1-4)
p
Partition number (1-4): 1
First sector (32-47619839, default 32): 128
Last sector or +size or +sizeM or +sizeK (129-47619839, default 47619839):
Using default value 47619839

Command (m for help): p

Disk /dev/disk/by-id/scsi-360000970000294900664533030344239: 24.3 GB, 24381358080 bytes
64 heads, 32 sectors/track, 23251 cylinders, total 47619840 sectors
Units = sectors of 1 * 512 = 512 bytes

 Device Boot                                                         Start  End         Blocks     Id  System
/dev/disk/by-id/scsi-360000970000294900664533030344239p1             128    47619839    23809855+  83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Once you are in fdisk, the commands are identical to single-pathed storage. Type “n” to create a new partition, “p” for a primary partition and specify the start and end as needed. Type “p” to print the partition table, and if you are happy with it use “w” to write it. You might wonder why I added an offset and changed the unit (“u”): this is due to the EMC storage this site uses. The EMC® Host Connectivity Guide for Linux (P/N 300-003-865 REV A23) suggests a 64k offset. Don’t simply repeat this in your environment; check with the storage team first.

Before adding the partitions to ACFS0001 and ACFS0002 I had 107 partitions:


[root@node1 ~]# wc -l /proc/partitions
107 /proc/partitions

Let’s see whether the new partitions are recognised after the 2 fdisk commands completed:


[root@node1 ~]# wc -l /proc/partitions
 107 /proc/partitions

But when you check /dev/mapper now, you still don’t see the partitions. The naming convention is to append pN to the device name, i.e. /dev/mapper/ACFS0001p1 for the first partition of ACFS0001 and so on.

kpartx to the rescue! This superb utility can read the partition table of a device and create device maps for the partitions it finds. Initially my setup was as follows:


[root@node1 ~]# ls -l /dev/mapper/ACFS*
brw-rw---- 1 root disk 253, 31 Jan 18 10:05 /dev/mapper/ACFS0001
brw-rw---- 1 root disk 253, 32 Jan 18 10:05 /dev/mapper/ACFS0002

Usually I would have rebooted the node at this stage, as I didn’t know how to make the partitions visible otherwise. But with kpartx (“yum install kpartx” to install it) this is no longer needed. Consider the example below:

[root@node1 ~]# kpartx -l /dev/mapper/ACFS0001
ACFS0001p1 : 0 47619711 /dev/mapper/ACFS0001 129
[root@node1 ~]# kpartx -a /dev/mapper/ACFS0001
[root@node1 ~]# kpartx -l /dev/mapper/ACFS0002
ACFS0002p1 : 0 47619711 /dev/mapper/ACFS0002 129
[root@node1 ~]# kpartx -a /dev/mapper/ACFS0002

[root@node1 ~]# ls -l /dev/mapper/ACFS000*
brw-rw---- 1 root disk 253, 31 Jan 18 10:05 /dev/mapper/ACFS0001
brw-rw---- 1 root disk 253, 36 Jan 18 10:13 /dev/mapper/ACFS0001p1
brw-rw---- 1 root disk 253, 32 Jan 18 10:05 /dev/mapper/ACFS0002
brw-rw---- 1 root disk 253, 37 Jan 18 10:13 /dev/mapper/ACFS0002p1

“kpartx -l” lists the partitions of a device, and “kpartx -a” adds the corresponding device maps, as the example shows. No more need to reboot! However, as has been pointed out in the comments section, kpartx doesn’t add both paths, so you should run the partprobe command to add the missing ones:


[root@node1 ~]# partprobe
[root@node1 ~]# wc -l /proc/partitions
109 /proc/partitions

See how there are 109 partitions listed now instead of just 107 before: the 2 missing paths have been added (one for each device).

Add disks to ASM

With this done, you can add the disks to ASM. I personally like the intermediate step of creating ASMLib disks first.
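A sketch of that step, with the disk names and devices following the example above:

[root@node1 ~]# oracleasm createdisk ACFS0001 /dev/mapper/ACFS0001p1
[root@node1 ~]# oracleasm createdisk ACFS0002 /dev/mapper/ACFS0002p1

Run oracleasm scandisks on the remaining cluster nodes so they pick up the new labels. Once they are visible everywhere, connect to ASM as SYSASM and add the disks using the alter diskgroup command: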

SQL> alter diskgroup ACFSDG add disk 'ORCL:ACFS0002', 'ORCL:ACFS0001';

Now just wait for the rebalance operation to complete.
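If you want to keep an eye on it, a quick look at v$asm_operation in the ASM instance shows the progress and the estimated time to completion:

SQL> select group_number, operation, state, power, sofar, est_work, est_minutes
  2  from v$asm_operation;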