January 2009

Identifying shared memory segment users using lsof

Lsof (list open files) is a really useful tool for troubleshooting open file descriptors which prevent a deleted file from being released or a shared memory segment from being removed.
Here’s a little situation on Linux where an Oracle shared memory segment was not released as someone was still using it.
$ ipcs -ma

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 393216     oracle     640        289406976  1          dest
0xbfb94e30 425985     oracle     640        289406976  18
0x3cf13430 557058     oracle     660        423624704  22

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0xe2260ff0 1409024    oracle     640        154
0x9df96b74 1671169    oracle     660        154

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

The first shared memory segment (shmid 393216, status dest) should have disappeared after instance shutdown, but it didn’t.
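As a rough sketch of how to spot such leftover segments automatically, the snippet below filters ipcs -m style output for segments whose status column contains dest, meaning they are marked for destruction but still attached. The function name list_dest_segments is hypothetical, and the column positions are an assumption based on the Linux output shown above; other platforms lay out ipcs output differently.

```shell
#!/bin/sh
# Hypothetical helper: read `ipcs -m` output on stdin and print the shmid and
# attach count of every segment whose status column contains "dest".
# Assumes the Linux column order: key shmid owner perms bytes nattch status.
list_dest_segments() {
    awk '$1 ~ /^0x/ && $7 ~ /dest/ { print $2, $6 }'
}

# Sample rows copied from the listing above, so the sketch runs anywhere.
sample='0x00000000 393216 oracle 640 289406976 1 dest
0xbfb94e30 425985 oracle 640 289406976 18
0x3cf13430 557058 oracle 660 423624704 22'

printf '%s\n' "$sample" | list_dest_segments
```

On a live system the input would come from ipcs -m itself. On Linux, lsof typically reports the shmid of an attached segment in its NODE column, so searching the lsof output for the printed shmid is one way to narrow down which process still holds the segment; treat that as an assumption to verify on your own system.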

Making a Shell Variable Read Only

Being inherently lazy, I am always a sucker for shortcuts, neat tricks to cut my work and, most important, not to do the same thing again and again. Here is a tip I find useful.

Have you ever been frustrated to find that some line inside a shell script has changed an important shell variable such as ORACLE_BASE? The list of variables that are important to the safety and efficiency of your shell is a long one: PS1, ORACLE_BASE, PATH, and so on. Using a little-known command, you can easily "protect" a variable. The trick is to make it readonly. First, set the variable:

# export ORACLE_BASE=/opt/oracle

Then make it readonly:

# readonly ORACLE_BASE

Now if you want to set it:

# export ORACLE_BASE=/opt/oracle1
-bash: ORACLE_BASE: readonly variable

You can't. You can't even unset the variable:

# unset ORACLE_BASE
-bash: unset: ORACLE_BASE: cannot unset: readonly variable

This is a cool way to protect important variables.

To get a list of variables that are readonly, use

# declare -r
declare -ar BASH_VERSINFO='([0]="3" [1]="00" [2]="15" [3]="1" [4]="release" [5]="i386-redhat-linux-gnu")'
declare -ir EUID="500"
declare -rx ORACLE_BASE="/opt/oracle"
declare -ir PPID="13204"
declare -r SHELLOPTS="braceexpand:emacs:hashall:histexpand:history:interactive-comments:monitor"
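The same protection can be applied inside a script. The following is a minimal sketch, assuming a POSIX-style shell; the value /opt/oracle comes from the example above, while the attempt variable and the message text are made up for illustration. The reassignment is tried in a subshell so that a failure cannot abort the script itself.

```shell
#!/bin/sh
# Set the variable and then lock it, as in the steps above.
export ORACLE_BASE=/opt/oracle
readonly ORACLE_BASE

# Try to change it in a subshell. The shell's own error message is discarded
# and only the exit status is inspected: non-zero means the change was refused.
if (export ORACLE_BASE=/opt/oracle1) 2>/dev/null; then
    attempt=changed
else
    attempt=blocked
fi

echo "reassignment $attempt, ORACLE_BASE is still $ORACLE_BASE"
```

One caveat worth keeping in mind: the readonly attribute belongs to the current shell and is not carried through the environment, so a child process started from this script receives ORACLE_BASE as an ordinary, changeable variable.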

The other day

The other day, we had a serious issue in the ASM diskgroups: one diskgroup refused to come up because one disk was missing, but it was not clear from the message which of the 283 devices was missing. This underscores the difficulty in diagnosing ASM discovery-related issues. In this post, I have tried to present a way to diagnose this sort of issue through a real example.

We had planned to move from one storage device to another (EMC DMX-3 to DMX-4) using SRDF/A technology. The new storage was attached to a new server. The idea was to replicate data at the storage level using SRDF/A. At the end of the replication process, we shut the database and ASM down and brought up the ASM instance on the newer storage on the new server. Since the copy was at the disk level, the disk signatures were intact and the ASM disks retained their identity from the older storage. So, when the ASM instance was started, it recognized all the disks and mounted all the diskgroups (10 of them) except one.

While bringing up a disk group called APP_DG3 on the new server, ASM complained that disk number “1” was missing, but it was not clear which particular device that was. This post describes how the situation was diagnosed and resolved.

Note: the ASM disk paths changed on the new storage. This was not really a problem, since we could simply define a new asm_diskstring. Remember: the asm_diskstring initialization parameter simply tells the ASM instance which disks should be looked at during discovery. Once those disks are identified, ASM reads the signature in each disk header to check the properties: the disk number, the diskgroup it belongs to, the capacity, version compatibility and so on. So as long as the correct asm_diskstring init parameter is provided, ASM can readily discover the disks and get the correct names.

Diagnosis

Reliable latch waits and a new blog

Here’s a link to Alex Fatkulin’s blog if you haven’t seen it already: http://afatkulin.blogspot.com/
He has some good Oracle internals information in there, and I also like his research style.
Alex just blogged about a finding (on Oracle 11g on Linux) that when an Oracle process doesn’t get a latch after spinning, it goes to sleep using the semop() system call and does not wake up until the semaphore is posted by another process. In past versions, Oracle processes went to sleep for a short period of time, woke up, tried to get the latch again and, if unsuccessful, slept for a longer period of time (up to _max_exponential_sleep centiseconds).

Mystats utility

A variation on Jonathan Lewis's SNAP_MY_STATS package to report the resource consumption of a unit of work between two snapshots. Designed to work under constrained developer environments, this version has enhancements such as time model statistics and the option to report on specific statistics. ***Update*** Now available in two formats: 1) as a PL/SQL package and 2) as a free-standing SQL*Plus script (i.e. no installation/database objects needed). June 2007 (updated October 2011)

Runstats utility

A variation on Tom Kyte's invaluable RUNSTATS utility that compares the resource consumption of two alternative units of work. Designed to work under constrained developer environments and builds on the original with enhancements such as "pause and resume" functionality, time model statistics and the option to report on specific statistics. ***Update*** Now available in two formats: 1) as a PL/SQL package and 2) as a free-standing SQL*Plus script (i.e. no installation/database objects needed). January 2007 (updated October 2011)