Top 60 Oracle Blogs

Recent comments


Linux takes a page from Solaris… pmap available on Linux.

Recently, there was a thread on an internal alias of old Sun guys.  The problem at hand was to track down a process that is consuming memory on Linux.  This is the type of problem that can be solved many ways (ps, top, etc…), but to my amazement someone mentioned that pmap could be used for Linux…. I guess I didn’t get the memo:)

About 6 months back I wrote a few entries that discussed Solaris tools for the Linux Guy in the posts:

Compiling iozone for Solaris 10 on SPARC

I have started working with ZFS and its various ways of protecting disks from failure. It’s a low end setup at best, where a JBOD is used as an extension to a M-series server. Many JBODs come with SAS connectivity only, and no on-board intelligence so a host based solution has to be chosen to protect the lot from disk failures.

For Solaris, you can use ZFS amongst other solutions. Alternatively, ASM is a possibility. I couldn’t reproduce the exact setup, so I had to improvise. Still I wanted to find out how ZFS performs compared to ASM. For this purpose I used an EMC VMX and a Sun 5410 server which had 10 multipathed 10G LUNs presented to it via EMC Power Path 5.3.

To test the file system performance I decided to use both IOZone and Bonnie++. Bonnie++ is no problem, you can get it from Sun Freeware. Note that Oracle no longer produce the Companion CD, which leaves SunFreeware the only source for alternative packages.

Before you can compile anything on Solaris, you need a compiler (why doesn’t Solaris come with one?). Installing the compiler was simple. I got the required files from these URLS:

They can be installed by unzipping and pkgadd:

bash-3.00# pkgadd -d gcc-3.4.6-sol10-sparc-local

The following packages are available:
  1  SMCgcc     gcc
                (sparc) 3.4.6

Select package(s) you wish to process (or 'all' to process
all packages). (default: all) [?,??,q]:

Processing package instance  from 

gcc(sparc) 3.4.6

The selected base directory  must exist before
installation is attempted.

Do you want this directory created now [y,n,?,q] y
Using  as the package base directory.
## Processing package information.
## Processing system information.
## Verifying disk space requirements.
## Checking for conflicts with packages already installed.
## Checking for setuid/setgid programs.

Installing gcc as 

## Installing part 1 of 1.
[ verifying class  ]

Repeat the procedure for libiconv as well.

Once that is done, it’s time to compile iozone. Get the latest tar ball from (it was iozone3_397.tar in my case) and unzip it somewhere on your system. Change directory to stageDIR/src/current. For the makefile to work, set your path to include /usr/local/bin and /usr/ccs/bin. The makefile comes with different targets depending on your architecture and compiler. In my case I thought that Solaris10gcc-64 would be most appropriate. Trying this option however didn’t succeed:

bash-3.00# make Solaris10gcc-64

Building iozone for Solaris10gcc-64

gcc -O -c  -Dunix -DHAVE_ANSIC_C -DASYNC_IO -D__LP64__ \
                -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Dsolaris \
                 -m64 libbif.c -o libbif10-64.o
gcc -O -c  -Dunix -DHAVE_ANSIC_C -DASYNC_IO -D__LP64__ \
                -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Dsolaris \
                -DNAME='"Solaris10gcc-64"'  -m64 libasync.c -o libasync10-64.o
gcc -c -O -Dunix -DHAVE_ANSIC_C -DASYNC_IO \
                -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Dsolaris \
                -DNAME='"Solaris10gcc-64"'  -m64 iozone.c -o iozone_solaris10gcc-64.o

Building fileop for Solaris10gcc-64

gcc -c -O  -m64 fileop.c -o fileop_Solaris10gcc-64.o

Building the pit_server

cc -c   pit_server.c  -o pit_server.o
gcc  -O  -m64 iozone_solaris10gcc-64.o libasync10-64.o libbif10-64.o \
        -lthread -lpthread -lposix4 -lnsl -laio \
        -lsocket -o iozone
gcc  -O -m64 fileop_Solaris10gcc-64.o -o fileop
gcc  -O -m64 pit_server.o -lthread -lpthread -lposix4 \
        -lnsl -laio -lsocket -o pit_server
ld: fatal: file pit_server.o: wrong ELF class: ELFCLASS32
ld: fatal: File processing errors. No output written to pit_server
collect2: ld returned 1 exit status
*** Error code 1
make: Fatal error: Command failed for target `Solaris10gcc-64'

ELFCLASS32? I specified that I wanted 64bit! Looking at the output a bit closer it turned out that cc (which I sym-linked to gcc in /usr/local/bin) created a 32bit object file. This was easy to fix. Instead of

# cc -c   pit_server.c  -o pit_server.o

I executed

# gcc -m64  -c pit_server.c  -o pit_server.o

That created a 64bit object file:

bash-3.00# file pit_server.o
pit_server.o:   ELF 64-bit MSB relocatable SPARCV9 Version 1

Now the linking worked as well:

bash-3.00# gcc  -O -m64 pit_server.o -lthread -lpthread -lposix4 \
>         -lnsl -laio -lsocket -o pit_server
bash-3.00# echo $?

And finally pit_server was 64 bit:

bash-3.00# file pit_server
pit_server:     ELF 64-bit MSB executable SPARCV9 Version 1,
                dynamically linked, not stripped, no debugging information available

Happy benchmarking

Automatic log gathering for Grid Control 11.1

Still debugging the OMS problem (it occasionally hangs and has to be restarted) I wrote a small shell script to help me gather all required logs for Oracle support. These are the logs I need for the SR, Niall Litchfield has written a recent blog post about other useful log locations.

The script is basic, and can possibly be extended. However it saved me a lot of time getting all the required information to one place from where I could take it and attach it to the service request. Before uploading I usually zip all files into to avoid clashing with logs already uploaded. I run the script via cron daily at 09:30.


# A script to gather diagnostic information for OMS GC 11.1 for use by Oracle
# Support
# WARNING the previous output of the script's execution is NOT preserved!
# (c) Martin Bach Consulting Ltd All rights reserved
# The script performs the following tasks
# 1) creates a heap dump for the EMGC_OMS1 server
# 2) creates a compressed tar archive of all server logs
#    including the webtier

set -x


# get heap dump for EMGC_OMS1-I upped the JVM args to -Xms of 768M which works
# fine as an identifier in Solaris 10. For Linux you may have to grep for EMGC_OMS1
# in the ps output
# the heap dump goes into EMGC_OMS1.out which is compressed and saved later
DUMP=`ps -ef | grep "Xms768m" | grep -v grep | awk '{print "kill -3 "$2}'`

if [ ! -d $DST_DIR ]; then
echo $DST_DIR does not exist-creating it
mkdir -p $DST_DIR

# get the relevant logs
cd $GC_INST/em/EMGC_OMS1/sysman/log/
tar -cvf - . | gzip > $DST_DIR/gc_inst.em.EMGC_OMS1.sysman.log.tar.gz

cd $GC_INST/user_projects/domains/GCDomain/servers/EMGC_OMS1/logs/
tar -cvf - . | gzip > $DST_DIR/

cd $GC_INST/user_projects/domains/GCDomain/servers/EMGC_ADMINSERVER/logs/
tar -cvf - . | gzip > $DST_DIR/

cd $GC_INST/WebTierIH1/diagnostics/logs/OHS/ohs1
tar -cvf - . | gzip > $DST_DIR/webtier.ohs1.log.tar.gz

cd $GC_INST/WebTierIH1/diagnostics/logs/OPMN/opmn
tar -cvf - . | gzip > $DST_DIR/webtier.opmn.log.tar.gz

Good luck!

Solaris Eye for the Linux Guy… Part II (oprofile, Dtrace, and Oracle Event Trace)

Proper tool for the job

My grandfather used to say to me: “Use the proper tool for the job”.  This is important to keep in mind when faced with performance issues.  When I am faced with performance problems in Oracle, I typically start at a high level with AWR reports or Enterprise Manager to get a high level understanding of the workload.   To drill down further, the next step is to use Oracle “10046 event” tracing.  Cary Millsap created a methodology around event tracing called “Method-R” which shows how to focus in on the source of a performance problem by analyzing the components that contribute to response time.   These are all fine places to start to analyze performance problems from the “user” or “application” point of view.  But what happens if the OS is in peril?

Solaris Eye for the Linux guy… or how I learned to stop worrying about Linux and Love Solaris (Part 1)

This entry goes out to my Oracle techie friends that have been in the Linux camp for sometime now and are suddenly finding themselves needing to know more about Solaris… hmmmm… I wonder if this has anything to do with Solaris now being an available option with Exadata?  Or maybe the recent announcement that the SPARC T3 multiplier for T3-x servers is now 0.25.  Judging by my inbox recently, I suspect a renewed interest in Solaris to continue.

Installing RAC on Solaris 10/09 x64

One of the major adventures this time of the year involves installing RAC on Solaris 10 10/09 x86-64. The system setup included EMC Power Path 5.3 as the multipathing solution to shared storage.

I initially asked for 4 BL685 G6 with 24 cores, but in the end “only” got two-still plenty of resources to experiment with.  I especially like the output of this command:

$ /usr/sbin/psrinfo | wc –l

Nice! Actually, it’s 4 Opteron processors:

$ /usr/sbin/prtdiag | less
System Configuration: HP ProLiant BL685c G6
 BIOS Configuration: HP A17 12/09/2009
 BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)
==== Processor Sockets ====================================
Version                          Location Tag
 -------------------------------- --------------------------
 Opteron                          Proc 1
 Opteron                          Proc 2
 Opteron                          Proc 3
 Opteron                          Proc 4

So much for the equipment. The operating system showed 4 NICs, all called bnxen where n was 0 through 4. The first interface, bnxe0, will be used for the public network. The second NIC is to be ignored and the final 2, bnxe2 and bnxe3 will be used for the high available cluster interconnect feature. This way I can prevent the use of SFRAC which inevitably would have meant a clustered Veritas file system instead of ASM.

One interesting point to notice is that the Oracle MOS document 1210883.1 specifies that the interfaces for the private interconnect are on the same subnet. So-node1 will use for bnxe2 and for bnxe3. Similarly, node2 uses for bnxe2 and for bnxe3. The Oracle example is actually a bit more complicated than it could have been, as they use a /25 subnet mask. But ipcalc confirms that the address range they use are all well within the subnet:

 Address:            00001010.00000001.00000000.1 0000000
 Netmask: = 25  11111111.11111111.11111111.1 0000000
 Wildcard:             00000000.00000000.00000000.0 1111111
 Network:         00001010.00000001.00000000.1 0000000 (Class A)
 Broadcast:            00001010.00000001.00000000.1 1111111
 HostMin:            00001010.00000001.00000000.1 0000001
 HostMax:            00001010.00000001.00000000.1 1111110
 Hosts/Net: 126                   (Private Internet)

This setup will have some interesting implications which I’ll describe a little later.

Part of the test was to find out how mature the port to Solaris on Intel was. So I decided to start off by installing Grid Infrastructure on node 1 first, and extend the cluster to node2 using the script in $ORACLE_HOME/oui/bin.

The installation uses 2 different accounts to store the Grid Infrastructure binaries separately from the RDBMS binaries. Operating system accounts are oragrid and oracle.

Oracle: uid=501(oracle) gid=30275(oinstall) groups=309(dba),2046(asmdba),2047(asmadmin)
OraGrid: uid=502(oragrid) gid=30275(oinstall) groups=309(dba),2046(asmdba),2047(asmadmin)

I started off by downloading files 1,2 and 3 of patch 10098816 for my platform. The ratio of downloads of this patch was 243 to 751 between x64 and SPARC. So not a massive uptake of this patchset for Solaris it would seem.

As the oragrid user I created user equivalence for RSA and DSA ssh-keys, a little utility will do this now for you, but I’m old-school and create the keys and exchanged them on the hosts myself. Not too bad a task on only 2 nodes.

The next step was to find out about the shared storage. And that took me a little while I admit freely: I haven’t used the EMC Power Path multipathing software before and found it difficult to approach, mainly for the lack of information about it. Or maybe I just didn’t find it, but device-mapper-multipath for instance is easier to understand. Additionally, the fact that this was Solaris Intel made it a little more complex. First I needed to know what the device names actually mean. As on Solaris SPARC, /dev/dsk will list the block devices, /dev/rdsk/ lists the raw devices. So there’s where I’m heading. Next I checked the devices, emcpower0a to emcpower9a. In the course of the installation I found out how to deal with these. First of all, on Solaris Intel, you have to create a partition of the LUN before it can be dealt with in the SPARC way. So for each device you would like to use, fdisk the emcpowerxp0 device, i.e.

# fdisk /dev/rdsk/emcpower0p0

If there is no partition, simply say “y” to the question if you want to use all of it for Solaris and exit fdisk. Otherwise, delete the existing partition (AFTER HAVING double/triple CHECKED THAT IT’S REALLY NOT NEEDED!) and create a new one of type “Solaris2”. It didn’t seem necessary to make it active.

Here’s a sample session:

bash-3.00# fdisk /dev/rdsk/emcpower0p0
No fdisk table exists. The default partition for the disk is:
a 100% "SOLARIS System" partition
Type "y" to accept the default partition,  otherwise type "n" to edit the partition table.

Now let’s check the result:

bash-3.00# fdisk /dev/rdsk/emcpower0p0
Total disk size is 1023 cylinders
Cylinder size is 2048 (512 byte) blocks
Partition   Status    Type          Start   End   Length    %
=========   ======    ============  =====   ===   ======   ===
1       Active    Solaris2          1  1022    1022    100
1. Create a partition
2. Specify the active partition
3. Delete a partition
4. Change between Solaris and Solaris2 Partition IDs
5. Exit (update disk configuration and exit)
6. Cancel (exit without updating disk configuration)
Enter Selection: 6


This particular device will be used for my OCRVOTE disk group, that’s why it’s only 1G. The next step is identical on SPARC-start the format tool, select partition, change the fourth partition to use the whole disk (with an offset of 3 cylinders at the beginning of the slice) and label it. With that done, exit the format application.

This takes me back to the discussion of the emcpower-device name. The letters [a-p] refer to the slices of the device, while p stands for the partition. /dev/emcpowernc is slice 2 of the second multipathed device, in other words the whole disk. I usually create a slice 4 which translates to emcpowerne. After having completed the disk initialisation, I had to ensure that the ones I was working on were really shared. Unfortunately the emcpower devices are not consistently named across the cluster. What is emcpower0a on node1 turned out to be emcpower2a on the second node. How to check? The powermt tool to the rescue. Similar to “multipath –ll” on Linux the powermt command can show the underlying disks which are aggregated under the emcpowern pseudo device. So I wanted to know if my device /dev/rdsk/emcpower0e was shared. What I really was interested on was the native device:

# powermt display dev=emcpower0a | awk \
 > '/c[0-9]t/ {print $3}'

Well, does that exist on the other node?

# powermt display dev=all | /usr/sfw/bin/ggrep -B8  c1t50000974C00A611Cd6s0
Pseudo name=emcpower3a
Symmetrix ID=000294900664
Logical device ID=0468
state=alive; policy=SymmOpt; priority=0; queued-IOs=0;
--------------- Host ---------------   - Stor -   -- I/O Path --  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State   Q-IOs Errors
3072 pci@39,0/pci1166,142@12/pci103c,1708@0/fp@0,0 c1t50000974C00A611Cd6s0 FA  8eA   active  alive       0      0

So yes it was there. Cool! I checked the 2 other OCR/voting disks LUNS and they were shareable as well. The final piece was to change the ownership of the devices to oragrid:asmdba and permissions to 0660.

Project settings

Another item to look at is the project settings for the grid owner and oracle. It’s important to set projects correctly, otherwise the installation will fail when ASM is starting. All newly created users inherit the settings from the default project. Unless the sys admins set the default project high enough, you will have to change them. To check the settings you can use the “prctl -i project default” call to check all the values for this project.

I usually create a project for the grid owner, oragrid, as well as for the oracle account. My settings are as follows for a maximum SGA size of around 20G:

projadd -c “Oracle Grid Infrastructure” ‘’
projmod -s -K “process.max-sem-nsems=(privileged,256,deny)” ‘’
projmod -s -K “project.max-shm-memory=(privileged,20GB,deny)” ‘’
projmod -s -K “project.max-shm-ids=(privileged,256,deny)” ‘’

Repeat this for the oragrid user, then log in as oragrid and check that the project is actually assigned:

# id -p oragrid
uid=223(oragrid) gid=30275(oinstall) projid=100(user.oragrid)

Installing Grid Infrastructure

Finally ready to start the installer! The solaris installation isn’t any different from Linux except for the aforementioned fiddling with the raw devices.

The installation went smoothly, I ran and without any problem. If anything, it was a bit slow, taking 10 minutes to complete on node1. You can tail the rootcrs_node1.log file in /data/oragrid/product/ to see what’s going on behind the scenes. This is certainly one of the biggest improvements over 10g and 11g Release 1.

Extending the cluster

The MOS document I was alluding to earlier suggested, like I said, to have all the private NIC IP addresses in the same subnet. That isn’t necessarily to the liking of cluvfy. The communication over bnxe3 on both hosts fails, as shown in this example. Tests executed from node1:

bash-3.00# ping is alive
bash-3.00# ping is alive
bash-3.00# ping is alive
bash-3.00# ping
^C is not replying

Tests executed on node 2

bash-3.00# ping is alive
bash-3.00# ping
bash-3.00# ping is alive
bash-3.00# ping is alive

I decided to ignore this for now, and sure enough, the cluster extension didn’t fail. As I’m not using GNS, the command to add the node was

$ ./ -debug -logLevel finest "CLUSTER_NEW_NODES={loninengblc208}" \

This is actually a little more verbose than I needed, but it’s always good to be prepared for a SR with Oracle.

However, the OUI command will perform a pre-requisite check before the actual call to runInstaller, and that repeatedly failed, complaining about connectivity on the bnxe3 network. Checking the contents of the script I found an environment variable “$IGNORE_PREADDNODE_CHECKS” which can be set to “Y” to force the script to ignore the pre-requisite checks. With that set, the addNode operation succeeded.

RDBMS installation

This is actually not worthy to report, it’s pretty much the same as on Linux. However, a small caveat is specified to Solaris x86-64. Many files in the Oracle inventory didn’t have correct permissions. When launching runInstaller to install the binaries, I was bombarded with complaints about file permissions.

For example, has the wrong permissions. Example for Solaris Intel:

# ls -l
 -rw-r--r--   1 oragrid  oinstall     317 Nov  9 15:01

On Linux:

$ ls -l
 -rw-rw---- 1 oragrid oinstall 345 Oct 21 12:44

There were a few more, I fixed them using these commands:

$ chmod 770 ContentsXML
$ chmod 660 install.platform
$ chmod 770 oui
$ chmod 660 ContentsXML/*
$ chmod 660 oui/*

Once the permissions were fixed the installation succeeded.


Nothing to report here, it’s the same as for Linux.

Database Gateway for Sybase

Before starting to write this article I was wondering if it was of any use to anyone. Who in their right state of mind would use the Database Gateway for xxx, be it Sybase or Adabase or whatever othere database was to be used. If you have to, simply configure another data source in your code and read data from where you need it. Note that I said read! Ever since Piet de Visser’s presentation about simplicity I have been a great supporter of the simplicity approach. Or, like James Morle likes to quote, simplicity is the ultimate form of sophistication.

Transparent Gateways desupported

So, anyway, I have been asked to link Sybase ASE 15.0.3 to an Oracle database, all on Solaris 10/SPARC 64bit. A quick proof of concept had to be performed. Initially I started out with Transparent Gateway for Sybase, the name of the product for 10g Release 2. I should have known something was wrong when the link to download the software was broken and I had to manually copy & paste it. Well to cut a long story short, 10.2 gateways are desupported since 2008! I wasted a whole 2 hours before that came up on MOS. The workaround is to use > 10.2 software for this, and I went for 11.2. MOS states that this is possible for >, > and > My database was which means I’m fine.

Database Gateway for Sybase

As always the new software is on OTN, under the 11.2 database downloads. Click on “See all” to access and download I downloaded and unzipped the file to /u01/app/oracle/stage. The idea was to install the gateway into its own Oracle home, as all the installed software was and lower. Again, as always it’s required to unset all Oracle related environment variables such as ORACLE_SID, AGENT_HOME, OMS_HOME, LD_LIBRARY_PATH_64, LD_LIBRARY_PATH, ORACLE_HOME etc.

The installation is really straight forward as you’d expect. Ensure that you have X11 forwarding enabled and an X server available (I use XMing), and execute “runInstaller”.

  1. Choose an appropriate name for the home location and path. Mine were OraGtw11g_home1 and /u01/app/oracle/product/gateway/ but any path (not pointing to an existing ORACLE_HOME) is fine
  2. On the “Available Product Components” page select “Oracle Database Gateway for Sybase”
  3. On the configuration screen enter the hostname where your Sybase database is running, the port number (from $SYBASE/interfaces) and the database name to be accessed. Don’t worry, all these parameters can be changed later
  4. Start the installation
  5. When prompted, perform a typical install using netca.
  6. Exit the Universal Installer

Great, that’s all done now! Some more configuration steps are to be performed still.

Configuring the Gateway Home

I’d recommend that you enter the gateway ORACLE_HOME into /var/opt/oracle/oratab (or /etc/oratab, if you are not on Solaris). This way navigating is greatly simplified.

The connection information Oracle needs to connect to Sybase is stored in a configuration file located in the following location:


This is populated with the values provided during the installation, but for troubleshooting purposes you should set the logging to ON:


Once connections to Sybase have been established, reset the value to OFF to avoid excessive log file generation. The log files are generated in $ORACLE_HOME/dg4sybs/log, one for each query making use of the connection to Sybase.

By default, all stored procedures do not return a result set to the use. To change this behaviour, set the following parameter to TRUE:


The next step is to configure the listener. You should have performed a typical configuration during the installation (i.e. netca will have run). If so, edit $ORACLE_HOME/network/admin/listener.ora and add a static SID into the SID_LIST as in the following example:

# listener.ora Network Configuration File:
# /u01/app/oracle/product/gateway/
# Generated by Oracle configuration tools.
    (SID_DESC =
      (PROGRAM = dg4sybs)
      (ENV = /u01/app/oracle/product/gateway/
      (SID_NAME = dg4sybs)
      (ORACLE_HOME = /u01/app/oracle/product/gateway/
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
ADR_BASE_LISTENER =/u01/app/oracle/product/gateway/

Unlike earlier versions of the gateway, the Sybase OCS libs don’t need to be included in the ENV variable. Obviously, if there is a listener on your host already, add the entries to the existing Now start the listern

Configuring the Database

Before Oracle can access the Sybase system, it must have an entry in its tnsnames.ora file to contact the listener we just created/modified:

dg4sybs =
    (CONNECT_DATA= (SID=dg4sybs))

The hostname needs to refer to where the gateway’s listener runs on, as will the port. The SID needs to match the gateway SID defined in the SID_LIST_LISTENER parameter of the gateway’s listener.ora.

Within Oracle, create a database link to the Sybase database:

SQL> create database link dg4sysb connect to user identified by password using 'dg4sysbs';

Enclose the username and password in double-quotes as they are case sensitive. That’s it, you can now query the Sybase database from within Oracle.

With this setup it is now possible to query Sybase tables, and stored procedures. I have written a very basic one to demonstrate that it’s possible. Logging on to Sybase, I created a simple stored procedure to report the version, similar to the v$version view in Oracle.

1> drop procedure myversion
2> go
1> set chained on
2> go
1> create procedure myversion ( @ver varchar(255) OUTPUT ) as select @ver = @@version
2> go

Now on Oracle, I can query this like this:

set serveroutput on
  v varchar2(4000);

And this worked!

Oracle Exadata Database Machine v2 vs x2-2 vs x2-8 Deathmatch

This post has bee updated live from the Oracle OpenWorld as I’m learning what’s new. Last update done on 28-Sep-2010.

Oracle Exadata v2 has been transformed into x2-2 and x2-8. x2-2 is just slightly updated while x2-8 is a much more high-end platform. Please note that Exadata x2-2 is not just an old Exadata v2 — it’s a fully refreshed model. This is a huge confusion here at the OOW and even at the Oracle web site.

The new Exadata pricing list is released and Exadata x2-2 costs exactly the same as old Exadata v2. Exadata x2-8 Full Rack (that’s the only x2-8 configuration — see below why) is priced 50% higher then Full Rack x2-2. This is hardware price only to clarify the confusion (updated 18-Oct-2010).

Exadata Storage Server Software pricing is the same and licensing costs per storage server and per full rack is the same as for Exadata v2 because number of disks didn’t change. Note that storage cells got upgraded but priced the same when it comes to Exadata Server software and hardware. Nice touch but see implications on databases licensing below.

This comparison is for Full-Rack models Exadata x2-2 and x2-8 and existing v2 model.

Finally, data-sheets are available for both x2-2 (Thx Dan Norris for the pointers):

and x2-8:

It means that live update of this post is probably over (27-Sep-2010).

v2 Full Rack x2-2 Full Rack x2-8 Full Rack
Database servers 8 x Sun Fire x4170 1U 8 x Sun Fire x4170 M2 1U 2 x Sun Fire x4800 5U
Database CPUs Xeon E5540 quad core 2.53GHz Xeon X5670 six cores 2.93GHz Xeon X7560 eight cores 2.26GHz
database cores 64 96 128
database RAM 576GB 768GB 2TB
Storage cells 14 x SunFire X4275 14 x SunFire X4270 M2 14 x SunFire X4270 M2
storage cell CPUs Xeon E5540 quad core 2.53GHz Xeon L5640 six cores 2.26GHz Xeon L5640 six cores 2.26GHz
storage cells CPU cores 112 168 168
IO performance & capacity 15K RPM 600GB SAS or 2TB SATA 7.2K RPM disks 15K RPM 600GB SAS (HP model – high performance) or 2TB SAS 7.2K RPM disks (HC model – high capacity)
Note that 2TB SAS are the same old 2 TB drives with new SAS electronics. (Thanks Kevin Closson for ref)
15K RPM 600GB SAS (HP model – high performance) or 2TB SAS 7.2K RPM disks (HC model – high capacity)
Note that 2TB SAS are the same old 2 TB drives with new SAS electronics. (Thanks Kevin Closson for ref)
Flash Cache 5.3TB 5.3TB 5.3TB
Database Servers networking 4 x 1GbE x 8 servers = 32 x 1GbE 4 x 1GbE x 8 servers + 2 x 10GbE x 8 servers = 32 x 1Gb + 16 x 10GbEE 8 x 1GbE x 2 servers + 8 x 10GbE x 2 servers = 16 x 1Gb + 16 x 10GbEE
InfiniBand Switches QDR 40Gbit/s wire QDR 40Gbit/s wire QDR 40Gbit/s wire
InfiniBand ports on database servers (total) 2 ports x 8 servers = 16 ports 2 ports x 8 servers = 16 ports 8 ports x 2 servers = 16 ports
Database Servers OS Oracle Linux only Oracle Linux (possible Solaris later, still unclear) Oracle Linux or Solaris x86

x2-8 has fewer but way bigger database servers. That means that x2-8 will scale better with the less RAC overhead for the databases. The bad news is that if one database server fails or down for maintenance, 50% of capacity is gone. What does that mean? It means that Exadata x2-8 is designed more for multi-rack deployments so that you can go beyond “simple” 2 node RAC. Some folks argue that two node RAC is less reliable for evictions and etc but you probably don’t know that Exadata has special IO fencing mechanism that makes it much more reliable.

Because there is 4 times more RAM in Exadata x2-8, more and more operations can be done fully in memory without even going to storage cells. This is why boost in number of cores / CPU performance is important — since InfniBand bandwidth stays the same, you need some other way to access more data so having more data on buffer cache will keep more CPU cores busy.

With Exadata x2-2, processing capacity on database servers increased and RAM increase is insignificant. So how does it impact “well-balanced” Exadata v2? Well, if more and more operations are offloaded to storage cells then database servers could have more “useful” data pumped in over InfniBand and actually spend CPU cycles processing the data rather then filtering it. With Exadata v2, depending on the compression level, CPU was often a bottleneck on data loads so having some more CPU capacity on database tiers won’t harm.

Old configuration v2 will not be available so be ready to spend more on Oracle database licenses unless you are licensed under ULA or something.

Both Exadata x2-8 and x2-2 will run updated Oracle Linux 5.5 with Oracle Enterprise Kernel. x2-8 can also run Solaris x86 on database servers as expected. This confirms my assumption that if Oracle adds Solaris x86 into Exadata, it will prove that Oracle is fully committed to Solaris Operating System. A rather pleasant news to me! However, Solaris 11 Express is not available right now and probably will be available towards the end of this calendar year.

If you look at x2-2 and x2-8 side by side physically, you will see that four 1U databases servers of x2-2 basically replaced by one 5U database server in x2-8 in terms of space capacity. There are also more internal disks in those bigger servers and more power supplies so they are more redundant.

More processing power on storage servers in x2-8 and x2-2 (not dramatically more but definitely noticeable) will speed up smart scans accessing data compressed with high level. As more and more operations can be uploaded to the storage cells, boost in CPU capacity there is quite handy. Note that this doesn’t impact licensing in any way — Exadata Storage Server Software is using number of physical disk spindles as the licensing metric.

Regarding claims of the full database encryption — need to understand how it works and what are the improvements. Oracle Transparent Data Encryption was available on Exadata v2 but had many limitations when using with other Exadata features. I assume that Exadata x2-x addresses those but need to follow up on details so stay tuned. I believe that customers of Exadata v2 will be able to take advantage of all new Exadata software features – the platform architecture hasn’t changed.

Wish List of Oracle OpenWorld 2010 Announcements: Exadata v3 x2-8, Linux, Solaris, Fusion Apps, Mark Hurd, Exalogic Elastic Cloud, Cloud Computing

It’s Sunday morning early in San Francisco and the biggest ever Oracle OpenWorld is about to start. It looks like it’s also going to be the busiest ever OpenWorld for me — my schedule looks crazy and I still need to do the slides for my Thursday sessions (one on ASM and one on cloud computing). Fortunately, my slides for today’s presentation are all ready to go.

OK. Don’t let me carry away — I started this post with the intention to write about what I expect Oracle to announce at this OpenWorld and it seems like the most important announcements happen at tonight’s keynote. I hasn’t been at the Oracle ACE Directors briefing so unlike them, all I can say is pure speculation-based and my wishes of what should be covered. Actually, unlike them, I actually CAN say at least something. :)

  1. Oracle Exadata Database Machine v3 (x2-8) — well, that shouldn’t come as a surprise to anybody by now. I fully expect upgrade of the hardware — new Intel CPUs (probably with more cores), more memory, possibly more flash (this technology moves really quick these days). Maybe 10GbE network can be introduced to address some of the customers demands but I don’t think it’s needed that much. InfiniBand might just stay as it is — I think there is enough throughput but Marc Fielding noted that moving InfiniBand to the next speed level shouldn’t be very expensive. Other then cosmetic upgrade, I believe that hardware architecture will largely stay the same — it works very well, it’s proven and very successful. Maybe something should be done to let customers integrate Exadata better into their data-centers — folks keep complaining of inflexibility (and I think Oracle should stay firm on this and don’t let customer screw themselves up but who knows).
    On the software side, I expect new Exadata Storage Software release announcement that will be able to offload more and more on the storage side. The concept of moving data intensive operation closed to the disks has proven to be very effective. I also expect to have more Exadata features for consolidation. If you didn’t notice, database release few days ago has Exadata specific QoS (Quality of Service) feature. I think this is what’s going to make Exadata to be a killer consolidation platform for the databases — true private cloud for Oracle databases or a true grid as Oracle insists calling it’s private cloud idea. Speaking about software… hm — see Linux and Solaris below.
    And back to consolidation, there must be the new platform similar to Exadata that integrates Oracle hardware and software and that should fill the gap as a consolidation offering for anything else but databases — Fusion Middleware, Fusion Apps and whole lineup of Oracle software. Whether it’s going to have Exadata in its name — I don’t know. It’s going to be names Exalogic Elastic Cloud. It would make sense to support both Solaris and Linux virtualization technologies on that new platform.
    Oh, and I hope to see Oracle start offering vertical focused solutions based on Exadata. Like Retail Database Machine. Maybe it won’t come at the OpenWorld but I think it would be a good move by Oracle.
  2. Solaris and Linux — I’ve been preaching for a while that having acquired Solaris engineering team, it would be insane not to take over Linux distribution from RedHat and start providing truly Oracle Linux. I was expecting Oracle to do that for a while. Either that or change Oracle’s commitment from Linux to Solaris on x86 platform. If Oracle is serious about Solaris now then the best indication of that would be Solaris x86 powered Exadata. In other words, the future of Linux and Solaris at Oracle should be made clear during this OpenWorld.
  3. Fusion Apps — god, I really hope something will be out. After all those years talking about it, I can’t stand anymore hearing about the ghost product (or line of products). I think it’s also confirmed by Debra Lilley’s increased activity over the past year — she is buzzing unusually strong about it. ;-) Of course, Fusion Apps will be all about integration of zillion of Oracle products into one system (which is a very difficult task). Oh, and if Fusion Apps are announced then they will run best on Exadata, of course. Oracle Fusion Apps Machine?
  4. Mark Hurd — finally, I’d be very keen to see the first serious public appearance of Mark Hurd as Oracle’s co-president. I think he will set the tone for the future of Oracle’s hardware business. So far it’s been all about profitability which is probably the best thing Oracle could do with otherwise dead Sun hardware business.

That’s all. I’m sure there will be more. I didn’t mention SPARC and that’s not because I forgot.

This OpenWorld promises to be very interesting!

Kernel NFS fights back… Oracle throughput matches Direct NFS with latest Solaris improvements

After my recent series of postings, I was made aware of David Lutz’s blog on NFS client performance with Solaris.  It turns out that you can vastly improve the performance of NFS clients using a new parameter to adjust the number of client connections.

root@saemrmb9> grep rpcmod /etc/system
set rpcmod:clnt_max_conns=8

This parameter was introduced in a patch for various flavors of Solaris.  For details on the various flavors, see David Lutz’s recent blog entry on improving NFS client performance.  Soon, it should be the default in Solaris making out-of-box client performance scream.

DSS query throughput with Kernel NFS

I re-ran the DSS query referenced in my last entry and now kNFS matches the throughput of dNFS with 10gigE.

Kernel NFS throughput with Solaris 10 Update 8 (set rpcmod:clnt_max_conns=8)

This is great news for customers not yet on Oracle 11g.  With this latest fix to Solaris, you can match the throughput of Direct NFS on older versions of Oracle.  In a future post, I will explore the CPU impact of dNFS and kNFS with OLTP style transactions.

Posted in Oracle, Storage Tagged: 11g, 7410, analytics, database, dNFS, NAS, NFS, Oracle, performance, Solaris, Sun, tuning