strace is a Linux utility for tracing system calls. Using strace, you can see the system calls that a process executes, in order to investigate its inner workings or performance. In my presentation about multiblock reads I put the text ‘strace lies’. This is NOT correct. My current understanding is that strace does show every system call made by an executable. So…why did I make that statement? (editorial note: this article dives into the inner workings of Linux AIO)
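As a small illustration of working with strace output, the sketch below tallies syscall names from lines in strace's usual `name(args) = result` format. The sample lines and the helper name `count_syscalls` are made up for this example; they are not taken from a real trace.

```python
import re
from collections import Counter

# Matches the syscall name at the start of a strace line, e.g. "read(3, ...) = 512".
# An optional leading PID (as printed when following child processes) is skipped.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")

def count_syscalls(lines):
    """Return a Counter of syscall names found in strace-style lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines, invented for this example.
SAMPLE = [
    'open("/etc/hosts", O_RDONLY) = 3',
    'read(3, "127.0.0.1 localhost\\n", 4096) = 20',
    'read(3, "", 4096) = 0',
    'close(3) = 0',
    '+++ exited with 0 +++',  # not a syscall line, so it is ignored
]

if __name__ == "__main__":
    for name, n in count_syscalls(SAMPLE).most_common():
        print(name, n)
```

A tally like this only summarises what strace reports; it says nothing about work a process does without entering the kernel.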
For some time now, I have been using gdb to trace the inner workings of the Oracle database. The reason for using gdb instead of systemtap or Oracle’s dtrace is the lack of user-level tracing on Linux. I am using this on Linux because most of my work happens on Linux.
To see the same information about Oracle’s system calls with gdb as you get from strace, there’s the Oracle debug info repository. This requires a bit of explanation. When strace is used on a process doing IO that Oracle executes asynchronously, the IO calls as seen with strace look something like this:
A reader asked an interesting question yesterday with regard to the previous post on the subject: where did you get your service metrics from when you queried v$servicemetric, the PDB or CDB$ROOT?
I queried the PDB, but this morning repeated the test to make sure the results are consistent, and they are. This is definitely something you’d hope for: you should not have different results in the same v$-view depending on the container you execute your query in for a given CON_ID.
During testing I noticed something interesting though. I queried gv$servicemetric but did not limit the result to the service I wanted to test with (FCFSRV). Here is the query against gv$servicemetric while the system was idle.
This is a follow-up on yesterday’s post about services in the new 12c database architecture. After having worked out everything I needed to know about TAF and RAC 12c in CDBs I wanted to check how FCF works with PDBs today. While investigating I found out that the Runtime Load Balancing Advisory does not seem to work as expected. To double-check, I ran the same test against a 12c non-CDB and noticed that the RTLB is not “broken” there. But I’m getting ahead of myself. First of all, here is my test case:
The service is specifically created to connect against the PDB:
In preparation for the OUGN Spring Seminar, and to finally fulfill at least a part of my promise from July, I was getting ready to research RAC, PDBs and services for my demos. It turned out to be a lot more interesting than I first assumed.
RAC and Multi-Tenancy
So my first attempt to really look at how this works started with my 2-node cluster, where I created a RAC database: RAC12C, administrator managed, with instances RAC12C1 and RAC12C2. The database is registered in Clusterware. Clusterware and RDBMS are patched to the January PSU, i.e. 12.1.0.1.2.
I’m in the process of taking on some of the MySQL databases in my company. The first ones are MySQL 4.1 running on Windows, so we are upgrading them to MySQL 5.6 on Oracle Linux. As with many of our systems, these will be running on VMware virtual machines.
Since the current installations are so old, we are planning on dumping out the data and creating fresh installations on the new systems. Based on the advice I got from Ronald Bradford and Sheeri Cabral, we are also taking this opportunity to switch to InnoDB and utf8, rather than the MyISAM and latin1 that are currently used.
For those of you using Oracle Linux with UEK3, here are a couple of important blog posts that may have passed you by.
Recently I had the pleasure of corresponding with Hans-Peter Sloot. After looking at my simple tool in this post to gather cell IO data from cellcli, he took it several steps further and created a nice Python version that goes to the next level to pull IO statistics from the cells.
This script breaks down the IO by “Small” and “Large”, as is commonly done by Enterprise Manager. It also provides a summary by cell. Here is a sample output from this script.
Well, there’s been a bit of a delay with my planned testing of dbVisit Replicate and Oracle GoldenGate for zero-downtime upgrades. So, I’ll be (hopefully) getting back to that within a couple of weeks.
Meanwhile, I recently ran across a discussion on the Oracle OTN Community forums, asking about performance and hugepages configuration, here in the Oracle Database – General Questions Forum.
I think my answer bears repeating, so, here is a slightly modified version:
First, I’m going to take a strong position on hugepages. I’m going to go as far as to say, for any non-trivial SGA size, if you’re not using hugepages, you’re doing it wrong. There are three main points to consider.
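To make the sizing question concrete, here is a rough sketch of how one might estimate the number of hugepages an SGA of a given size needs, based on the `Hugepagesize` field that Linux reports in `/proc/meminfo`. The function names, the sample meminfo text and the 8 GiB SGA figure are illustrative assumptions, not values from the forum thread; a real configuration has to cover every SGA on the host.

```python
import re

# Hypothetical excerpt in /proc/meminfo format, used so the example runs anywhere.
SAMPLE_MEMINFO = """\
HugePages_Total:       0
HugePages_Free:        0
Hugepagesize:       2048 kB
"""

def hugepage_size_bytes(meminfo_text):
    """Extract Hugepagesize (reported in kB) from /proc/meminfo-style text."""
    match = re.search(r"^Hugepagesize:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("Hugepagesize not found")
    return int(match.group(1)) * 1024

def hugepages_needed(sga_bytes, page_bytes):
    """Round up: a partially used hugepage still has to be reserved whole."""
    return -(-sga_bytes // page_bytes)

if __name__ == "__main__":
    page = hugepage_size_bytes(SAMPLE_MEMINFO)
    sga = 8 * 1024**3  # assumed 8 GiB SGA, purely for illustration
    print(hugepages_needed(sga, page))  # prints 4096
```

On a live Linux system, the contents of `/proc/meminfo` could be passed in instead of the sample text.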
In my previous posts about the first RAC Grid Infrastructure Patchset I documented a few issues I encountered that were worth noting. But where things work as advertised I am more than happy to document that too. In a way, the January 2014 GI PSU works as you’d hope it would (at least in my lab for my 2-node cluster). Well, almost: if you have a non-12.1 database in your environment you might encounter this.
UPDATE: You might want to hold off the application of the PSU in a RAC/GI environment as Mike Dietrich advises.
Admittedly it’s taken from an Oracle Restart (i.e. non-cluster) environment, but I can’t see this not happening in RAC: