I posted a fair amount of stuff on how Oracle is generating IOs, and especially large IOs, meaning more than one Oracle block, so > 8KB. This is typically what is happening when the Oracle database is executing a row source which does a full segment scan. Let’s start off with a quiz: what you think Oracle is the maximum IO size the Oracle engine is capable of requesting of the Operating System (so the IO size as can be seen at the SCI (system call interface) layer? If you made up your answer, remember it, and read on!
The real intention of this blogpost is to describe what is going on in the Oracle database kernel, but also what is being done in the Linux kernel. Being a performance specialised Oracle DBA means you have to understand what the operating system does. I often see that it’s of the utmost importance to understand how an IO ends up as a request at the NAS or SAN head, so you understand what a storage admin is talking about.
Prompted by an actual task at hand I spent some time investigating an 126.96.36.199 feature – concurrent statistics gathering. It has been on my to-do list for quite some time but so far I didn’t have a use case, and use cases make it so much easier. The question was-how can I gather statistics on a really large, partitioned table? Previously, you could revert to the degree in dbms_stats.gather_table_stats to ensure that statistics were gathered in parallel. This is all good, but sometimes you need more umph. Some DBAs wrote scripts to execute individual statistic gathering jobs against partitions in parallel, using the tabname and partname arguments in dbms_stats.gather_table_stats(). But that requires manual effort – and the not-quite-so-new concurrent option is so much nicer. Let me take you along the ride… Actually I have to tell the story starting with the happy ending as I had a few snags along the way. This is 188.8.131.52.1 on Oracle Linux 6.5.
If you ever wanted to know how Clusterware works with registered database resources, read on! It takes a little while to get your head around the concepts of the ORACLE_SID, the instance_name and the database name as well. And how Clusterware deals with all of them. Although this post has been written on 184.108.40.206.0 on Linux, it should be applicable to 11.2 Clusterware as well. Oh and by Clusterware I mean Grid Infrastructure of course ;)
Why would you want to care?
Most deployments I have seen use the simple formula: ORACLE_SID = instance_name = db_name, especially in single instance deployments. RAC One Node and RAC databases are slightly different as their instances are usually named db_name where is the n-th instance in the cluster. What however, if you want to have separate SID, instance name and database names? I keep things simple for now and don’t throw in a different db_unique_name…
My lab server has 2 SSDs, one is connected using SATA 2 and another is connected using SATA 3. I’d expect the SATA 3 connected device to be equally well equipped or even better to do work than the “old” interface. I ran SLOB on these devices to find out if there was a difference. To my great surprise the SATA2 – connected SSD performed a lot better than the SATA 3 device, as shown in the AWR report! Initially I was not entirely sure why, since the FIO results on both devices are roughly equal. You will see why though when reading this post. In summary: use XFS for any concurrent writes. Or maybe ASM.
Let’s do a little I/O investigation because a) it’s cool and b) you can.
This is a quick post on using git on a server. I use my Synology NAS as a fileserver, but also as a git repository server. The default git package for Synology enables git usage on the command line, which means via ssh, or via web-DAV. Both require a logon to do anything with the repository. That is not very handy if you want to clone and pull from the repository in an automated way. Of course there are ways around that (basically setting up password-less authentication, probably via certificates), but I wanted simple, read-only access without authentication. If you installed git on a linux or unix server you get the binaries, but no daemon, which means you can only use ssh if you want to use that server for central git repositories.
Some of you might have followed the discussion around the number of standby redo logs on twitter, but since 140 characters are woefully short for the complete story here’s the writeup that prompted the question. This is a test with 220.127.116.11 on virtualised Linux, repeated on a proper platform with physical hardware.
First of all here’s my setup. I have a dbca-based database (CDB, but doesn’t matter) that features 3 groups for its online redo logs. They are all 50 MB in size-important for this test, but not realistic :) Following the Oracle documentation I created n + 1 groups (per thread) on the standby to stop Data Guard broker from complaining about missing standby redo logs (SRL).
The end result was positive, here’s what the broker thinks:
Followers of my blog and website know I play around with installations on Fedora for fun. All of my installation guides on Fedora come with a link at the top that points to this disclaimer.
A few times recently I’ve been contacted by people saying their boss, teacher or customer is insisting they install Oracle on Fedora. Rather than repeat myself, I’ve added another point at the bottom of this disclaimer that reads:
This post looks like I am jumping on the bandwagon of IT orchestration like a lot of people are doing. Maybe I should say ‘except for (die hard) Oracle DBA’s’. Or maybe not, it up to you to decide.
Most people who are interested in IT in general will have noticed IT orchestration has gotten attention, especially in the form of Puppet and/or Chef. I _think_ IT orchestration has gotten important with the rise of “web scale” (scaling up and down applications by adding virtual machines to horizontal scale resource intensive tasks), in order to provision/configure the newly added machines without manual intervention, and people start picking it up now to use it for more tasks than provisioning of virtual machines for web applications.
(the details are investigated and specific to Oracle’s database implementation on Linux x86_64)
Exadata IO: This event is not used with Exadata storage, ‘cell single block physical read’ is used instead.
Despite p3 listing the number of blocks, I haven’t seen a db file sequential read event that read more than one block ever. Of course this could change in a newer release.
This is probably as much a note-to-self as it can possibly be. Recently I have enjoyed some more in-depth research about how the Linux kernel works. To that extent I started fairly low-level. Theoretically speaking, you need to understand the hardware-software interface first before you can understand the upper levels. But in practice you get by with less knowledge. But if you are truly interested in how computers work you might want to consider reading up on some background. Some very knowledgable people I deeply respect have recommended books by David A. Patterson and John L. Hennessy. I have these two: