I just realised this week that I haven't really detailed anything about policy-managed RAC databases. I remembered having done some research about server pools way back when 11.2 came out, and I promised to spend some time looking at the new type of database that comes with server pools: policy-managed databases. Somehow I never got around to doing it. Since I'm lazy I'll refer to these databases as PMDs from now on, as it saves a fair bit of typing.
So how are PMDs different from Administrator Managed Databases?
First of all, you can have PMDs with RAC only, i.e. in a multi-instance active/active configuration. Before 11.2 RAC you had to tie an Oracle instance to a cluster node. This is why you see instance prefixes in a RAC spfile. Here is an example from my lab 11.2 cluster:
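Something along these lines illustrates the difference. The database name RAC, the policy-managed database PMD, the server pool name batchpool and its sizing are all made up for this sketch rather than taken from the cluster, so treat it as an outline only:

# Admin-managed database: instance-specific parameters in the spfile carry
# the ORACLE_SID prefix, tying instance RAC1 to node 1 and RAC2 to node 2.
sqlplus -S / as sysdba <<'EOF'
select sid, name, value
from   v$spparameter
where  name in ('instance_number', 'thread', 'undo_tablespace')
and    sid <> '*';
-- expect entries such as RAC1.instance_number=1 and RAC2.instance_number=2
EOF

# Policy-managed database: define a server pool with a minimum and maximum
# size, then register the database against the pool rather than named nodes.
srvctl add srvpool -g batchpool -l 2 -u 3 -i 10
srvctl add database -d PMD -o $ORACLE_HOME -g batchpool
srvctl config database -d PMD   # reports the server pool instead of fixed instances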
Some days are just too good to be true :) I ran into an interesting problem trying to install 11.2 Grid Infrastructure for a two-node cluster. The storage was presented via iSCSI, which turned out to be a blessing and the inspiration for this blog post. So far I haven't found out how to create "shareable" LUNs in KVM the same way I did successfully with Xen. I wouldn't recommend general-purpose iSCSI for anything besides lab setups though. If you want network-based storage, go and use 10Gbit/s Ethernet and either FCoE or (direct) NFS.
Here is my setup. Storage is presented as three targets using tgtd on the host:
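In rough outline the target definitions look like this. The IQNs and backing logical volumes are made up for this sketch, and the wide-open ACL is for lab use only:

# present three iSCSI targets from the KVM host with tgtd/tgtadm
service tgtd start

# target 1: OCR and voting disks
tgtadm --lld iscsi --op new --mode target --tid 1 --targetname iqn.2012-07.com.example:ocrvote
tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 --backing-store /dev/rootvg/ocrvote01

# target 2: DATA disk group
tgtadm --lld iscsi --op new --mode target --tid 2 --targetname iqn.2012-07.com.example:data
tgtadm --lld iscsi --op new --mode logicalunit --tid 2 --lun 1 --backing-store /dev/rootvg/data01

# target 3: FRA disk group
tgtadm --lld iscsi --op new --mode target --tid 3 --targetname iqn.2012-07.com.example:fra
tgtadm --lld iscsi --op new --mode logicalunit --tid 3 --lun 1 --backing-store /dev/rootvg/fra01

# allow the cluster nodes to log in (no restrictions, lab only)
for tid in 1 2 3; do
  tgtadm --lld iscsi --op bind --mode target --tid $tid --initiator-address ALL
done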
So this is a little bit of a plug for myself and Enkitec, but I'm running my Grid Infrastructure And Database High Availability Deep Dive Seminars again for Oracle University. This time the events are online, so there is no need to come to a classroom at all.
Here is the short description of the course:
Providing a highly available database architecture fit for today’s fast changing requirements can be a complex task. Many technologies are available to provide resilience, each with its own advantages and possible disadvantages. This seminar begins with an overview of available HA technologies (hard and soft partitioning of servers, cold failover clusters, RAC and RAC One Node) and complementary tools and techniques to provide recovery from site failure (Data Guard or storage replication).
This might be something very obvious to the reader, but I had an interesting revelation recently when implementing parallel_degree_limit_p1 in a resource consumer group. My aim was to prevent users mapped to a resource consumer group from executing any query in parallel. The environment is fictional, but let's assume that maintenance operations, for example, leave indexes and tables decorated with a PARALLEL x attribute. Another common case is restricting PQ resources for certain users to prevent them from using all of the machine's resources.
This can happen when you rebuild an index in parallel to speed the operation up, for example. However, the DOP stays with the index after the maintenance operation, and you have to set it back explicitly:
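A minimal sketch of both points; the schema, index, plan and consumer group names are made up for this illustration:

sqlplus -S / as sysdba <<'EOF'
-- the parallel rebuild speeds up the maintenance operation ...
alter index martin.i_demo rebuild parallel 8;

-- ... but the index keeps DEGREE = 8 afterwards
select degree from dba_indexes where owner = 'MARTIN' and index_name = 'I_DEMO';

-- so set it back explicitly
alter index martin.i_demo noparallel;

-- and the resource manager side: cap the consumer group at a DOP of 1
begin
  dbms_resource_manager.create_pending_area;
  dbms_resource_manager.create_consumer_group('NO_PQ_GROUP', 'no parallel query');
  dbms_resource_manager.create_plan('LIMIT_PQ_PLAN', 'limit parallel execution');
  dbms_resource_manager.create_plan_directive(
    plan                     => 'LIMIT_PQ_PLAN',
    group_or_subplan         => 'NO_PQ_GROUP',
    comment                  => 'serial execution only',
    parallel_degree_limit_p1 => 1);
  dbms_resource_manager.create_plan_directive(
    plan             => 'LIMIT_PQ_PLAN',
    group_or_subplan => 'OTHER_GROUPS',
    comment          => 'everyone else');
  dbms_resource_manager.validate_pending_area;
  dbms_resource_manager.submit_pending_area;
end;
/
EOF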
This is pretty much a note to myself on how to set up Data Guard broker for RAC 11.2+. The tests have been performed on Oracle Linux 5.5 with the Red Hat kernel; the database version was 11.2. Sadly my lab server didn't support more than 2 RAC nodes, so everything has been done on the same cluster. It shouldn't make a difference though; if it does, please let me know.
WARNING: there are some rather deep changes to the cluster here; be sure to have proper change control around such amendments, as they can cause outages! Nuff said.
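A heavily condensed sketch of what the setup boils down to. The db_unique_names prima and stdby, the disk group and the connect identifiers are made up here; the RAC-specific bit is that the broker configuration files have to live on shared storage (ASM in this case) so every instance can see them:

# on the primary (repeat on the standby with its own paths)
sqlplus -S / as sysdba <<'EOF'
alter system set dg_broker_config_file1='+DATA/prima/dr1prima.dat' sid='*';
alter system set dg_broker_config_file2='+DATA/prima/dr2prima.dat' sid='*';
alter system set dg_broker_start=true sid='*';
EOF

# create the broker configuration and add the standby
dgmgrl / <<'EOF'
create configuration 'racdg' as
  primary database is 'prima' connect identifier is 'prima';
add database 'stdby' as connect identifier is 'stdby' maintained as physical;
enable configuration;
show configuration;
EOF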
A few days ago I wrote about my new lab server and the misfortune with kernel UEK (aka 2.6.32 plus backports). It simply wouldn't recognise all the memory in the server:
# free -m
             total       used       free     shared    buffers     cached
Mem:          3385        426       2958          0          9        233
-/+ buffers/cache:         184       3200
Swap:          511          0        511
Ouch. Today I gave it another go, especially since my new M4 SSD has arrived. My first idea was to upgrade to UEK2. And indeed, following the instructions on Wim Coekaerts’s blog (see references), it worked:
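The procedure boils down to something like the following sketch. I'm assuming Oracle Linux 6 and the public-yum repository here; the repository id is an assumption, so check Wim's post and your repo file for the correct channel name:

cd /etc/yum.repos.d
wget http://public-yum.oracle.com/public-yum-ol6.repo

# enable the channel carrying the UEK2 kernel, then pull it in
yum --enablerepo='ol6_UEK*' -y install kernel-uek kernel-uek-firmware

# confirm the new kernel is the grub default, then reboot into it
grep -A1 ^default= /boot/grub/grub.conf
reboot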
Thanks to the kind introduction from Kevin Closson I was given the opportunity to benchmark the Virident PCIe flash cards. I have written a little review of the testing conducted, mainly using SLOB. To my great surprise Virident gave me access to a top-of-the-line Westmere-EP system (2s12c24t) with lots of memory.
In summary, the testing shows that the "flash revolution" is happening, and that there are lots of vendors out there building solutions for HPC and Oracle database workloads alike. Have a look at the attached PDF for the full story if you are interested. When looking at the numbers, please bear in mind that it was a two-socket system! I'm confident the server could not max out the cards.
This is as much a note to myself on how to do this in the future as it is, hopefully, something worth reading for you. The requirement was as precise as always: migrate a database from 10.2 on SPARC to 11.2 on Linux. In the process, go from Veritas to ASM, and make it quick!
I like short briefings, but this one was too short. Since the database was reasonably large I opted for the transportable tablespace approach; however, I now think that a massively parallel impdp over a network_link could have saved me quite a bit of time.
The following is by no means the complete story, but it hopefully gives you an idea of how to do these things. Always check and document, then test (rinse and repeat). Only when proper sign-off has been received should you try such a process in production. Remember to script it and have at least one clean run of the scripts! This process is not super quick; if you have low downtime requirements, consider Streams or, better, GoldenGate.
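In outline, and heavily simplified, the process looks roughly like the sketch below. Tablespace, directory and file names are made up, and since SPARC is big endian while Linux x86-64 is little endian the data files need an RMAN conversion on the way:

# on the 10.2 SPARC source: check the set is self-contained, then make it read only
sqlplus -S / as sysdba <<'EOF'
exec dbms_tts.transport_set_check('APP_DATA,APP_INDEX', true);
select * from transport_set_violations;
alter tablespace app_data read only;
alter tablespace app_index read only;
EOF

# export the tablespace metadata
expdp system dumpfile=tts.dmp directory=mig_dir transport_tablespaces=APP_DATA,APP_INDEX

# endian conversion (shown here on the source; convert datafile on the target works too)
rman target / <<'EOF'
convert tablespace app_data, app_index
  to platform 'Linux x86 64-bit'
  format '/u01/stage/%U';
EOF

# on the 11.2 Linux target, after transferring the converted files into ASM
impdp system dumpfile=tts.dmp directory=mig_dir \
      transport_datafiles='+DATA/mig/app_data01.dbf','+DATA/mig/app_index01.dbf'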
In preparation for a research project and potential UKOUG conference papers I am researching the effect of NUMA on x86 systems.
NUMA is one of the key concepts to understand in modern computer organisation, and I recommend reading "Computer Architecture: A Quantitative Approach" by Hennessy and Patterson (make sure you grab the fifth edition). Read the chapter about cache optimisation and also the appendix about the memory hierarchy!
Now why should you know about NUMA? First of all, there is an increasing number of multi-socket systems. AMD pioneered the move to many cores, but Intel is not far behind. Although AMD currently leads in the number of cores ("modules") on a die, Intel doesn't need to match it: the Sandy Bridge-EP processors are far more powerful in a core-for-core comparison than anything AMD has at the moment.
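If you want to see what this means on your own kit, the NUMA layout and the local versus remote memory counters are easy to inspect; numactl and numastat ship in the numactl package on Oracle Linux:

# nodes, CPUs per node, memory per node and the node distance matrix
numactl --hardware

# numa_hit / numa_miss / numa_foreign counters per node
numastat

# run a workload confined to node 0's CPUs and memory to compare local with remote access
numactl --cpunodebind=0 --membind=0 dd if=/dev/zero of=/dev/null bs=1M count=1024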
Prompted by a recent interview, I wanted to perform a little test with RMAN and incrementally updated backups. I created an 11.2 database on my Linux (OEL 5.5) test system and refreshed my understanding of this most useful RMAN feature (which is for another post).
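For reference, the incrementally updated image copy approach boils down to a small script along these lines, typically run once a day. The tag name is made up here, and the copies and incremental backups go to the fast recovery area:

rman target / <<'EOF'
run {
  # roll the existing image copies forward using the previous level 1 backup
  recover copy of database with tag 'incr_update';
  # take today's level 1; the very first run creates the level 0 image copies instead
  backup incremental level 1 for recover of copy with tag 'incr_update' database;
}
EOF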
So after I was happy with how the incrementally updated image copies worked, I decided to see whether I could restore my database from them. Time to use "drop database", which removes the spfile, data files, temp files and control files. Tabula rasa!
But it didn't matter, I thought: I had controlfile/spfile autobackups and fully recovered image copies, plus all the archived logs in the FRA. What could possibly go wrong? Well, it took me 30 minutes to get the database back.
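For completeness, the kind of restore this involves looks roughly like the sketch below. The DBID is made up, and if the autobackups live in the FRA you may need to point RMAN at that location explicitly:

rman target / <<'EOF'
startup nomount;                       # RMAN starts a dummy instance if no spfile exists
set dbid 1234567890;                   # no mounted controlfile, so identify the database
restore spfile from autobackup;
startup force nomount;                 # restart with the restored spfile
restore controlfile from autobackup;
alter database mount;
switch database to copy;               # use the image copies in the FRA directly
recover database;
alter database open resetlogs;
EOF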