Thinking about automation a lot, especially in the context of cloud computing, I have decided to create a small series of related posts that hopefully help someone deploying environments in a fully automated way. As my colleague @fritshoogland has said many times: the only way to deploy database servers (or any other server for that matter) in a consistent way, is to do it via software. No matter how hard you try to set up 10 identical systems manually, there will be some, maybe ever so subtle, differences between them. And with the cloud you probably have 42 systems or more to deal with! In this context, my first post could be a building block: the provisioning of the Oracle RDBMS the form of an RPM.
In a nutshell, I would like to
In the previous blog post I used libvirt and KVM in my lab environment to simulate a transient disk failure and how to recover from it. This post takes this example a step further: I am simulating another disk failure, but this time won’t pretend I can fix the issue and put it back. In other words, I simulate the effect of the disk_repair_time hitting zero.
Most of what I am covering here is an extension of the previous post, I’ll mention the main detail here for your benefit, but would like to invite you to revert to the previous post for more detail.
The idea is to show you the output of the ASM alert.log and result of the lost disk in the V$-views.
As with the previous post, the code examples in this one are for demonstration purposes only!
For quite some time we have been treated nicely by ASM when it comes to transient disk failures. Since 11.1 (if memory serves me right), transient failures won’t cause an ASM disk to be dropped immediately. This is good, because it can potentially save a lot of time! When a disk is dropped from an ASM disk, a rebalance operation is unavoidable. And there is more than a chance of another one following it, because ultimately you want to add the disk back to the disk group. Well, to be fair, this is only applicable for normal or high redundancy disk groups.
The feature I just described very briefly is referred to as ASM Fast Mirror Resync, and documented in the ASM Administrator’s Guide in section Administering Oracle ASM Disk Groups.
I have already written about the use of Connection Manager 11.2 to govern access to a database. While researching this piece I have updated the original post so that it’s relevant for 12c as well.
Although the idea of using Connection Manager is tempting, the obvious first question is about high availability of the process, you don’t want it to become a Single Point of Failure (SPOF). After all, if the gatekeeper to your database environment fails, you are effectively closed for business. One option would be to add another CMAN process that could be used.
I have created a few VMs to simulate the scenario. I installed CMAN in 2 VMs running Oracle Linux 7 and CMAN 12c. The client homes have been patched up to the April PSU because that’s considered good practice (although I couldn’t find a reference to CMAN in the OPatch logs). The hosts are named oracledev and server4.
In the first part of this little series I explained how a missing (default) tablespace in a PDB can have an impact when opening it. In this instalment I’ll add another common cause for PDBs not opening in read-write mode: different patch levels between the PDB to be plugged in and the CDB.
Assume for a moment that you upgraded Oracle binaries from 22.214.171.124.5 to 126.96.36.199.160419. In other words, you applied the most current PSU + OJVM patch at the time of writing to your database. Once complete, you would see output similar to this for a single instance Oracle 12c installation without Oracle Restart:
[oracle@server2 ~]$ opatch lspatches 22674709;Database PSU 188.8.131.52.160419, Oracle JavaVM Component (Apr2016) 22291127;Database Patch Set Update : 184.108.40.206.160419 (22291127) OPatch succeeded.
I am using database CDB1 in this blog post to indicate the patched CDB:
Container Databases have been an area that I have researched intensively over the past years. With this post (and hopefully some others that follow) I would like to demonstrate some of the new situations the DBA might be confronted with. Please don’t use this post to give the new 12c architecture a hard time: standardised deployments (which I love) help you a lot. Not only do your DBA scripts work reliably everywhere, but the error condition I am showing in this post should be a lot less likely.
At the end of the post I’ll show an alternative approach using a standardised way of creating PDBs.
Setting the scene, my environment is as follows:
I have written about the importance of automation in previous posts, and this one is following the same line. This time I wanted to test and document my experience with “golden database templates”. Wy? Because most DBAs will appreciate that there are more interesting things to do than to run “create database” scripts all day, so why not automate the process? For quite some time DBCA, or the Database Creation Assistant offers you the option to create templates. One approach you could use would be to
All of this is done in hope to make life for the administrators a lot easier.
This is just another short post about one of the little things worth knowing. Assume you are on 220.127.116.11 for example and you want to know if parallel_degree_limit reliably keeps the degree of parallelism at bay, for all sessions and queries across the board. After all, every machine has finite CPU (and other) resources that shouldn’t be exhausted. Setting parallel_degree_limit of course only has an effect if you set parallel_degree_policy to limited, auto or adaptive. The latter is the Big Knob to enable Automatic Degree Of Parallelism (DOP), and the post assumes that Auto DOP is enabled for the session. In this configuration, what happens to hinted statements?
The documentation (12c Reference Guide, section on parallel_degree_limit) states that you can provide an integer in addition to “CPU” and “IO”:
UPDATE 28-JUL-2016: the problem – as stated by Anil in the comments section – seems to be fixed. I applied the Proactive Bundle Patch 23273686 and OJVM patch 23177536 to the Grid and RDBMS home. After a successful execution of datapatch the errors in the CRSD agent log I reported in this post went away, and I can also see the settings in the database alert.log.
Scroll to the bottom of the post for more detail.
For those not yet on that patch level-the original post
While preparing to teach a class this week I have had the time to look into the In Memory (cost) option a bit closer. I noticed a few interesting things along the way and would like to share one of these here.
One of the questions I was asking myself was:
“What happens if I scan segments that are within the IM area, and some are not?”
I was primarily thinking of joins in a DWH environment, but in order to keep the test case simple enough and reproducible I decided to go with a partitioned table where the current partition is assigned to the IMCS, and the rest is not. For this test I’m relying on the Swingbench SH schema. All of this runs on my laptop in a VM so I had to be selective when it comes to indexes. I also chose to NOT partition the tables at this stage, I wanted to chose my own partitioning scheme. For reference, here is the command that created the SH schema: