Oakies Blog Aggregator

Multitenant : PDB CONTAINERS Clause

I forgot to mention, I put another multitenant article live at the weekend.

I’m not sure I will ever use it, but it’s good to know it’s there.
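For anyone who hasn't come across the clause, here's a minimal sketch of the kind of query it allows (the common user C##TEST and its table T1 are made up for illustration): run from the root container, CONTAINERS() returns a common user's table from the root and every open PDB, with CON_ID telling you which container each row came from.

-- Hypothetical example: connected to CDB$ROOT as a common user.
select con_id, count(*)
from   containers(c##test.t1)
group by con_id
order by con_id;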

I was originally working on an article on a completely different multitenant feature, but the examples I was using highlighted a bug, which kind-of scuppered that article. I’ve raised an SR and I’m waiting on the acknowledgement and possible fix. I’ll hold that article back until the fix is in place.

Cheers

Tim…


Multitenant : PDB CONTAINERS Clause was first posted on August 20, 2014 at 9:14 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

LOB Length

It’s funny how you can make little savings in work all over the place in Oracle if you’re prepared to look a little closely at what’s going on. Here’s a quirky little example with LOBs and function calls that might just have some greater relevance in other situations. Here’s a little data set, and two queries that I might run against it:


create table tbl(
	c1      clob
)
lob (c1) store as c_lob(
	disable storage in row
	nocache nologging
)
;

begin
	for i in 1..128 loop
		insert into tbl values(rpad('x',4000));
		commit;
	end loop;
end;
/

commit;

-- collect stats, prime dictionary cache and library cache
-- run both queries twice and check stats on second run
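-- (A sketch of the stats-collection step implied by the comment above; the
--  original doesn't show the exact call used, so treat this as an assumption.)
begin
	dbms_stats.gather_table_stats(
		ownname    => user,
		tabname    => 'TBL',
		method_opt => 'for all columns size 1'
	);
end;
/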

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

select
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		rownum rn, dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;

The question you might ask yourself when you see these queries is: will they do similar amounts of work? Of course, I wouldn’t be asking the question if the answer were yes. Despite the no_merge() hint, which you might think would have the same effect as the rownum approach, Oracle seems to execute the call to dbms_lob.getlength() twice for each row in the first query, but only once per row for the second. Here are the autotrace statistics from the second run of each query:


Statistics (for no_merge)
----------------------------------------------------------
         40  recursive calls
          0  db block gets
        271  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

Statistics (for rownum)
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        131  consistent gets
          0  physical reads
          0  redo size
        541  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

As you can see, the consistent gets for the no_merge() approach are roughly double those for the rownum approach – and since we have 128 rows/LOBs in the table, that looks suspiciously like 2 gets vs. 1 get per LOB depending on the approach – which suggests two calls to the function. This is further corroborated by the execution plans, and especially by the predicate sections (how often have I said “always check the predicates”), which show that the predicate has been pushed inside the view that’s been hinted to be non-mergeable, but hasn’t been pushed inside the view that uses the rownum instantiation trick:


Execution Plan for no_merge()
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |
|   2 |   VIEW              |      |     6 |    78 |     2   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| TBL  |     6 |   522 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("DBMS_LOB"."GETLENGTH"("C1")>3960)

Execution Plan for rownum
-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     1 |    13 |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE      |      |     1 |    13 |            |          |
|*  2 |   VIEW               |      |   128 |  1664 |     2   (0)| 00:00:01 |
|   3 |    COUNT             |      |       |       |            |          |
|   4 |     TABLE ACCESS FULL| TBL  |   128 | 11136 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("LEN">3960)
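If you want to corroborate the double call directly, rather than inferring it from buffer gets, one option is to wrap the length call in your own counting function (a sketch only – the package is made up for illustration, and a user-written function won’t necessarily be costed or pushed in exactly the same way as dbms_lob.getlength()):

create or replace package call_counter as
	calls	number := 0;
	function lob_len(p_lob clob) return number;
end;
/

create or replace package body call_counter as
	function lob_len(p_lob clob) return number is
	begin
		calls := calls + 1;		-- count every invocation from SQL
		return dbms_lob.getlength(p_lob);
	end;
end;
/

-- Reset the counter, substitute call_counter.lob_len(c1) for
-- dbms_lob.getlength(c1) in one of the queries above, run the query,
-- then report the number of calls made:

exec call_counter.calls := 0

--	(run the modified query here)

exec dbms_output.put_line('calls made: ' || call_counter.calls)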

My first thought on seeing this difference was to apply the /*+ no_push_pred */ hint to block predicate pushing – but then I remembered that the hint is about join predicate pushing and this is a simple filter push. A quick search of the hidden parameters, though, revealed this:

_optimizer_filter_pushdown : enable/disable filter predicate pushdown

Setting this parameter to false – either through a call to ‘alter session’ or through an /*+ opt_param('_optimizer_filter_pushdown', 'false') */ hint – allowed the no_merge approach to produce the same plan and resource usage as the rownum approach. Of course, for a production system, I’d probably use the rownum approach rather than mess around with hidden parameters.
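For reference, the two forms look like this (the hinted version simply repeats the first query from above; I’d treat the hidden parameter as a diagnostic aid rather than a production fix):

alter session set "_optimizer_filter_pushdown" = false;

select
	/*+ opt_param('_optimizer_filter_pushdown','false') */
	round(sum(ceil(len/8100)) * 8/1024,0)    used_mb
from
	(
	select
		/*+ no_merge */
		dbms_lob.getlength(c1) len
	from
		tbl
	)
where
	len > 3960
;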

Footnote:

I don’t know why the code with the no_merge() approach reported 40 recursive calls (on its first execution with autotrace). A couple of variations on the experiment suggested that it had something to do with the number of rows (or consequential buffer visits) that survived the predicate call – for a sufficiently small number of rows the recursive call count happened to drop to zero; but the phenomenon needs further investigation.

Just Don’t Call It Science!

I’ve had two rather frustrating conversations recently with people who have claimed “science” and “proof” where it is totally inappropriate. One was regarding a 3000 year old religious text. The other was a bunch of medical-related blog posts. In both cases, the people were making very valid points and thought they were backing up their stance with “scientific proof”. The problem was, by inappropriately playing the science card, they made themselves and their beliefs sound stupid. Like I said, in both cases they were making very valid points, but their supporting arguments were just plain dumb!

I really do feel most people out there do not have a clue what science and scientific methods actually are. Just to bring this closer to home, I would just like to make this point. Please read this next line and the following explanation before you feel the need to comment…

I don’t think there is an Oracle blogger that I follow that is writing blog posts worthy of being called “science”.

That comment is not meant to shock and offend. I’m not trying to sound elitist, because I certainly rate much lower on the science scale than many other bloggers. The point of that comment is to put things into perspective. The scientific approach to gathering, validating and presenting information is a very formal affair. If you’ve experienced a research-based (not taught) Masters or PhD you will have just scratched the surface of what I’m talking about. If you’ve not experienced science to at least that level, I’m not sure you are really in a position to judge the extent of what people mean when they say “scientific proof”.

At this point I think the audience response will be split amongst these types of reaction.

  • People who saw the word “religious” in the second sentence, instantly jumped to a conclusion about what this post is about and either stopped reading or started to prepare their counter argument.
  • People who have a “favourite blogger” who *they believe* to have a super-scientific approach to Oracle and are desperate to tell me so.
  • People who follow “popular science” and believe the watered down crap they present in those articles and documentaries resembles real science and therefore think they know about science and think I’m talking crap.
  • People who just don’t give a damn and will happily call what they read in blog posts and wikipedia science, regardless of source, scientific approach etc.
  • People who understand what I’m talking about and realise we are all just producing useful and informative content, but are not living up to the rigorous standards that would be required to call our tests and content “science” or “scientific proof”.

I think everyone is entitled to their own opinion, educated or not, but what I find really hard to tolerate is when people start playing the science card where it is not appropriate. Do that and I’m gonna call bullshit every time!

Cheers

Tim… (Not a “Computer Scientist”, “Data Scientist”, “Software Engineer”, “Data Architect” or any of those other bullshit titles that offend the professions they reference)

 


Just Don’t Call It Science! was first posted on August 19, 2014 at 8:58 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

In Search of Plan Stability (Part 1) - August 27 webinar

Register by clicking here.

Hope you'll join me!

Why am I seeing “Result: Failed. Insufficient resource available” messages?

This is an interesting point that came up recently, where someone had a DBaaS environment running on an Exadata system for a couple of months, with no issues.  It has recently started to return errors on the Database Cloud Self Service portal in the “Placement Algorithm” region:

Constraint : Number Of Databases
Current Resource Usage: 18
Total Resource Available: 10
Maximum Limit Defined: 10
Requested Value: 1
Total Resource that can be used: 10
Result: Failed. Insufficient resource available.

But if I grep for PMON processes,  the number of databases running on the machine is 8.  Why would I see such an error?

The reason here is fairly simple – there are a couple of databases that are down, and if we include those, the number of databases on the machine has reached the maximum limit defined. But the explanation needs a bit more detail.

So why do we include databases that are down in this count? Well, even in a non-cloud environment, shutting down a database doesn’t release the storage used by that database (thankfully!). A mere shutdown of the database cannot be assumed to reduce the quota count, as the database may be restarted at any time – either manually by the database administrator, using Enterprise Manager Cloud Control, EM Express or SQL*Plus, or automatically, using tools such as cron or init.d (rc2.d) – and the machine resources will again be used. Over-allocation of quota is generally used to address such requirements, where not all databases will be up and running all the time.

The next question that comes to mind is whether this algorithm is based only on the number of databases, or machine resources such as CPU and memory when the database is down? At a high level, the answer is the algorithm is controlled by 3 factors:

  1. The number of RDBMS instances*: this is the number of instance targets on a specific host. This data is fetched from the Enterprise Manager repository, which is why databases that are down are counted. So if you really want to ensure a database that is shut down does not count towards the total number of databases, then remove the database target from being managed by Enterprise Manager. :)
  2. Memory – there are two flavours here:
    • Memory usage based on the host – calculated as the average memory utilization on the host over the last 7 days
    • Memory usage based on the databases – calculated as the total sum of memory allocated for the databases on the host
  3. CPU – the average CPU utilization over the last 7 days on the host

Hopefully this explains why we describe the overall theme of database placement as ‘Optimistic’. :)
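As an illustration only – this is not the actual EM code – the “Number Of Databases” constraint behaves like a simple quota test against the count of instance targets known to the repository, including the ones that are down. Plugging in the numbers from the portal output above:

-- Hypothetical sketch of the constraint logic, using the values shown earlier.
declare
	max_limit     constant pls_integer := 10;  -- Maximum Limit Defined
	current_usage constant pls_integer := 18;  -- instance targets on the host, up or down
	requested     constant pls_integer := 1;   -- Requested Value
begin
	if current_usage + requested > max_limit then
		dbms_output.put_line('Result: Failed. Insufficient resource available.');
	else
		dbms_output.put_line('Result: Passed.');
	end if;
end;
/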

*You’ll notice I used the term “RDBMS instances” here, NOT “database instance”. There’s a reason for that. It’s one of my pet peeves to see the term “database instance”. In Oracle terms, a database is a set of physical files – the datafiles, redo log files and control files (but not the initialization parameter file, whichever variety – SPFile or the old init.ora text file – you use). An instance is the set of background processes and memory structures in the SGA. While there is most often a one to one mapping between a database and an instance (the exception of course being a Real Application Clusters environment where multiple instances share a single database), there really is no overlap between the term “database” and the term “instance”. Yes, I know our products use the term all the time, and so does our documentation, but to my pedantic way of thinking, they’re WRONG! :)

Weblogic Start Failure: Empty Initial Replica

This post addresses a failed Weblogic start in EM12c.  The fix is performed on a Windows host, but it can easily be applied to any EM12c environment that experiences this issue.  In the instructions below, just replace the Windows environment variable %OMS_BASE% with $OMS_BASE for Linux/Unix.

Issue:  OMS Fails to Start and Points to Weblogic Issue in EMGC_OMS1.out File.

Error in Out File: 

Reason: Failure on create domain from template step, aka “Empty Initial Replica”

How to resolve:

1.  Stop all Oracle processes/services on the host.

2.  Go to:

%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\data

Rename the ldap folder to ldapold.

3.  Go to each of the following folders and remove all folders and files from the directories:

%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\cache
 %OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\tmp
 %OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\logs

4.  Restart all Oracle services/processes for the EM12c environment on the host.

You’ll notice the LDAP folder is recreated with the restart of the Weblogic tier.  This should also resolve the domain creation error and all services should start successfully.



Copyright © DBA Kevlar [Weblogic Start Failure: Empty Initial Replica], All Right Reserved. 2014.

Oracle Midlands : Event #5

Just a quick reminder that Oracle Midlands Event #5 is just around the corner (Tuesday 16 September).

  • Boost Performance by Clustering Data - Martin Widlake
  • Data Virtualisation and Instant Cloning - Ron Ekins (Delphix)

Martin is always good value, so I’m looking forward to that session. I’ve seen a couple of talks on the Delphix stuff and it is seriously cool! I think this session will open a few people’s eyes… :)

Big thanks to the Red Gate Software folks for sponsoring the event, allowing it to remain free.

You can get more details here. I’ve already registered. See you there!

Cheers

Tim…

 


Oracle Midlands : Event #5 was first posted on August 18, 2014 at 10:43 am.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

MERGE and IOT’s ….. unhappy bedfellows

Anyone who has used Oracle for a while will be familiar with the Parent/Child locking "issue" when it comes to tables and indexes on foreign keys.  For many years you’d hear people crying "bug" etc but thankfully most now know the reason, and accept it as sensible behaviour.

But let’s take a look at a slight variation on that theme.

Let’s start with a table called "LOC", which will be our parent table in this example. Note that it is an IOT, and we’ll also have a child table "LOC_CHILD", which is also an IOT.

SQL> CREATE TABLE LOC
  2  (
  3    LOC_ID              NUMBER(4)            NOT NULL,
  4    DATA                     NUMBER(6),
  5    CONSTRAINT LOC_PK
  6    PRIMARY KEY
  7    ( LOC_ID)
  8    ENABLE VALIDATE
  9  )
 10  ORGANIZATION INDEX
 11  /

Table created.

SQL> CREATE TABLE LOC_CHILD
  2  (
  3    CHILD_SEQ                  NUMBER(12)          NOT NULL,
  4    LOC_ID               NUMBER(4)           NOT NULL,
  5    CHILD_DATA  NUMBER(15,6),
  6    CONSTRAINT LOC_CHILD_PK
  7    PRIMARY KEY
  8    (CHILD_SEQ, LOC_ID)
  9    ENABLE VALIDATE
 10  )
 11  ORGANIZATION INDEX
 12  /

Table created.

SQL> insert into LOC
  2  select rownum,50
  3  from dual
  4  connect by level <= 5
  5  /

5 rows created.

Now, being a good DBA :-) we’ve read all the "gloom and doom" nonsense about unindexed foreign keys, so just to be careful, we’ll add that index onto our child table before adding our foreign key back to LOC.

SQL> CREATE INDEX LOC_CHILD_IX ON LOC_CHILD
  2  (LOC_ID)
  3  /

Index created.

SQL> ALTER TABLE LOC_CHILD ADD (
  2    CONSTRAINT LOC_CHILD_FK
  3    FOREIGN KEY ( LOC_ID)
  4    REFERENCES LOC (LOC_ID)
  5    ENABLE VALIDATE)
  6  /

Table altered.

SQL> insert into LOC_CHILD
  2  select rownum,mod(rownum,5)+1,dbms_random.value(1000,5000)
  3  from dual
  4  connect by level <= 2000
  5  /

2000 rows created.

SQL> commit;

Commit complete.

So the scene is set… we’ve got our tables seeded with some data, and we’re ready to go.

Let’s update a row in the parent table LOC:

SQL> UPDATE loc
  2  SET    DATA = 99
  3  WHERE  LOC_ID = 2;

1 row updated.

Now we’ll pop into a new session and update the child table LOC_CHILD:

SQL> MERGE 
  2       INTO  LOC_CHILD
  3       USING (SELECT 500 CHILD_SEQ,
  4                     2 LOC_ID,
  5                     1000 CHILD_DATA
  6                FROM DUAL) M
  7          ON (    LOC_CHILD.CHILD_SEQ = M.CHILD_SEQ
  8              AND LOC_CHILD.LOC_ID = M.LOC_ID)
  9  WHEN MATCHED
 10  THEN
 11     UPDATE SET
 12        LOC_CHILD.CHILD_DATA =  NVL (LOC_CHILD.CHILD_DATA, 0) + M.CHILD_DATA
 13  WHEN NOT MATCHED
 14  THEN
 15     INSERT     (CHILD_SEQ,
 16                 LOC_ID,
 17                 CHILD_DATA)
 18         VALUES (M.CHILD_SEQ,
 19                 M.LOC_ID,
 20                 M.CHILD_DATA);

[stuck]

And splat… we’re stuck.  Without evidence to support it, I’d hypothesize it’s due to the flexibility of the MERGE command.  A single MERGE can insert, update and even delete rows, so I’m guessing that to handle this flexibility (in particular, the DELETE option) the locking errs on the side of safety.
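If you want to see what the stuck session is actually waiting on, a quick look at v$session from a third session will show the blocking relationship (a sketch; I’m not asserting which specific enqueue you’ll see here):

select	sid,
	blocking_session,
	event,
	seconds_in_wait
from	v$session
where	blocking_session is not null;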

Monitoring a Microsoft OS Failover Cluster

I know, I know – none of you are using Microsoft Windows.  This is why I get so many questions on this topic and why there is so much interest in a white paper that no one thought I needed to write.  Well, while that pesky ol’ white paper is in review, I’m going to move on to a secondary topic: how to monitor a Microsoft Active/Passive cluster with Enterprise Manager 12c, Release 4.

There are some disclaimers I’m going to put in this blog post-

1.  Monitoring a clustered host is not explicitly stated as unsupported with EM12c, but there is also some documentation that I’m still poring over that is a bit contradictory… :)

2.  I am only covering MS Windows 2008 R2 server with this article.  It will be quite simple to cover MS Windows 2012 server, but no guarantees on 2003, as the security model for clustering, etc. is very different in the earlier releases.

3.  This is a blog post.  It’s not a MOS note, it’s not a white paper, so remember, this is just me blogging on how I accomplished this and there is no official support documentation behind this topic.

And so here we go!

The concept of OS level clustering can be very foreign to DBAs who’ve spent much of their time with application level clustering (i.e. RAC, Real Application Clusters), but I believe it is important for us to understand different clustering models, along with their benefits and drawbacks.

Microsoft OS level clustering in an Active/Passive configuration includes the following basic physical design:

  • Two hosts, identified by a physical name and physical IP Address
  • A Virtual Cluster name, along with Virtual Cluster address.
  • A quorum disk, (similar to a voting disk in RAC)
  • Shared storage

The overall design appears similar to the following when in active/passive mode, with the management agent installed and two databases monitored:

[Image: failover_agent]

The most important thing you need to recognize is that no matter what the HOSTS are named or what their IP addresses are, the databases and the management agent (along with the listener, which I didn’t include in my diagram!) are all configured with the VIRTUAL CLUSTER NAME.  They do not recognize the host names at all and are only monitoring and running on the ACTIVE host via the VIRTUAL cluster name.

This is achieved using the same method as for any other Windows host.  I recommend a silent installation, even with EM12c Release 4 (12.1.0.4), using a response file to enter the commands and to ensure you are completing all the information as you normally would, but with the cluster information.

1.  The first requirement is your cluster MUST BE STABLE.

If a Windows Server cluster is not stable, don’t proceed.  Correct whatever DNS, Active Directory or shared storage issues exist before proceeding to install Oracle or the agent.  This will save you a lot of headaches.

2.  A Failover Group should exist to add the Management Agent to.

If an application failover group doesn’t already exist in the Microsoft Failover Cluster Administration Manager, then you will need to request a virtual host name and virtual IP address to be used for the Central Access Point (CAP).  This is the group service that will manage the failover for the databases, etc. (so if databases already exist, this should exist and all you will do is add the agent to it…)

Test out all cluster connectivity via name server (nslookup) commands:

nslookup <physical host 1>
nslookup <physical host 2>
nslookup <virtual cluster name>
nslookup <CAP virtual host name>

Install Software

The management agent is going to be installed on the shared storage, which means it will only know about the active host in the cluster (disclaimer alert!).  To perform the installation, we are going to use a new method for Windows Servers: PsExec.

Download Agent Software

Check your software library for the correct version of the agent and download it via EM CLI commands:

emcli get_supported_platforms
emcli get_agentimage -destination=G:/swlib/ -platform="Microsoft Windows x64 (64-bit)"

Exit from EM CLI, unzip the file and prep for installation.

Setting up PsExec on the OMS

Create your folder you wish to install to:  C:\agent12c

Download the PsExec bat file and the utility, following instructions in the DOC ID 1636851.1

Download the agentDeployPsExec.bat and the PsExec utility to a folder on C:\psexec.

Create the psexec.rsp response file with the following information and save it to the C:\psexec directory with the other files:

HOST_NAMES=<target host name>
USER_NAME=<OS user with administrator rights on the target>
PASSWORD=<password for that user>
AGENT_IMAGE_PATH=C:\agent12c_ins
AGENT_BASE_DIR=C:\agent12c
OMS_HOST=<OMS host name>
EM_UPLOAD_PORT=<OMS upload port>
AGENT_REGISTRATION_PASSWORD=<agent registration password>
PSEXEC_DIR=C:\psexec

If you are unsure of the values for your upload port, etc., run the following on the OMS:

emctl status oms -details

It requires the SYSMAN password and will return all information pertinent to your EM environment.

Once you have this all filled in, you are ready to deploy from the OMS to the 1st node of the OS clustered server.

Run the following:

agentDeployPsExec.bat AGENT_BASE_DIR= RESPONSE_FILE=\psexec.rsp
C:\psexec>agentDeployPsExec.bat PROPERTIES_FILE=C:\psexec\psexec.rsp

C:\psexec>echo off
===================================================================
Agent deployment started on host : host1.us.oracle.com

Creating installation base directory ...

Note:  You must have remote admin privileges to the target host to perform this successfully.  If the account in the response file does NOT have privileges granted to create directories, start remote services and such, it will fail.

It will take some time to deploy the agent and once complete, will show the following:

Agent deployed successfully.

Duplicate registry and services to second host

Click on Start, Run and type in regedit.exe on the first host of the failover cluster.

Go to HKEY_LOCAL_MACHINE\SOFTWARE\oracle and right click on the folder and choose Export.  Save the registry file in a secure location.

FTP the registry file to the second host.

Log into the second host and double click on the registry file.  It will ask you if you are sure you want to copy the registry key, click OK.

Create the service

The service for the agent must now be duplicated on the second host.  This is OS level clustering, so no application level service creation should be performed (emctl, emcli, oradim…).  Use sc (Windows Service Control) commands to create the service.

Open up a command prompt in administrator mode and then duplicate the OracleAgent12c1 service:

The syntax for creating the service is as follows:

sc create <service name> binPath= "<path to the agent service executable>" start= auto

By opening up Windows services on the first host, you can go to the OracleAgent12c1 and double click on it to view the values you need for above:

[Image: ora_agent]

Run the sc command on the second host to create the support service for the installation performed on the first host.

Add the Agent to the Failover Group

In the Server Manager, go to the Failover Management and open up the Failover group:

[Image: failover_cluster_1]

Right click on Add a resource and choose Generic Service.  You can then choose the agent service listed (for the example above, Oracleagent12c1Agent) and follow through with the defaults to finish.

The service for the agent is now set to be cluster aware and will fail over if the first host it was installed on becomes unavailable for some reason.

Failover Validation

The installation is now complete on the shared storage of the failover cluster for the first host, and you’ve now copied over the registry settings and duplicated the service, so you are ready to test the failover and ensure the agent is successful.

There are a couple ways to test the failover:

1.  Reboot the first host – this will cause a failover.

2.  Right click on the Failover group and click on More Actions, then Simulate failure of this resource.

You should now see the drives and services, including the agent, fail over and start on the second host.  Verify that all services come online, and log into the EMCC (EM12c console) to verify that uploading occurs to the OMS.  Verify that all the targets you added are showing correctly for the virtual hostname.

Test failover multiple times to both hosts.  If a failure occurs on the second host, check the services, comparing the second host to the first, and/or check the start-up dependencies of your failover group.

Drawbacks

The clear drawback of OS level clustering through the agents is that only one host is being monitored at a time.  As the targets (MSSQL database, applications, etc.) that are being monitored are active on only one host at a time, manual intervention would be required if dual agents were deployed.

A workaround, in the form of a monitoring script that pings the hosts at all times and alerts only if no response is received, provides a second level of host availability monitoring.

I’m also investigating the option of Failsafe with EM12c (not currently supported) and the new Partner Agents to see if there are more opportunities for monitoring OS level clustering.

 

 

 

 



Copyright © DBA Kevlar [Monitoring a Microsoft OS Failover Cluster], All Right Reserved. 2014.

In-memory limitation

I’ve been struggling to find time to have any interaction with the Oracle community for the last couple of months – partly due to workload, partly due to family matters and (okay, I’ll admit it) I really did have a few days’ holiday this month. So making my comeback with a bang – here’s a quick comment about the 12.1.0.2 in-memory feature, and how it didn’t quite live up to my expectation; but it’s also a comment about assumptions, tests, and inventiveness.

One of the 12.1.0.2 manuals tells us that the optimizer can combine the in-memory columnar storage mechanism with the “traditional” row store mechanisms – unfortunately it turned out that this didn’t mean quite what I had hoped; I had expected too much of the first release. Here’s a quick demo of what doesn’t happen, what I wanted to happen, and how I made it happen, starting with a simple definition (note – this is running 12.1.0.2 and the inmemory_size parameter has been set to enable the feature, as sketched below):
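For anyone who hasn’t enabled the option yet, “has been set” means something along these lines before the demo starts (a sketch – the 500M figure is made up, and inmemory_size is a static parameter, so the instance has to be bounced for it to take effect):

alter system set inmemory_size = 500M scope=spfile;
shutdown immediate
startup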


create table t1 nologging
as
select	*
from	all_objects
where	rownum <= 50000
;

alter table t1 inmemory
no inmemory (object_id, object_name)
inmemory memcompress for query low (object_type)
-- all other columns implicitly inmemory default
;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

begin
	dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1');
end;
/

rem
rem	Needs select on v_$im_column_level granted
rem

select
	table_name,
	column_name,
	inmemory_compression
from
	v$im_column_level
where	owner = user
and	table_name = 'T1'
order by
	segment_column_id
;

explain plan for
select
	last_ddl_time, created
from
	t1
where	t1.created > trunc(sysdate)
and	t1.object_type = 'TABLE'
and	t1.subobject_name is not null
;

select * from table(dbms_xplan.display);

All I’ve done at this point is create a table with most of its columns in-memory and a couple excluded from the columnar store. This is modelling a table with a very large number of columns where most queries are targeted at a relatively small subset of the data; I don’t want to have to store EVERY column in-memory in order to get the benefit of the feature, so I’m prepared to trade lower memory usage in general against slower performance for some queries. The query against v$im_column_level shows me which columns are in-memory, and how they are stored. The call to explain plan and dbms_xplan then shows that a query involving only columns that are declared in-memory could take advantage of the feature. Here’s the resulting execution plan:

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    27 |    73   (9)| 00:00:01 |
|*  1 |  TABLE ACCESS INMEMORY FULL| T1   |     1 |    27 |    73   (9)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - inmemory("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))
       filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

Note that the table access full includes the inmemory keyword, and the predicate section shows the predicates that have taken advantage of in-memory columns. The question is – what happens if I add the object_id column (which I’ve declared as no inmemory) to the select list?  Here’s the resulting plan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    32 |  1818   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    32 |  1818   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

There’s simply no sign of an in-memory strategy – it’s just a normal full tablescan (and I didn’t stop with execution plans, of course; I ran other tests with tracing, snapshots of dynamic performance views, etc. to check what was actually happening at run-time).
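One cheap cross-check, for example, is to compare the session’s in-memory scan statistics before and after each query (a sketch – the exact set of “IM scan” statistics you see will depend on your 12.1.0.2 patch level):

select	sn.name, ms.value
from	v$mystat ms, v$statname sn
where	sn.statistic# = ms.statistic#
and	sn.name like 'IM scan%'
and	ms.value != 0
order by
	sn.name;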

In principle there’s no reason why Oracle couldn’t use the in-memory columns that appear in the where clause to determine the rowids of the rows that I need to select and then visit the rows by rowid but (at present) the optimizer doesn’t generate a plan to do that. There’s no reason, though, why we couldn’t try to manipulate the SQL to produce exactly that effect:


explain plan for
select
        /*+ no_eliminate_join(t1b) no_eliminate_join(t1a) */
        t1b.object_id, t1b.last_ddl_time, t1b.created
from
        t1 t1a, t1 t1b
where   t1a.created > trunc(sysdate)
and     t1a.object_type = 'TABLE'
and     t1a.subobject_name is not null
and     t1b.rowid = t1a.rowid
;

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    64 |    74   (9)| 00:00:01 |
|   1 |  NESTED LOOPS               |      |     1 |    64 |    74   (9)| 00:00:01 |
|*  2 |   TABLE ACCESS INMEMORY FULL| T1   |     1 |    31 |    73   (9)| 00:00:01 |
|   3 |   TABLE ACCESS BY USER ROWID| T1   |     1 |    33 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))
       filter("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))

I’ve joined the table to itself by rowid, hinting to stop the optimizer from getting too clever and eliminating the join. In the join I’ve ensured that one reference to the table can be met completely from the in-memory columns, isolating the no inmemory columns to the second reference to the table. It is significant that the in-memory tablescan is vastly lower in cost than the traditional tablescan – and there will be occasions when this difference (combined with the knowledge that the target is a relatively small number of rows) means that this is a very sensible strategy. Note – the hints I’ve used happen to be sufficient to demonstrate the method, but I’d be much more thorough in a production system (possibly using an SQL baseline to fix the execution plan).
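If you did want to pin the plan with a baseline, one option is to load it straight from the cursor cache once the hinted statement has run (a sketch only – the &sql_id and &plan_hash_value substitution variables stand in for the values belonging to the rowid-join query):

declare
	n	pls_integer;
begin
	n := dbms_spm.load_plans_from_cursor_cache(
		sql_id          => '&sql_id',
		plan_hash_value => &plan_hash_value
	);
	dbms_output.put_line('plans loaded: ' || n);
end;
/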

Of course, this method is just another example of the “visit a table twice to improve the efficiency” strategy that I wrote about a long time ago; and it’s this particular variant of the strategy that allows you to think of the in-memory columnar option as an implementation of OLTP bitmap indexes.