Oakies Blog Aggregator

Oracle XMLDB Sample Code

This week Mark Drake, Senior Product Manager for Oracle XMLDB, put up a new page on Oracle OTN with some great code examples and utilities demonstrating Oracle XMLDB functionality. Among other things, it includes the latest code for Mark’s XFILES XMLDB demo application version 5, an Introduction to Oracle XML DB Repository Events, a Basic Introduction to Oracle XML DB 11.2.0.3.0, …


Site Update

Updated this site to the latest “stuff”, including a new stats plugin for WordPress called kstats, just to see if this will work for me. I have already noticed a small, strange issue that manifests itself in the main title of the page. It shows incorrectly only in Internet Explorer, but all the other browsers …


VirtualBox 4.1.6 Released…

VirtualBox 4.1.6 has been released. You can download it from the usual place and see the updates in the changelog.

Happy upgrading.

Cheers

Tim…




An interesting problem with ext4 on Oracle Linux 5.5

I have run into an interesting problem with my Red Hat 5.5 installation. Naively, I assumed that since ext4 has been around for a long time, it would be stable. For a test I performed for a friend, I created my database files on a file system formatted with ext4 and mounted it the same way I would have mounted an ext3 file system:

$ mount | grep ext4
/dev/mapper/mpath43p1 on /u02/oradata type ext4 (rw)

Now when I tried to create a data file of a certain size within a tablespace, I got block corruption, which I found very interesting. My first thought was that the file system must be corrupt, so I shut down all processes accessing /u02/oradata and gave the file system a thorough check.

# umount /u02/oradata
#
# fsck.ext4 -cfv /dev/mapper/mpath43p1
e4fsck 1.41.9 (22-Aug-2009)
Checking for bad blocks (read-only test): done
/dev/mapper/mpath43p1: Updating bad block inode.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/mapper/mpath43p1: ***** FILE SYSTEM WAS MODIFIED *****

42 inodes used (0.00%)
14 non-contiguous files (33.3%)
0 non-contiguous directories (0.0%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 18/14
3679655 blocks used (61.82%)
0 bad blocks
5 large files

26 regular files
7 directories
0 character device files
0 block device files
0 fifos
0 links
0 symbolic links (0 fast symbolic links)
0 sockets
--------
33 files
#

Have a look at the command line options and make sure you understand them before letting fsck loose on your file systems! By the way, if Linux tells you the file system cannot be unmounted, use “fuser -m /u02/oradata” to list all PIDs accessing the mount point.
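As a hedged sketch of that workflow (none of this is from the original session; check the fuser man page before using -k, which kills the listed processes outright – shutting the database down cleanly first is preferable):

# fuser -m /u02/oradata
# fuser -km /u02/oradata
# umount /u02/oradata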

As you can see from the output, the file system was fine, which struck me as odd. I mounted it again and started my database; surely that had only been a glitch.

After repeating my test, I got the same block corruption. I now suspected that ext4 might be the problem. Instead of creating the tablespace with one 8 GB data file, I used four 2 GB data files and the problem went away.
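For illustration, the workaround looked something along these lines (the tablespace and file names here are hypothetical, not taken from the original test):

SQL> create tablespace example_ts
  2  datafile '/u02/oradata/orcl/example01.dbf' size 2G,
  3           '/u02/oradata/orcl/example02.dbf' size 2G,
  4           '/u02/oradata/orcl/example03.dbf' size 2G,
  5           '/u02/oradata/orcl/example04.dbf' size 2G;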

After some experimentation I found out that the magical boundary is somewhere above 3G for a single data file. Here is the proof:

SQL> create tablespace WillIBeUnusable datafile '/u02/oradata/orcl/corrupt.dbf' size 3G;

Tablespace created.

SQL> !dbv file=/u02/oradata/orcl/corrupt.dbf

DBVERIFY: Release 11.2.0.2.0 - Production on Thu Nov 3 10:32:02 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata/orcl/corrupt.dbf

DBVERIFY - Verification complete

Total Pages Examined         : 393216
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 127
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 393089
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 3371559 (0.3371559)
$

However, a 5G data file reports these corrupt blocks:

$ dbv file=/oradata/orcl/corrupt.dbf

DBVERIFY: Release 11.2.0.2.0 - Production on Thu Nov 3 10:33:49 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /oradata/orcl/corrupt.dbf

...

Page 655357 is marked corrupt
Corrupt block relative dba: 0x0289fffd (file 10, block 655357)
Completely zero block found during dbv:

Page 655358 is marked corrupt
Corrupt block relative dba: 0x0289fffe (file 10, block 655358)
Completely zero block found during dbv:

Page 655359 is marked corrupt
Corrupt block relative dba: 0x0289ffff (file 10, block 655359)
Completely zero block found during dbv:

Page 655360 is marked corrupt
Corrupt block relative dba: 0x028a0000 (file 10, block 655360)
Completely zero block found during dbv:

DBVERIFY - Verification complete

Total Pages Examined         : 655360
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 127
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 524160
Total Pages Marked Corrupt   : 131073
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 3371948 (0.3371948)
$

That’s not good. A search on supporthtml.oracle.com revealed that ext4 is supported only in 5.6 and later. At that point I stopped bothering with it, since even if I worked out where the problem was, I would still be running an unsupported configuration. The main reason I prefer not to be in “unsupported” terrain is that I don’t want to give Oracle Support an easy way to dismiss a service request.

The Alternative: XFS

However, as Greg Rahn, Kevin Closson and others will tell you, you shouldn’t use ext4 for databases anyway: use XFS! XFS has been substantially improved in Oracle/Red Hat Linux 6 and should be your file system of choice.
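As a rough sketch of what that switch involves, reusing the device and mount point from the ext4 example above (check the mkfs.xfs and mount options appropriate for your storage before using this):

# mkfs.xfs /dev/mapper/mpath43p1
# mount -t xfs /dev/mapper/mpath43p1 /u02/oradata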

Almost forgot the hard facts:

  • Oracle Linux 5.5 64bit
  • Oracle 11.2.0.1 single instance

Simple SQL – Finding the Next Operation

November 3, 2011. An interesting request came in from an ERP mailing list: how would you write a SQL statement that indicates the next operation in a manufacturing process? Sounds like an easy requirement. Let’s take a look at a graphical view of one example (the graphical view is created using a program that I wrote [...]
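The excerpt above is truncated, so the original author’s solution is not shown here. As a hedged sketch of one common approach for a simple, linear routing (the table and column names below are invented for illustration), Oracle’s LEAD analytic function returns the next operation in sequence for each row:

SQL> select work_order,
  2         operation_seq,
  3         lead(operation_seq) over
  4           (partition by work_order order by operation_seq) as next_operation
  5    from work_order_operations;

A real ERP routing can branch, as the graphical view in the post suggests, in which case a simple LEAD over a sequence number would not be sufficient.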

Movember

So it's that time of year again, when perfectly reasonable people decide to deliberately make fools of themselves for a good month and a good cause ....

It's Movember, the month formerly known as November, now dedicated to growing moustaches and raising awareness and funds for men's health, specifically prostate and testicular cancer. I'm donating my top lip to the cause for 30 days in an effort to help change the face of men's health. My Mo will spark conversations, and no doubt generate some laughs, all in the name of raising vital awareness and funds for cancers affecting men.

Why am I so passionate about men's health?
* 1 in 9 men will be diagnosed with prostate cancer in their lifetime
* This year 37,000 new cases of the disease will be diagnosed
* 1 in 2 men will be diagnosed with cancer in their lifetime
* Men are 26% less likely than women to go to the doctor (Not sure about this statistic - thanks Graham)
* Men never go to the bl**dy doctor! (That's a bit more like it)
* Too many of my friends have already been affected by some of these issues.

I'm asking you to support my Movember campaign by making a donation, either by:
* Donating online at: http://mobro.co/dougburns
* Going old school: write a cheque payable to 'Movember', reference my name and Registration Number 763450, and send it to: Movember Europe, PO Box 68600, London, EC1P 1EF

If you'd like to find out more about the type of work you'd be helping to fund by supporting Movember, take a look at the Programmes We Fund section on the Movember website: http://uk.movember.com/about

Thank you in advance for supporting my efforts, both last year's and hopefully this year's, to change the face of men's health. On a personal note, I understand it will be difficult to match last year's terrific final total when so many friends in the community are signing up (such as Alex Gorbachev, James Morle, Dan Norris and Jacco Landlust, to name a few), but I'll give it a try anyway and, whatever happens, the money will all end up in the right place, so sponsor away!

Oh, but *please* don't ask me to go through the pantomime misery again! ;-)

Doug Burns

OTN APAC Tour: NZOUG Day 1 & 2

The evening before the NZOUG conference was a bit chaotic. There was still no resolution to the Qantas fiasco and I was starting to believe I would have to cancel my sessions in Perth and try to fly home from Auckland. I tried to switch my flights, but everything was sold out.

By the next morning the Qantas strike seemed to be over, but there were reports of delays and disruption, so I was still not sure if I would make it to Perth.

NZOUG Day 1:

I was just about to start my first session (Clonedb) when there was a fire alarm. Fortunately it was resolved pretty quickly and I was able to get things back on track. From there it was straight on to my second session (Edition-Based Redefinition), with no fire alarms this time. Both my sessions seemed to go down well. I got some good questions in each session, which is always cool, and some more later in the day and at the evening event.

I spent a little time chatting to some of the guys on the Quest stand and the guys from the DBVisit stand.

My first session of the day as an attendee was Graham Wood’s Exadata session. I saw this session for the first time a few years ago when it was known as the “Terabyte Hour”. With the hardware refreshes and a few software tweaks that have happened since then it now only takes about 18 minutes to complete the demo, so the name has changed. :) My comment to Graham at the end of the session was, “Much as I hate to admit it, it’s really impressive.”

That pretty much took me to the end of the presentations on the first day.

Chris Muir told me to check my emails as my flights had been sorted out. Sure enough, when I checked Lillian Buziak from the OTN team had contacted Oracle Travel and fixed everything for me. She is a total miracle worker. If I was younger and more attractive (and she wasn’t already married) she would be mine, oh yes, she would be mine…

In the evening we all went to the conference dinner. The theme was “Murder Mystery”, which involved a few poor souls being selected to be made fools of in front of the rest of the audience. Unfortunately, I was one of the fools in question, along with Chris Muir, Debra Lilley, Bambi and a couple of other guys. The compere and the ghost of the victim (my wife, murdered on her wedding day) led us through various “role playing scenarios” to determine who was the murderer. The final decision came down to me (the husband) and the chef, with the chef being voted the winner/loser/murderer. It was all very confusing, fun and embarrassing, all rolled into one. :)

The whole event seemed to go down really well with the audience, who had plenty of comments (and photos). I have a feeling this is going to haunt me (no pun intended) for a long time.

NZOUG Day 2:

This was cut very short for me. My new Perth flight was arranged for 14:10, so I had to leave the conference at 11:00. I still had time to chat to a few people, mostly about the previous evening’s events, and to check out Ronald Bradford’s session on the top 9 issues people have with MySQL databases. I’ve been a casual user of MySQL for years, but never really spent much time looking at it in any depth. I learned quite a bit from this session. Maybe I’ll spend a little more time playing with it in future.

Assuming the rest of the conference carried on the way it started, I would say it was a big success.

Perth:

The flight to Perth was pretty straightforward and I got to the apartment with no dramas. I took a walk over to the conference venue, which is significantly further than I thought. It took me about 60 minutes to walk it at a pace, with no bag, in the (relatively speaking) cool evening air. I’m not sure what it will be like in the summer sun.

At 06:00 this morning I went out for a run along the river, then made myself look like a freak by doing sprints on the way back. Nothing like the sight of a fat sweaty bloke panting like a dog to turn heads. The locals were out in force, walking, running and doing boot camps on the banks of the river. Even at that time it was very sunny. I think it would be damn near lethal to try it at midday. The flies and mossies were out in force too. If nothing else, the swatting and endless tics you develop when they fly round your face and ears help you burn more calories. :)

Cheers

Tim…




Very Cool Yet Very Dense! Low Wattage HP Technology in Project Moonshot. By The Way, Large Node Counts Require A Symmetrical Architecture.

HP has partnered with Calxeda to produce early samples of a 4U chassis containing 288 system-on-chip (SoC) servers.

I care about this sort of system offering because I espouse the symmetrical Massively Parallel Processing (MPP) computing paradigm.

Here’s a nice quote from The Register:

[…] a half rack of Redstone machines and their external switches implementing 1,600 server nodes has 41 cables, burns 9.9 kilowatts, and costs $1.2m.

A more traditional x86-based cluster doing the same amount of work would only require 400 two-socket Xeon servers, but it would take up 10 racks of space, have 1,600 cables, burn 91 kilowatts, and cost $3.3m.

Shared Nothing / Shared Everything?
Did you notice I mentioned symmetrical MPP in this post? EMC Greenplum Database is a symmetrical MPP software product. This means all code can and does run on all available CPU units. There is no arbitrary cut-off point where some CPUs must run certain code while other CPUs (in dedicated servers) may only run other code. That would be an asymmetrical MPP, and it is impossible to handle data flow in a balanced manner if some CPUs must run some code and others cannot. Please allow me to quote myself:

The scalability of an MPP is solely related to whether it is symmetrical or asymmetrical.

So what about shared-disk? The scalability of an MPP has nothing to do with whether the disks are accessed via shared-disk or dedicated (non-shared) plumbing. Oracle Real Application Clusters scales DW/BI/Analytics workloads fantastically and it is a shared-disk architecture. However, coupling Real Application Clusters with Exadata Smart Scan is where the asymmetrical attributes are introduced.

The shared-disk versus shared-nothing argument is old, tired and irrelevant. Interestingly, I part ways with my colleagues here at EMC Greenplum on that matter. If you see literature that props up the shared-nothing element of Greenplum, please bear in mind that it is my personal assertion that shared versus shared-nothing is not a scalability topic for DW/BI/Analytics workloads. To put it another way, it is my personal campaign, and I’ll be blogging soon on that matter. Oh, I forgot to mention that I’m right on the matter. :)

Little Things Doth Crabby Make?
I don’t want to draw attention to the lack of care for electrostatic discharge (ESD) in the handling of components in the following video, because I’m too excited at having finally seen things I’ve been anxiously anticipating for quite some time. So, no, I won’t make this an installment in the Little Things Doth Crabby Make series. HP most likely uses ESD wrist straps when they are not producing a video. :)

The Disclaimer
Please take a gander at the upper right hand corner of this page. You’ll see the disclaimer that spells out the fact that these are my personal words and thoughts. I am not blogging about any EMC business in this post. I’m simply blogging about low-wattage general-purpose servers—something I’m very interested in.




IOT Part 6 – Inserts and Updates Slowed Down (part A)

<..IOT1 – the basics
<….IOT2 – Examples and proofs
<……IOT3 – Significantly reducing IO
<……..IOT4 – Boosting Buffer Cache efficiency
<……….IOT5 – Primary Key Drawback
…………>IOT6(B) – OLTP Inserts

A negative impact of using Index Organized Tables is that inserts are, and updates can be, significantly slowed down. This post covers the former and the reasons why – and the need to always run tests on a suitable system. (I’m ignoring deletes for now – many systems never actually delete data, and I plan to cover IOTs and deletes later.)

Using an IOT can slow down inserts by something like 100% to 1000%. If the insert of data into the table is only part of a load process, the resulting overall impact on the load might be much smaller, perhaps 25%. I’m going to highlight a few important factors that contribute to this wide spread of impact below.

If you think about it for a moment, you can appreciate there is a performance impact on data creation and modification with IOTs. When you create a new record in a normal table it gets inserted at the end of the table (or perhaps in a block marked as having space). There is no juggling of other data.
With an IOT, the correct point in the index has to be found and the row has to be inserted at the right point. This takes more “work”. The inserting of the new record may also lead to an index block being split and the extra work this entails. Similar extra work has to be carried out if you make updates to data that cause the record to move within the IOT.
Remember, though, that an IOT is almost certainly replacing an index on the heap table which, unless you are removing indexes before loading data and recreating them afterwards, would have to be maintained when inserting into the heap table. So some of the “overhead” of the IOT would still be incurred by the heap table in maintaining the Primary Key index. Comparing inserts or updates between a heap table with no indexes and an IOT is not a fair test.
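For reference, the structural difference between the two is just the ORGANIZATION INDEX clause. A minimal sketch (these names are mine for illustration; the author’s actual test scripts are in IOT-5):

SQL> create table heap_demo
  2  ( id   number not null
  3  , pad  varchar2(100)
  4  , constraint heap_demo_pk primary key (id) );

SQL> create table iot_demo
  2  ( id   number not null
  3  , pad  varchar2(100)
  4  , constraint iot_demo_pk primary key (id) )
  5  organization index;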

For most database applications data is generally written once, modified occasionally and read many times – so the impact an IOT has on insert/update is often acceptable. However, to make that judgement call you need to know:

  • what the update activity is on the data you are thinking of putting into an IOT
  • the magnitude of the impact on insert and update for your system
  • the ratio of read to write.

There is probably little point putting data into an IOT if you constantly update the primary key values (NB see IOT-5 as to why an IOT’s PK columns might not be part of a true Primary Key), or populate previously empty columns, or hardly ever read the data.

There is also no point in using an IOT if you cannot load the data fast enough to support the business need. I regularly encounter situations where people have tested the response of a system once populated but failed to test the performance of populating it.

Now to get down to the details. If you remember the previous posts in this thread (I know, it has been a while), you will remember that I create three “tables” with the same columns. One is a normal heap table, one is an Index Organized Table, and one is a partitioned Index Organized Table, partitioned into four monthly partitions. All tables have two indexes on them: the Primary Key index (which is the table itself in the case of the IOTs) and another, roughly similar, pre-created index. I then populate the tables with one million records each.

These are the times, in seconds, to create 1 million records in the HEAP and IOT tables:

                  Time in Seconds
Object type         Run_Normal
------------------  ----------
Normal Heap table        171.9  
IOT table               1483.8

This is the average of three runs, to ensure the times were consistent. I am using Oracle V11.1 on a machine with an Intel Core 2 Duo T7500 2.2GHz, 2GB memory and a standard 250GB 5000RPM disk. The SGA is 256MB and Oracle has allocated around 100MB-120MB to the buffer cache.

We can see that inserting the 1 million rows into the IOT takes 860% of the time it does with a heap table. That is a significant impact on speed. We now know how large the impact of using an IOT is on inserts, and presumably it’s all to do with juggling the index blocks. Or do we?

This proof-of-concept (POC) on my laptop {which you can also run on your own machine at home} did not match a proof-of-concept I did for a client. That was done on V10.2.0.3 on AIX, on a machine with 2 dual-core CPUs with hyper-threading (so 8 virtual cores), a 2GB SGA and approx 1.5GB buffer cache, with enterprise-level storage somewhere in the bowels of the server room. The results on that machine for creating a similar number of records were:

                  Time in Seconds
Object type         Run_Normal
------------------  ----------
Normal Heap table        152.0  
IOT table                205.9

In this case the IOT inserts required 135% of the time of the heap table. This was consistent with other tests I did; with a more complex indexing strategy in place, the IOT overhead was around 25-35%. I can’t go into much more detail as the information belongs to the client, but the data creation was more complex, so the actual inserts were only part of the process – this is how it normally is in real life. Even so, the difference in overhead between my local-machine POC and the client hardware POC is significant, which highlights the impact your platform can have on your testing.

So where does that leave us? What is the true usual overhead? Below are my more full results from the laptop POC.

                        Time in Seconds
Object type         Run_Normal    Run_quiet    Run_wrong_p
------------------  ----------    ---------    -----------
Normal Heap table        171.9        81.83         188.27  
IOT table               1483.8      1055.35        1442.82
Partitioned IOT          341.1       267.83         841.22 

Note that with the partitioned IOT the creation took 341 seconds; the performance ratio to a heap table is only 198%, much better than for the normal IOT. Hopefully you are wondering why!

I’m running this test on a Windows laptop and other things are going on. The timings for Run_quiet are where I took steps to shut down all non-essential services and applications. This yielded a significant improvement for all three object types, but the biggest impact was on the already-fastest heap table.

The final set of figures is for a “mistake”. I created the partitions wrongly, such that half the data went into one partition, almost all the rest into another, and a tiny fraction into a third, rather than being spread evenly over four partitions. You can see that the heap and normal IOT times are very similar to the Run_Normal results (as you would expect, as these tests are the same), but for the partitioned IOT the time taken is half way towards the IOT figure.

We need to dig a little further into what is going on to see where the effort is being spent, and it turns out to be very interesting. During my proof-of-concept on the laptop I grabbed the information from v$sesstat for the session before and after each object creation, so I could get the figures just for the loads. I then compared the stats between each object population; some of them are shown below {IOT_P means Partitioned IOT}.

STAT_NAME                                  Heap           IOT        IOT_P
------------------------------------ ---------- -------------  -----------
CPU used by this session                  5,716         7,222        6,241
DB time                                  17,311       148,866       34,120
Heap Segment Array Inserts               25,538            10           10

branch node splits                           25            76           65
leaf node 90-10 splits                      752         1,463        1,466
leaf node splits                          8,127        24,870       28,841

consistent gets                          57,655       129,717      150,835
cleanout - number of ktugct calls        32,437        75,201       88,701
enqueue requests                         10,936        28,550       33,265

file io wait time                     4,652,146 1,395,970,993  225,511,491
session logical reads                 6,065,365     6,422,071    6,430,281
physical read IO requests                   123        81,458        3,068
physical read bytes                   2,097,152   668,491,776   25,133,056
user I/O wait time                          454       139,585       22,253
hot buffers moved to head of LRU         13,077       198,214       48,915
free buffer requested                    64,887       179,653      117,316

The first section shows that all three used similar amounts of CPU, the IOT and partitioned IOT being a little higher. Much of the CPU consumed was probably spent generating the fake data. The DB time pretty much matches the elapsed time, as the DB was doing little else.
It is interesting to see that the heap insert uses array inserts, which of course are not available to the IOT and IOT_P as the data has to be inserted in order. {I think Oracle inserts the data into the heap table as an array and then updates the indexes for all the entries in the array – and I am only getting this array processing because I create the data as an array from an “insert into … select” type load. But don’t hold me to any of that.}

In all three cases there are two indexes being maintained, but in the case of the IOT and IOT_P the primary key index holds the whole row. This means there is more information per key, fewer keys per block and thus more blocks to hold the same data {and more branch blocks to reference them all}. So more block splits will be needed. The second section shows this increase in branch node and leaf block splits: double the branch block splits and triple the leaf block splits. This is probably the extra work you would expect for an IOT. Why are there more leaf block splits for the partitioned IOT? The same volume of data ends up taking more blocks in the partitioned IOT – 200MB for the IOT_P in four partitions of 40-60MB, as opposed to a single 170MB for the IOT. The larger overall size of the partitioned IOT is just due to a small overhead incurred by using partitions, plus a touch of random fluctuation.

So for the IOT and IOT_P there is about three times the index-specific work being done, and a similar increase in related statistics such as enqueues – though not exactly three times, as it is not just index processing that contributes to these other statistics. However, the elapsed time is much more than three times as much. Also, the IOT_P is doing more index work than the IOT, but its elapsed time is less. Why?

The fourth section shows why. Look at the file io wait times. This is the total time spent waiting on IO {in millionths of a second} and it is significantly elevated for the IOT, and to a lesser degree for the IOT_P. Physical IO is generally responsible for the vast majority of time in any computer system where it has not been completely avoided.
Session logical reads are only slightly elevated, almost negligibly so, but the number of physical reads to support them increases from 123 for the heap table insert to 81,458 for the IOT and 3,068 for the IOT_P. A clue as to why comes from the hot buffers moved to head of LRU and free buffer requested statistics: there is a lot more activity in moving blocks around in the buffer cache for the IOT and IOT_P.

Basically, for the IOT, all the blocks in the primary key segment are constantly being updated but eventually they won’t all fit in the block buffer cache – remember I said the IOT is eventually 170MB and the buffer cache on my laptop is about 100MB – so they are flushed down to disk and then have to be read back when altered again. This is less of a problem for the IOT_P as only one partition is being worked on at a time (the IOT_P is partitioned on date and the data is created day by day) and so more of it (pretty much all) will stay in memory between alterations. The largest partition only grows to 60MB and so can be worked on in memory.
For the heap, the table is simply appended to and only the indexes have to be constantly updated and they are small enough to stay in the block buffer cache as they are worked on.

This is why, when I got my partitioning “wrong”, the load took so much longer. More physical IO was needed, as the larger partition would not fit into the cache as it was worked on – a quick check shows that logical reads, and in fact almost all statistics, were very similar, but 26,000 IO requests were made (compared to 81,458 for the IOT and 3,068 for the correct IOT_P).

Of course, I set my SGA size, and thus the buffer cache, to highlight the issue on my laptop, and I have to say even I was surprised by the magnitude of the impact. On the enterprise-level system I did my client’s proof of concept on, the impact on inserts was less: the buffer cache could hold the whole working set, I suspect the SAN had a considerable cache on it, there was ample CPU resource to cope with the added latching effort, and the time taken to actually create the data being inserted was a significant part of the workload, reducing the overall impact of the slowness caused by the IOT.

{Update: in this little update I increase my block buffer cache and show that physical IO plummets and the IOT insert performance increases dramatically.}
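As a hedged illustration of that sort of change (the value here is made up, and if you use automatic SGA or memory management you would resize sga_target or memory_target instead), increasing the buffer cache might look like:

SQL> alter system set db_cache_size = 512M scope=spfile;
SQL> shutdown immediate
SQL> startup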

This demonstrates that a POC, especially one for what will become a real system, has to use realistic volumes on realistic hardware.
For my client’s POC, I still had to bear in mind the eventual size of the live working set and the probable size of the live block buffer cache, and make some educated guesses.

It also explains why my “Run_quiet” timings showed a greater benefit for the heap table than for the IOT and IOT_P. A Windows machine has lots of pretty pointless things running that take up CPU and a bit of memory, but not really much IO. I reduced the CPU load, which benefits activity that is not IO-bound, so it had more impact on the heap table load. Much of the time for the IOT and IOT_P is spent hammering the disk, and that just takes time.

So, in summary:

  • Using an IOT increases the index block splitting and, in turn, enqueues and general workload. The increase is in proportion to the size of the IOT compared to the size of the replaced PK index.
  • The performance degradation across the whole load process may well be less than 50%, but the only way to really find out is to test.
  • You may lose the array-processing benefit a heap table load enjoys if you do the load via an intermediate table.
  • With an IOT you may run into issues with physical IO if the segment (or part of the segment) you are loading into cannot fit into the buffer cache. (This may be an important consideration for partitioning or ordering of the data loaded.)
  • If you do a proof of concept, do it on a system that is as similar to the real one as you can.
  • Just seeing the elapsed time difference between tests is sometimes not enough; you need to find out where that extra time is being spent.

I’ve thrown an awful lot at you in this one post, so I think I will stop there. I’ve not added the scripts to create the test tables here; they are in IOT-5 {lacking only the grabbing of the v$sesstat information}.
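Since those scripts lack the v$sesstat capture, here is a hedged sketch of one way to snapshot the statistics around a load (the snapshot table name is invented; this is not the author’s actual script):

SQL> create table sesstat_snap as
  2  select sn.name, ss.value
  3    from v$sesstat ss, v$statname sn
  4   where ss.statistic# = sn.statistic#
  5     and ss.sid = sys_context('userenv', 'sid');

-- run the object creation and data load, then report the deltas:

SQL> select sn.name, ss.value - snap.value as delta
  2    from v$sesstat ss, v$statname sn, sesstat_snap snap
  3   where ss.statistic# = sn.statistic#
  4     and ss.sid = sys_context('userenv', 'sid')
  5     and snap.name = sn.name
  6     and ss.value > snap.value
  7   order by delta desc;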