Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

TPC-Hをチューニングする

TPC-Hベンチマークの続き
現在のディスク転送量は420MB/s強。

1セッションでこれ以上望めないのであれば、効率的な読み込み量を追求する。

オブジェクトを特定する

上の時系列スパイクチャートから、レスポンス要因はUser I/O

下のオブジェクトごとのDBアクティビティから、
LINEITEMのdirect path readが全体の78%の負荷が分かる。

だから、LINEITEMのREAD量を減らす、パーティショニング、コンプレス機能が効果的なことが分かる。

念のためにスパイクが飛び出た時のNo1 SQLのPlanを見てみると:
なるほど、LINEITEMへのアクセス負荷がポイントだな。。。

最後に、
Embarcadero® DB Optimizer™ XE を初めて使ってみたのだけど、
ものすごく簡単でした。

Oracle 11g R1/R2 Real Application Clusters Handbook… Review in process

A few years back I had the pleasure of meeting Ben Prusinski at Oracle Open world.  Ben has published several books on Oracle Internals and Debugging.  In his most recent project he teams with Guenadi Jilevski and Syed Jaffer Hussain to write
“Oracle 11g R1/R2 Real Application Clusters Handbook”.  I have only just cracked the cover but already know this will be a good resource for the beginner and seasoned professional.  If you want to look for yourself, you can preview or order the book through PACKT publishing.

Filed under: Oracle Tagged: book, Oracle, review

Metric Collection Error

I’m in the later stages of implementing Oracle Enterprise Manager 11g for a customer. Right now there are rather too many metric collection errors for either myself or the customer to be truly happy with. There is remarkably little other than this post by Oracle’s Werner.de.Guyter on how to deal with these. Unfortunately whilst Werner’s [...]

APEX 4.0 Websheets are Cool!

Wow! I just finished presenting a webinar on APEX 4.0 New Features and Websheets. Websheets allow non-technical people to copy-and-paste spreadsheets into APEX and then manipulate them using sophisticated database features; so easy… You can see the webinar slides at http://www.parklane.com.au or http://www.kingtraining.com

Oracle RAC on VirtualBox…

With the recent news that the latest version of VirtualBox now supports shared disks, I thought I better give it a go and see if I could do a RAC installation on it. The good news is it worked as expected. You can see a quick run through here:

This is pretty good news as that was the last feature that tied me to VMware Server. I’ve now moved pretty much everything I do at home on to VirtualBox and it’s working fine.

It’s worth taking a little time looking at the VBoxManage command line. Some of the operations, like creating the shared disks, have to be done from the command line at the moment. It’s also handy for running VMs in headless mode if you don’t want the GUI screen visible all the time.

Cheers

Tim…

Unseen Academicals…

Unseen Academicals is the 37th book in the Discworld series. When I first started reading the series I found Terry Pratchett‘s writing style a little strange. Having had a long break from reading his stuff, switching back was a little difficult at first. Fortunately the characters are brilliant, so you get drawn back in fairy quickly. The first 50 pages were a little tricky, but after that it was just like home again.

The wizards of the Unseen University are forced to make a decision between playing football or eating less, so obviously they pick football. There are a few new characters introduced, including Nutt, who is awesome.

If you like the others you will love this.

Cheers

Tim…

PS. Football refers to real football, not that American stuff. :)

Oracle Exadata and Netezza TwinFin Compared – An Engineer’s Analysis

There seems to be little debate that Oracle’s launch of the Oracle Exadata Storage Server and the Sun Oracle Database Machine has created buzz in the database marketplace. Apparently there is so much buzz and excitement around these products that two competing vendors, Teradata and Netezza, have both authored publications that contain a significant amount of discussion about the Oracle Database with Real Application Clusters (RAC) and Oracle Exadata. Both of these vendor papers are well structured but make no mistake, these are marketing publications written with the intent to be critical of Exadata and discuss how their product is potentially better. Hence, both of these papers are obviously biased to support their purpose. My intent with this blog post is simply to discuss some of the claims, analyze them for factual accuracy, and briefly comment on them. After all, Netezza clearly states in their publication: The information shared in this paper is made available in the spirit of openness. Any inaccuracies result from our mistakes, not an intent to mislead. In the interest of full disclosure, my employer is Oracle Corporation, however, this is a personal blog and what I write here are my own ideas and words (see [...]

Joins – HJ

In the second note on my thesis that “all joins are nested loop joins with different startup costs” I want to look at hash joins, and I’ll start by going back to the execution plan I posted on “Joins – NLJ”. --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| [...]

Oracle Exadata - Storage Indexes

Wow! - I was stunned a few days ago by Exadata’s Storage Indexes. I was doing a little testing to see what could be offloaded and what couldn’t (more on that later). I have a 384 million row table I was using on our Exadata Quarter Rack test system. A single threaded full scan with no where clause on the table takes about 24 seconds (ho hum - it’s amazing how quickly we become numbed to the outstanding performance ). So imagine my surprise when I decided to check and see how many nulls I had in a column and the result came back in .07 seconds. Wow! I thought it was a bug! Turns out it was the Storage Indexes. Alright already, I’ll show you some output from the system (by the way, as usual I used a couple of scripts: fsx.sql and mystats.sql):

SYS@LABRAT1> select count(*) from kso.skew3;
 
  COUNT(*)
----------
 384000048
 
1 row selected.
 
Elapsed: 00:00:24.06
SYS@LABRAT1> /
 
  COUNT(*)
----------
 384000048
 
1 row selected.
 
Elapsed: 00:00:23.94
 
SYS@LABRAT1> set timing off
SYS@LABRAT1> @mystats
Enter value for name: %storage%
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                                        0
 
1 row selected.
 
SYS@LABRAT1> set timing on
SYS@LABRAT1> select count(*) from kso.skew3 where col1 is null;
 
  COUNT(*)
----------
        12
 
1 row selected.
 
Elapsed: 00:00:00.07
SYS@LABRAT1> set timing off
SYS@LABRAT1> @fsx
Enter value for sql_text: select count(*) from kso.skew3 where col1 is null
Enter value for sql_id: 
Enter value for inst_id: 
 
 
 INST SQL_ID         CHILD  PLAN_HASH      EXECS     AVG_ETIME      AVG_LIO    AVG_PIO AVG_PX OFFLOADABLE IO_SAVED_% SQL_TEXT
----- ------------- ------ ---------- ---------- ------------- ------------ ---------- ------ ----------- ---------- ----------------------------------------
    1 0u1q4b7puqz6g      0 2684249835          5           .09    1,956,226  1,956,219      0 Yes             100.00 select count(*) from kso.skew3 where col
 
1 row selected.
 
SYS@LABRAT1> @mystats
Enter value for name: %storage%
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                              16012763136
 
1 row selected.

So apparently Storage Indexes are NULL aware. Very cool! This may have repercussions regarding design and implementation decisions. There are systems that don’t use NULLs in order to insure that they can access records via B-Tree indexes (which as you’re aware do not store NULLs). SAP for example uses a single space character instead of NULLs.

The Sun / Oracle Exadata V2 has a number of features that get a lot of attention because of the dramatic performance gains they can provide. Smart Scan / Offloading, Hybrid Columnar Compression (HCC), Flash Cache, Infiniband Fabric, etc… Exadata Storage Indexes are not really high on the list of things that the sales guys tout, but they can provide a very significant performance benefit. One of the reasons they aren’t talked about much is that there is nothing really that you can do to affect how they work. There is no tuning them in other words. They just work. However, understanding how they work can certainly help you make better decisions about how to build systems and how to write SQL statements. I had an online discussion with someone a few weeks ago about Storage Indexes and the comment was made that we didn’t really need to understand them, because there was nothing we could do about them anyway. My thought was that we couldn’t do anything about the fact that normal B-Tree indexes don’t store NULLs either, but it is pretty important information to know when you are designing a system or trying to access data using a B-Tree index.

I found a very good description of Storage Indexes in an Oracle White Paper:


Oracle Sun Database Machine Application Best Practices for Data Warehousing [ID 1094934.1]

Exdata Storage Index

In Oracle Database 11gR2 a new feature called the Exadata Storage Index has been introduced. The storage index contains a summary of the data distribution on the disk and provides an additional method to eliminate unnecessary disk IO.

Each disk in the Exadata storage cell is divided into equal sized pieces called storage regions (default 1MB). There is an index entry for every storage regions (1MB of data stored on disk). Each entry contains the minimum and maximum value for columns seen in ‘where’ clause predicates. Information for up to 8 columns can be stored. The index is then used to eliminate disk IO by identifying which storage regions don’t match the ‘where’ clause of a query. For example consider the following query

The storage index is stored in the memory on each of the Exadata storage cells and is created and maintained transparently. However, if a storage cell is shutdown or rebooted the storage index will be lost from memory and will be recreated on subsequent accesses to the data after the cell has been brought back online.

The storage index can be used as an additional partitioning strategy if the data to be loaded is sorted on the additional partitioning column. For example, in the TPC-H benchmark, the Lineitem table is range partitioned on L_PARTKEY but several of the queries have where clause predicates on the SHIPDATE column. By loading the data in the Lineitem table sorted by SHIPDATE, the storage index can be used to eliminate data based on the SHIPDATE.

In order for the data to be fully sorted during the load you MUST load using an INSERT /*+ APPEND */ statement with an order by clause on the column you want indexed. Before doing the load you should examine the execution plan to check that the sort and the load are being done by the same set of parallel server processes and that no data redistribution takes place between the sort and the load steps in the plan.

The same document also contained this nifty graphic. You get the idea though right? Keep a min/max value for each 1M storage region.

The idea of using the Storage Index as an additional partitioning strategy is very interesting. But as you can see from my example, Storage Indexes can also provide a huge benefit for highly skewed data. This is an area that Oracle has traditionally struggled with. The typical scenario involves using an index for highly selective values and using a table scan for non-selective values. This choice is complicated by Oracle’s default bind variable peeking behavior. With Exadata, we may find that we are better off just ignoring the issue altogether. The full table scan will pretty much always provide the best performance possible, without the possibility of bind variable peeking or the overhead of a normal B-Tree index (that is to say, we probably won’t create the index at all). The storage index will provide very quick access to the highly selective values while the normal Exadata Offloading will handle the non-selective values better than we ever thought possible.

By the way, Netezza has a very similar “feature” called zone maps, although they apparently use a 3M storage unit. The larger unit means that it would be much more sensitive to how the data is sorted on disk, at least for the “virtual” partition elimination behavior.

One last thing. There is a way to turn off Storage Indexes on Exadata via a hidden parameter (_kcfis_storageidx_disabled), if you should ever want to do that (just for testing the feature perhaps).

Here’s a quick example:

SYS@LABRAT1> select name, value from v$mystat s, v$statname n where n.statistic# = s.statistic# and name like '%storage%';
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                                        0
 
Elapsed: 00:00:00.01
SYS@LABRAT1> alter session set "_kcfis_storageidx_disabled"=true -- turn them off
  2  /
 
Session altered.
 
Elapsed: 00:00:00.00
SYS@LABRAT1> select count(*) from kso.skew3 where col1 is null;
 
  COUNT(*)
----------
        12
 
Elapsed: 00:00:13.91
SYS@LABRAT1> select name, value from v$mystat s, v$statname n where n.statistic# = s.statistic# and name like '%storage%';
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                                        0
 
Elapsed: 00:00:00.00
SYS@LABRAT1> alter session set "_kcfis_storageidx_disabled"=false -- back to default which turns it on
  2  /
 
Session altered.
 
Elapsed: 00:00:00.00
SYS@LABRAT1> select count(*) from kso.skew3 where col1 is null;
 
  COUNT(*)
----------
        12
 
Elapsed: 00:00:00.07
SYS@LABRAT1> select name, value from v$mystat s, v$statname n where n.statistic# = s.statistic# and name like '%storage%';
 
NAME                                                                             VALUE
---------------------------------------------------------------------- ---------------
cell physical IO bytes saved by storage index                              16012763136
 
Elapsed: 00:00:00.01

As always, your comments are welcomed.

SIOUG Conference in Portoroz

This is a short note to point out that I just added to the Public Appearances page the next conference organized by the Slovenian Oracle User Group (SIOUG) in Portoroz. It will take place on September 27-29. My talk, entitled “Join Techniques”, is based on chapter 10 of my book. It will be a shorter [...]