performance

Bind Variable Peeking – execution plan inefficiency

In my Beyond Oracle Wait Interface article I troubleshot a test case where an execution plan somehow went “crazy” and started burning CPU and doing lots of logical I/Os, and the query never completed.

I have uploaded the test case to my new website, to a section where I will post some of the demo scripts I show at my seminars (so people can download and test them themselves too):

http://tech.e2sn.com/oracle-seminar-demo-scripts

Basically what I do is this:

  1. I run the query with bind variable values where only a handful of rows match the filter condition, so Oracle picks a nested loops join (and an indexed access path).
  2. Then I run the same query with different bind values, where a lot of rows match the filter condition. Oracle reuses the existing execution plan (with nested loops!!!) and ends up looping through a lot of blocks again and again (because a nested loops join visits the “right” side of the join once for every row coming from the “left” side).

Using nested loops over lots of rows is a sure way to kill your performance.
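
Roughly, the pattern looks like this (a sketch with assumed table and column names; the actual test case is in the demo script linked above):

variable own varchar2(30)

exec :own := 'RARE_OWNER'       -- only a handful of matching rows
select count(t2.object_type)    -- peeked value => nested loops + index plan
from   t1, t2
where  t1.owner     = :own
and    t2.object_id = t1.object_id;

exec :own := 'SYS'              -- lots of matching rows
select count(t2.object_type)    -- same cursor reused => nested loops over lots of rows
from   t1, t2
where  t1.owner     = :own
and    t2.object_id = t1.object_id;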

An interesting thing about my script is that the problem still happens in Oracle 11.1 and 11.2 too!

Oracle 11g has Adaptive Cursor Sharing, right? This should take care of such a problem, right? Well, no: Adaptive Cursor Sharing is a reactive technique, so it only kicks in after the problem has already happened!
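
If you want to watch that happen, V$SQL in 11g exposes the bind-sensitivity flags (a quick check, substituting whatever SQL_ID your test query gets); the cursor only becomes bind-aware after an execution with the “bad” bind values has already run:

select child_number, is_bind_sensitive, is_bind_aware, executions, buffer_gets
from   v$sql
where  sql_id = '&sql_id';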

Craig Shallahamer is now blogging!

Craig Shallahamer is now blogging… check out his initial post here!

Earlier I had a few mail exchanges with him and got to check his new work, called the Stress Identifier. Hmm… so how will it be helpful? Well, if you have already characterized your workload, you may want to determine definitively which internal database activity is stressing your CPU, I/O, and network subsystems. This tool avoids the guesswork of selecting that “unit of work”… check out the link for more details.

Workload characterization using DBA_HIST tables and kSar

Been busy these past few days..

Well, this post is about taking advantage of the built-in data store introduced in 10gR1… yes, it’s AWR, and it’s very much a “Statspack on steroids” (I got that from Kyle Hailey’s modules). What’s exciting about 10g and above is that we can use the “Time Model” and the OSSTAT view together with the SYSSTAT view to get far better workload information when going through all the SNAP_IDs.
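
For example, a sketch along these lines (not the exact script used here) pulls DB time per snapshot out of the Time Model history, which can then be lined up against DBA_HIST_OSSTAT / DBA_HIST_SYSSTAT deltas and the kSar OS graphs:

select s.snap_id,
       s.end_interval_time,
       (tm.value - lag(tm.value) over (order by s.snap_id)) / 1e6 as db_time_secs
from   dba_hist_snapshot        s,
       dba_hist_sys_time_model  tm
where  tm.snap_id         = s.snap_id
and    tm.dbid            = s.dbid
and    tm.instance_number = s.instance_number
and    tm.stat_name       = 'DB time'   -- VALUE is cumulative microseconds (single instance assumed)
order  by s.snap_id;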

I had this “little” issue before of getting lazy when generating multiple AWR reports… yeah, it’s just so daunting; you get tired just generating those reports by hand, and the catch is you haven’t analyzed anything yet… :p (But yeah, you’re lucky if you’ve got access to the OEM performance page.)
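
For what it’s worth, the text AWR report can also be generated straight from SQL instead of running awrrpt.sql interactively. A sketch, with the DBID, instance number and snapshot IDs supplied as binds:

select output
from   table(dbms_workload_repository.awr_report_text(
               :dbid, :inst_num, :begin_snap, :end_snap));

This makes it easy to script a whole range of snapshot pairs rather than producing the reports by hand.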

The Core Performance Fundamentals Of Oracle Data Warehousing – Partitioning

[back to Introduction] Partitioning is an essential performance feature for an Oracle data warehouse because partition elimination (or partition pruning) generally eliminates a significant amount of table data from being scanned. This results in less demand on system resources and improved query performance. Someone once told me “the fastest I/O is the one that never happens.” This is precisely the reason that partitioning is a must for Oracle data warehouses: it’s a huge I/O eliminator. I frequently refer to partition elimination as the anti-index. An index is used to find a small amount of data that is required; partitioning is used to eliminate vast amounts of data that is not required.

Main Uses For Partitioning

I would classify the main reasons to use partitioning in your Oracle data warehouse into these four areas:

  1. Data Elimination
  2. Partition-Wise Joins
  3. Manageability (Partition Exchange Load, Local Indexes, etc.)
  4. Information Lifecycle Management (ILM)

Partitioning Basics

The most common partitioning design pattern found in Oracle data warehouses is to partition the fact tables by range (or interval) on the event date/time column. This allows for partition elimination of all the data outside the desired time window in queries. For example: If I have a [...]
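
That range/interval design pattern looks roughly like this (a sketch with assumed table and column names):

create table sales_fact (
  sale_date   date not null,
  customer_id number,
  amount      number
)
partition by range (sale_date)
  interval (numtoyminterval(1,'MONTH'))
  (partition p_start values less than (date '2010-01-01'));

-- Only the partition(s) covering June 2010 are scanned; every other partition
-- is eliminated before any I/O happens.
select sum(amount)
from   sales_fact
where  sale_date >= date '2010-06-01'
and    sale_date <  date '2010-07-01';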

The Core Performance Fundamentals Of Oracle Data Warehousing – Table Compression

[back to Introduction] Editor’s note: This blog post does not cover Exadata Hybrid Columnar Compression. The first thing that comes to most people’s minds when database table compression is mentioned is the savings it yields in terms of disk space. While reducing the footprint of data on disk is relevant, I would argue it is the lesser of the benefits for data warehouses. Disk capacity is very cheap and generally plentiful; however, disk bandwidth (scan speed) is proportional to the number of spindles, no matter what the disk capacity is, and thus is more expensive. Table compression reduces the footprint on the disk drives that a given data set occupies, so the amount of physical data that must be read off the disk platters is reduced compared to the uncompressed version. For example, if 4000 GB of raw data can compress to 1000 GB, it can be read off the same disk drives 4X as fast because only 1/4 of the data is read and transferred off the spindles (relative to the uncompressed size). Likewise, table compression allows the database buffer cache to contain more data without having to increase the memory allocation, because more rows can be stored [...]
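
As a rough sketch of the idea (object names are assumptions), basic table compression and the resulting footprint can be checked like this:

create table sales_fact_c compress    -- basic table compression (not Exadata HCC)
as select * from sales_fact;

-- Compare segment sizes to see the achieved compression ratio
select segment_name, round(bytes/1024/1024) as mb
from   user_segments
where  segment_name in ('SALES_FACT', 'SALES_FACT_C');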

50+ SQL Performance Optimization scenarios

Before the year ends I’d like to share some good stuff…

I have never seen such a huge compilation of SQL tuning tips and rewrite scenarios (with test cases), and I found them only on the OracleFans forum… ooops… so you can’t read Chinese? Try this translated version… whew, good thing Google has this translate service so I’m able to read what’s written in Chinese.

The Core Performance Fundamentals Of Oracle Data Warehousing – Balanced Hardware Configuration

[back to Introduction] If you want to build a house that will stand the test of time, you need to build on a solid foundation. The same goes for architecting computer systems that run databases. If the underlying hardware is not sized appropriately it will likely lead to people blaming software. All too often I [...]

My Whole System Is Slow. Now What?

At CMG'09 a couple of weeks ago, I presented "Measuring Response Times of Code on Oracle Systems." The paper for this presentation was a subset of "For Developers: Making Friends with the Oracle Database." In the presentation, I spent a few minutes talking about why to measure response times in Oracle, and then I spent a lot of minutes talking about how. As usual, I focused heavily on the importance of measuring response times of individual business tasks executed by individual end users.

Kernel NFS fights back… Oracle throughput matches Direct NFS with latest Solaris improvements

After my recent series of postings, I was made aware of David Lutz’s blog on NFS client performance with Solaris.  It turns out that you can vastly improve the performance of NFS clients using a new parameter to adjust the number of client connections.

root@saemrmb9> grep rpcmod /etc/system
set rpcmod:clnt_max_conns=8

This parameter was introduced in a patch for various flavors of Solaris. For details on the various flavors, see David Lutz’s recent blog entry on improving NFS client performance. Soon it should be the default in Solaris, making out-of-the-box client performance scream.

DSS query throughput with Kernel NFS

I re-ran the DSS query referenced in my last entry and now kNFS matches the throughput of dNFS with 10gigE.


Kernel NFS throughput with Solaris 10 Update 8 (set rpcmod:clnt_max_conns=8)

This is great news for customers not yet on Oracle 11g.  With this latest fix to Solaris, you can match the throughput of Direct NFS on older versions of Oracle.  In a future post, I will explore the CPU impact of dNFS and kNFS with OLTP style transactions.

Posted in Oracle, Storage Tagged: 11g, 7410, analytics, database, dNFS, NAS, NFS, Oracle, performance, Solaris, Sun, tuning

Direct NFS vs Kernel NFS bake-off with Oracle 11g and Solaris… and the winner is

NOTE: Please see my next entry on Kernel NFS performance and the improvements that come with the latest Solaris.

==============

After experimenting with dNFS it was time to do a comparison with the “old” way.  I was a little surprised by the results, but I guess that really explains why Oracle decided to embed the NFS client into the database :)

bake-off with OLTP style transactions

This experiment was designed to load up a machine, a T5240, with OLTP style transactions until no more CPU was available. The dataset was big enough to push about 36,000 read IOPS and 1,500 write IOPS during peak throughput. As you can see, dNFS performed well, which allowed the system to scale until the DB server CPU was fully utilized. On the other hand, Kernel NFS throttles after 32 users and is unable to use the available CPU to scale transactional throughput.

lower cpu overhead yields better throughput

A common measure for benchmarks is to figure out how many transactions per CPU are possible. Below, I plotted the CPU consumed for a particular transaction rate. This chart shows the total measured CPU (user+system) for a given TPS rate.


dNFS vs kNFS (TPS/CPU)

As expected, the transaction rate per CPU is greater when using dNFS vs kNFS. Please do note that this is a T5240 machine that has 128 threads, or virtual CPUs. I don’t want to go into the semantics of sockets, cores, pipelines, and threads, but thought it was at least worth noting. Oracle sees a thread of a T5240 as a CPU, so that is what I used for this comparison.

silly little torture test

When doing the OLTP style tests with a normal sized SGA, I was not able to fully utilize the 10gigE interface or the Sun 7410 storage. So, I decided to do a silly little micro benchmark with a really small SGA. This benchmark just does simple read-only queries that essentially result in a bunch of random 8k IOs. I have included the output from the Fishworks analytics below for both kNFS and dNFS.
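
Something along these lines would produce that kind of I/O pattern (a sketch, not the actual benchmark script; the i_id column name is an assumption): random single-row lookups against a table far larger than the deliberately tiny buffer cache, so nearly every fetch becomes a random 8k read from storage.

declare
  l_cnt number;
begin
  for i in 1 .. 100000 loop
    select count(*) into l_cnt
    from   item
    where  i_id = trunc(dbms_random.value(1, 40000000));  -- i_id assumed to be the indexed key
  end loop;
end;
/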


Random IOPS with kNFS and Sun Open Storage


Random IOPS with dNFS and Sun 7410 open storage

I was able to hit ~90K IOPS with 729MB/sec of throughput with just one 10gigE interface connected to the Sun 7410 unified storage. This is an excellent result with Oracle 11gR2 and dNFS for a random IO test… but there is still more bandwidth available. So, I decided to run a quick DSS style query to see if I could break the 1GB/sec barrier.

===dNFS===
SQL> select /*+ parallel(item,32) full(item) */ count(*) from item;

  COUNT(*)
----------
  40025111

Elapsed: 00:00:06.36

===kNFS===
SQL> select /*+ parallel(item,32) full(item) */ count(*) from item;

  COUNT(*)
----------
  40025111

Elapsed: 00:00:16.18

kNFS table scan


dNFS table scan

Excellent: with a simple scan I was able to do 1.14GB/sec with dNFS, more than doubling the throughput of kNFS.

configuration notes and basic tuning

I was running on a T5240 with Solaris 10 Update 8.

$ cat /etc/release
Solaris 10 10/09 s10s_u8wos_08a SPARC
Copyright 2009 Sun Microsystems, Inc.  All Rights Reserved.
Use is subject to license terms.
Assembled 16 September 2009

This machine has a built-in 10gigE interface which uses multiple threads to increase throughput. Out of the box, there is very little to tune as long as you are on Solaris 10 Update 8. I experimented with various settings, but found that only basic TCP settings were required.

# Raise the TCP receive/transmit high-water marks and the buffer/congestion-window ceilings for 10gigE
ndd -set /dev/tcp tcp_recv_hiwat 400000
ndd -set /dev/tcp tcp_xmit_hiwat 400000
ndd -set /dev/tcp tcp_max_buf 2097152
ndd -set /dev/tcp tcp_cwnd_max 2097152

Finally, on the storage front, I was using the Sun Storage 7410 Unified Storage server as the NFS server for this test. This server was born out of the Fishworks project and is an excellent platform for deploying NFS based databases…. watch out NetApp.

what does it all mean?

dNFS wins hands down. Standard kernel NFS essentially allows only one client connection per “mount” point, so eventually we see data queued up against a mount point, which clips throughput far too soon. Direct NFS solves this problem by having each Oracle shadow process mount the device directly. Also, with dNFS, all the usual tuning and mount point options are unnecessary: Oracle knows which options are most efficient for transferring blocks of data and configures the connection properly.
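
For reference, dNFS picks up the NFS server and export either from the existing kernel mounts or from an oranfstab entry. A minimal example of the latter (the server name, IP and paths here are assumptions, not this test’s actual configuration):

server:  nas7410
path:    192.168.1.10
export:  /export/oradata   mount:  /u02/oradata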

When I began down this path of discovery, I was only using NFS attached storage because nothing else was available in our lab… and IO was not initially a huge part of the project at hand.  Being a performance guy who benchmarks systems to squeeze out the last percentage point of performance, I was skeptical about NAS devices.  Traditionally, NAS was limited by slow networks and clumsy SW stacks.   But times change.   Fast 10gigE networks and Fishworks storage combined with clever SW like Direct NFS really showed this old dog a new trick.

Posted in Oracle, Storage Tagged: 11g, 7410, analytics, dNFS, fishworks, NAS, NFS, Oracle, performance, Solaris, Sun