Oakies Blog Aggregator

AW-argh

This is another of the blog notes that have been sitting around for several years – in this case since May 2014, based on a script I wrote a year earlier. It makes an important point about “inconsistency” of timing in the way that Oracle records statistics of work done. As a consequence of being first drafted in May 2014 the original examples showed AWR results from 10.2.0.5 and 11.2.0.4 – I’ve just run the same test on 19.3.0.0 to see if anything has changed.

 

[Originally drafted May 2014]: I had to post this as a reminder of how easy it is to forget things – especially when there are small but significant changes between versions. It’s based loosely on a conversation from Oracle-L, but I’m going to work the issue in the opposite order by running some code and showing you the ongoing performance statistics rather than the usual AWR approach of reading the performance stats and trying to guess what happened.

The demonstration needs two sessions to run; it’s based on one session running some CPU-intensive SQL inside an anonymous PL/SQL block with a second session launching AWR snapshots at carefully timed moments. Here’s the code for the working session:

rem
rem     Script:         awr_timing.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2013
rem

alter session set "_old_connect_by_enabled"=true;

create table kill_cpu(n, primary key(n))
organization index
as
select  rownum n
from    all_objects
where   rownum <= 26 -- > comment to avoid wordpress format issue
;

execute dbms_stats.gather_table_stats(user,'kill_cpu')

pause Take an AWR snapshot from another session and when it has completed  press return

declare
        m_ct    number;
begin

        select  count(*) X
        into    m_ct
        from    kill_cpu
        connect by
                n > prior n
        start with
                n = 1
        ;

        dbms_lock.sleep(30);

end;
/

You may recognise an old piece of SQL that I’ve often used as a way of stressing a CPU and seeing how fast Oracle can run. The “alter session” at the top of the code is necessary to use the pre-10g method of running a “connect by” query so that the SQL does a huge number of buffer gets (and “buffer is pinned count” visits). On my current laptop the query takes about 45 seconds (all CPU) to complete. I’ve wrapped this query inside a pl/sql block that then sleeps for 30 seconds.

From the second session you need to launch an AWR snapshot 4 times – once in the pause shown above, then (approximately) every 30 seconds thereafter. The second one should execute while the SQL statement is still running, the third one should execute while the sleep(30) is taking place, and the fourth one should execute after the pl/sql block has ended and the SQL*Plus prompt is visible.
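
If you need a reminder, the second session only has to make four calls to the standard AWR snapshot package, one at each of the moments described above, and then run the usual report script three times. A minimal sketch:

execute dbms_workload_repository.create_snapshot

-- repeat the call at each of the four points, then generate each of the three reports in turn
@?/rdbms/admin/awrrpt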

Once you’ve got 4 snapshots you can generate 3 AWR reports. The question to ask then is “what do the reports say about CPU usage?” Here are a few (paraphrased) numbers – starting with 10.2.0.5, comparing the “Top 5 timed events”, “Time Model”, and “Instance Activity” sections. There are three sets of figures: the first reported while the SQL was still running, the second reported after the SQL statement had completed and while the dbms_lock.sleep() call was executing, the last reported after the PL/SQL block had completed. There are some little oddities in the numbers due to background “noise” – but the key points are still clearly visible:

While the SQL was executing

Top 5
-----
CPU Time                       26 seconds

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                     26.9        100.0
DB CPU                                       26.2         97.6

Instance Activity
-----------------
CPU used by this session         0.65 seconds
recursive cpu usage              0.67 seconds

SQL ordered by CPU
------------------
31 seconds reported for both the SQL and PLSQL

During the sleep()

Top 5
-----
CPU Time                        19 seconds

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                     19.0        100.0
DB CPU                                       18.6         98.1

Instance Activity
-----------------
CPU used by this session         0.66 seconds
recursive cpu usage             44.82 seconds

SQL ordered by CPU
------------------
14 seconds reported for both the SQL and PLSQL

After the PL/SQL block ended

Top 5
-----
CPU Time                         1 second

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                      1.4         99.9
DB CPU                                        1.4         99.7

Instance Activity
-----------------
CPU used by this session        44.68 seconds
recursive cpu usage              0.50 seconds

SQL ordered by CPU
------------------
1 second reported for the PLSQL, but the SQL was not reported

Points to notice:

While the SQL was executing (and had been executing for about 26 seconds) the Time Model mechanism was recording the work done by the SQL, and the Top N information echoed the Time Model CPU figure. At the same time the “CPU used …” Instance Activity Statistics had not recorded any CPU time for the session – and they won’t until the SQL statement completes. Despite this, the “SQL ordered by …” reports double-count in real time, showing the SQL and the PL/SQL cursors as consuming (with rounding errors, presumably) 31 seconds each.

After the SQL execution was over and the session was sleeping, the Time Model (hence the Top 5) had recorded a further 19 seconds of work. The Instance Activity, however, had now accumulated 44 seconds of CPU, but only as “recursive cpu usage” (recursive because our SQL was called from within a PL/SQL block), with no “CPU used by this session”. The “SQL ordered by …” figures have recorded the amount of CPU used in the interval (i.e. 14 seconds – which is a little off) against both the SQL and the PL/SQL.

After the PL/SQL block had completed the Time Model and the Top 5 report both say that nothing much happened in the interval, but the Instance Activity suddenly reports 44.68 seconds of CPU used by this session – which (roughly speaking) is truish as the PL/SQL block ended and assigned the accumulated recursive CPU usage to the session CPU figure. Finally, when we get down to the “SQL ordered by CPU” the SQL was not reported – it did no work in the interval – but the PL/SQL block was still reported, though only with a generous 1 second of CPU since all it did in the interval was finish the sleep call and tidy up the stack before ending.

Now the same sets of figures for 11.2.0.4 – there’s a lot of similarity, but one significant difference:

While the SQL was executing


Top 5
-----
CPU Time                        26.6 seconds

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                     27.0        100.0
DB CPU                                       26.6         98.5

Instance Activity
-----------------
CPU used by this session         1.09 seconds
recursive cpu usage              1.07 seconds

SQL ordered by CPU
------------------
25.6 seconds reported for both the SQL and PLSQL

During the sleep()

Top 5
-----
CPU Time                        15.1 seconds

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                     15.3         99.8
DB CPU                                       15.1         98.2

Instance Activity
-----------------
CPU used by this session        41.09 seconds
recursive cpu usage             41.03 seconds

SQL ordered by CPU
------------------
14.3 seconds reported for the SQL
13.9 seconds reported for the PLSQL

After the PL/SQL block ended

Top 5
-----
CPU Time                         1.4 seconds

Time Model                               Time (s) % of DB Time
------------------------------------------------- ------------
sql execute elapsed time                      1.5         99.6
DB CPU                                        1.4         95.4

Instance Activity
-----------------
CPU used by this session         1.02 seconds
recursive cpu usage              0.95 seconds

SQL ordered by CPU
------------------
0.5 seconds reported for the PLSQL, and no sign of the SQL

Spot the one difference in the pattern – during the sleep() the Instance Activity Statistic “CPU used by this session” is recording the full CPU time for the complete query, whereas the time for the query appeared only in the “recursive cpu” in the 10.2.0.5 report.

I frequently point out that for proper understanding of the content of an AWR report you need to cross-check different ways in which Oracle reports “the same” information. This is often to warn you about checking underlying figures before jumping to conclusions about “hit ratios”, sometimes it’s to remind you that while the Top 5 might say some average looks okay, the event histogram may say that what you’re looking at is mostly excellent with an occasional disaster thrown in. In this blog note I just want to remind you that if you only ever look at one set of figures about CPU usage there are a few special effects (particularly relating to long-running PL/SQL / Java / SQL) where you may have to work out a pattern of behaviour to explain unexpectedly large (or small) and contradictory figures. The key to the problem is recognising that different statistics may be updated at different stages in a complex process.
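
As a sketch of the sort of cross-check I mean, you could watch the working session from a third session and compare its Time Model figures with its Instance Activity figures as the test progresses. The statistic names are the ones that appear in the AWR extracts above, and &target_sid is a placeholder for the session id:

select  stat_name, round(value/1e6,2) seconds          -- time model values are in microseconds
from    v$sess_time_model
where   sid = &target_sid
and     stat_name in ('DB CPU', 'sql execute elapsed time')
;

select  sn.name, round(st.value/100,2) seconds         -- these CPU statistics are in centiseconds
from    v$sesstat st, v$statname sn
where   sn.statistic# = st.statistic#
and     st.sid = &target_sid
and     sn.name in ('CPU used by this session', 'recursive cpu usage')
;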

Footnote

I doubt if many people still run 11.2.0.4, so I also re-ran the test on 19.3.0.0 before publishing. The behaviour hasn’t changed since 11.2.0.4 although the query ran a little faster, perhaps due to changes in the mechanisms for this type of “connect by pump”.

11.2.0.4 stats


Name                                            Value
----                                            -----
session logical reads                      33,554,435
consistent gets                            33,554,435
consistent gets from cache                 33,554,435
consistent gets from cache (fastpath)      33,554,431
no work - consistent read gets             33,554,431
index scans kdiixs1                        33,554,433
buffer is not pinned count                 16,777,219


19.3.0.0 stats

Name                                            Value
----                                            -----
session logical reads                      16,843,299
consistent gets                            16,843,299
consistent gets from cache                 16,843,299
consistent gets pin                        16,843,298
consistent gets pin (fastpath)             16,843,298
no work - consistent read gets             16,790,166
index range scans                          33,554,433
buffer is not pinned count                 16,790,169

Some changes are trivial (like the change of name for “index scans kdiixs1”), some are interesting (like some gets not being labelled as “pin” or “pin (fastpath)”), and some are baffling (like how you can manage 33M index range scans while doing only 16M buffer gets!).
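
For reference, figures like these can be pulled by the test session itself with a simple join between v$mystat and v$statname. A sketch (run it in a freshly started session straight after the query, or capture a before/after delta):

select  sn.name, ms.value
from    v$mystat ms, v$statname sn
where   ms.statistic# = sn.statistic#
and     sn.name in (
                'session logical reads',
                'consistent gets',
                'consistent gets from cache',
                'no work - consistent read gets',
                'buffer is not pinned count'
        )
order by
        sn.name
;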

Troubleshooting

A recent thread on the Oracle Developer Community starts with the statement that a query is taking a very long time (with the question “how do I make it go faster?” implied rather than asked). It’s 12.1.0.2 (not that that’s particularly relevant to this blog note), and we have been given a number that quantifies “very long time” (again not particularly relevant to this blog note – but worth mentioning because your “slow” might be my “wow! that was fast” and far too many people use qualitative adjectives when the important detail is quantitative). The query had already been running for 15 hours – and here it is:


SELECT 
        OWNER, TABLE_NAME 
FROM
        DBA_LOGSTDBY_NOT_UNIQUE 
WHERE
        (OWNER, TABLE_NAME) NOT IN (
                SELECT 
                        DISTINCT OWNER, TABLE_NAME 
                        FROM     DBA_LOGSTDBY_UNSUPPORTED
        ) 
AND     BAD_COLUMN = 'Y'

There are many obvious suggestions anyone could make for things to do to investigate the problem – start with the execution plan, check whether the object statistics are reasonably representative, run a trace with wait state tracing enabled to see where the time goes; but sometimes there are a couple of very simple observations you can make that point you to simple solutions.
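
As a sketch of the plan check and the wait-state trace (using standard packages; the substitution variables are placeholders for values you would pick up from v$session and v$sql):

select * from table(dbms_xplan.display_cursor('&sql_id'));

execute dbms_monitor.session_trace_enable(session_id => &sid, serial_num => &serial, waits => true, binds => false)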

Looking at this query we can recognise that it’s (almost certainly) about a couple of Oracle data dictionary views (which means it’s probably very messy under the covers with a horrendous execution plan) and, as I’ve commented from time to time in the past, Oracle Corp. developers create views for their own purposes so you should take great care when you re-purpose them. This query also has the very convenient feature that it looks like two simpler queries stitched together – so a very simple step in trouble-shooting, before going into any fine detail, is to unstitch the query and run the two parts separately to see how much data they return and how long they take to complete:


SELECT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_NOT_UNIQUE WHERE BAD_COLUMN = 'Y'

SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED

It’s quite possible that the worst case scenario for the total run time of the original query could be reduced to the sum of the run time of these two queries. One strategy to achieve this would be a rewrite of the form:

select  * 
from    (
        SELECT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_NOT_UNIQUE WHERE BAD_COLUMN = 'Y'
        minus
        SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED
)

Unfortunately the immediately obvious alternative may be illegal thanks to things like duplicates (which disappear in MINUS operations) or NULLs (which can make ALL the data “disappear” in some cases). In this case the original query might be capable of returning duplicates of (owner, table_name) from dba_logstdby_not_unique, which would collapse to a single occurrence each in my rewrite – so my version of the query is not logically equivalent (unless the definition of the view enforces uniqueness); on the other hand, tracking back through the original thread to the MoS article where this query comes from, we can see that even if the query could return duplicates we don’t actually need to see them.

And this is the point of the blog note – it’s a general principle (that happens to be a very obvious strategy in this case): if a query takes too long, how does it compare with a simplified version of the query that might be a couple of steps short of the final target? If it’s easy to spot the options for simplification, and if the simplified version operates efficiently, then isolate it (using a no_merge hint if necessary), and work forwards from there. Just be careful that your rewrite remains logically equivalent to the original (if it really needs to).
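
As a sketch of that isolation step for this example, keeping the original NOT IN shape but protecting the part we know runs quickly behind a no_merge hint:

select  owner, table_name
from    (
        select  /*+ no_merge */
                owner, table_name
        from    dba_logstdby_not_unique
        where   bad_column = 'Y'
        )
where   (owner, table_name) not in (
                select  owner, table_name
                from    dba_logstdby_unsupported
        )
;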

In the case of this query, the two parts took 5 seconds and 9 seconds to complete, returning 209 rows and 815 rows respectively. Combining the two queries with a minus really should get the required result in no more than 14 seconds.

Footnote

The “distinct” in the second query is technically redundant as the minus operation applies a sort unique operation to both of the intermediate result sets before comparing them. Similarly the “distinct” was also redundant when the second query was used for the “in subquery” construction – again there would be an implied uniqueness operation if the optimizer decided to do a simple unnest of the subquery.

A new use for DML error logging

Many moons ago I did a short video on the DML error logging feature in Oracle. The feature has been around for many years now, and is a great tool for capturing errors during a large load without losing all of the rows that successfully loaded. You can watch that video below if you’re new to DML error logging.

But here is a possible new use case for DML error logging, even if you are not doing large scale loads. Let me describe the problem first, and then show how DML error logging might be a solution.

I’ll create a table with a constraint on its column


SQL> create table t1 (val number not null);

Table created.

One of the nice things that came into Oracle (I think) way back in version 8.0 was more detailed information when you violate those constraints. So when I try to insert a null into that table


SQL> insert into t1 values (null);
insert into t1 values (null)
                       *
ERROR at line 1:
ORA-01400: cannot insert NULL into ("SCOTT"."T1"."VAL")

not only am I told that I got an ORA-1400, I am also told the column (VAL) that caused the error. Even if this INSERT is performed from a PL/SQL routine, I still get that information:


SQL> exec insert into t1 values (null);
BEGIN insert into t1 values (null); END;

*
ERROR at line 1:
ORA-01400: cannot insert NULL into ("SCOTT"."T1"."VAL")
ORA-06512: at line 1

That might not seem particularly beneficial in this instance, but consider a more true to life application, which might have tables with dozens of columns, or even hundreds of columns. Knowing which column was in error is a handy piece of information to have when debugging.

That all seems just sweet, until I throw a little bulk binding into the mix. Here’s a package that will bulk bind an insert into the T1 table.


SQL> create or replace package bulk_ins as
  2    type t1_tt is table of t1%rowtype;
  3    procedure bulk_insert;
  4  end bulk_ins;
  5  /

Package created.

SQL> create or replace package body bulk_ins as
  2    procedure bulk_insert is
  3      l_data t1_tt:=t1_tt();
  4    begin
  5      l_data.extend(2);
  6      l_data(1).val:=999;
  7      l_data(2).val:=null;
  8      
  9      begin
 10        forall i in 1..l_data.count save exceptions
 11          insert into t1 values l_data(i);
 12      exception
 13        when others then
 14          for i in 1..sql%bulk_exceptions.count loop
 15            dbms_output.put_line( sqlerrm( -sql%bulk_exceptions(i).error_code ) );
 16          end loop;
 17      end forall_loop_with_save_except;
 18      commit;
 19    end bulk_insert;
 20  end bulk_ins;
 21  /

Package body created.

Even without running the code, you can see on lines 6 and 7 that we are seeding an array with a null value in one of the entries, so table T1 is not going to like that upon insertion! Luckily though, we will be using the SAVE EXCEPTIONS extension to capture any errors and report on them to the calling environment. Here is what happens


SQL> set serverout on
SQL> exec bulk_ins.bulk_insert;
ORA-01400: cannot insert NULL into ()

PL/SQL procedure successfully completed.

The SAVE EXCEPTIONS clause has captured the error, but notice that the vital column name has been lost. Because SAVE EXCEPTIONS captures only the error code, by the time we convert that code back to the standard error message text the precision has been lost.
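
You can see why with a quick sanity check: rebuilding the message from nothing but the error code gives you the generic text with the placeholders left empty, something like this:

SQL> exec dbms_output.put_line(sqlerrm(-1400))
ORA-01400: cannot insert NULL into ()

PL/SQL procedure successfully completed.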

Here is where DML error logging can step in and improve things. I’ll recode the package body to use DML error logging and dispense with the SAVE EXCEPTIONS clause.


SQL>  exec   DBMS_ERRLOG.create_error_log(dml_table_name=>'T1')

PL/SQL procedure successfully completed.

SQL> create or replace package body bulk_ins as
  2    procedure bulk_insert is
  3      l_data t1_tt:=t1_tt();
  4    begin
  5      l_data.extend(2);
  6      l_data(1).val:=999;
  7      l_data(2).val:=null;
  8
  9      forall i in 1..l_data.count
 10        insert into t1 values l_data(i) LOG ERRORS REJECT LIMIT UNLIMITED;
 11
 12      commit;
 13    end bulk_insert;
 14  end bulk_ins;
 15  /

Package body created.

SQL>
SQL> exec bulk_ins.bulk_insert;

PL/SQL procedure successfully completed.

SQL>
SQL> select * from err$_t1
  2  @pr
==============================
ORA_ERR_NUMBER$               : 1400
ORA_ERR_MESG$                 : ORA-01400: cannot insert NULL into ("SCOTT"."T1"."VAL")
ORA_ERR_ROWID$                :
ORA_ERR_OPTYP$                : I
ORA_ERR_TAG$                  :
VAL                           :

Notice that the error message now contains the complete information as it did from the standard SQL insert statement. That’s pretty cool. I’m not saying that you should rush out and replace all of your SAVE EXCEPTIONS code with DML error logging. You can see from the example that there are overheads in the management of this. You need to have the error logging table defined in advance, you need to manage the rows it accumulates from multiple sessions, and obviously every load must be followed by a query against the error logging table to see if any errors occurred. But if you really really need that column name, using DML error logging might be the way forward for you.
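
One thing that can ease the management burden a little is the optional tag expression on the LOG ERRORS clause. Here is a sketch (the tag text is just an example) of how each load could label its own error rows and then report and clear only those:

forall i in 1..l_data.count
  insert into t1 values l_data(i)
    log errors into err$_t1 ('bulk_insert run 1') reject limit unlimited;

-- after the load, check and clean up just this run's errors
select ora_err_mesg$, val from err$_t1 where ora_err_tag$ = 'bulk_insert run 1';
delete from err$_t1 where ora_err_tag$ = 'bulk_insert run 1';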

More on DML error logging here

Optimizer Tricks 1

I’ve got a number of examples of clever little tricks the optimizer can do to transform your SQL before starting in on the arithmetic of optimisation. I was prompted to publish this one by a recent thread on ODC. It’s worth taking note of these tricks when you spot one as a background knowledge of what’s possible makes it much easier to interpret and trouble-shoot from execution plans. I’ve labelled this one “#1” since I may publish a few more examples in the future, and then I’ll have to catalogue them – but I’m not making any promises about that.

Here’s a table definition, and a query that’s hinted to use an index on that table.


rem
rem     Script:         optimizer_tricks_01.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Aug 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             11.2.0.4
rem

create table t1 (
        v1      varchar2(10),
        v2      varchar2(10),
        v3      varchar2(10),
        padding varchar2(100)
);

create index t1_i1 on t1(v1, v2, v3);


explain plan for
select
        /*+ index(t1 (v1, v2, v3)) */
        padding 
from 
        t1
where
        v1 = 'ABC'
and     nvl(v3,'ORA$BASE') = 'SET2'
;

select * from table(dbms_xplan.display);

The query uses the first and third columns of the index, but wraps the 3rd column in an nvl() function. Because of the hint the optimizer will generate a plan with an index range scan, but the question is – what will the Predicate Information tell us about Oracle’s use of my two predicates:


Plan hash value: 3320414027

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     1 |    66 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    66 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"='ABC')
       filter(NVL("V3",'ORA$BASE')='SET2')

The nvl() test is used during the index range scan (from memory I think much older versions of Oracle would have postponed the predicate test until they had accessed the table itself). This means Oracle will do a range scan over the whole section of the index where v1 = ‘ABC’, testing every index entry it finds against the nvl() predicate.

But what happens if we modify column v3 to be NOT NULL? (“alter table t1 modify v3 not null;”) Here’s the new plan:


Plan hash value: 3320414027

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |     1 |    66 |     0   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| T1    |     1 |    66 |     0   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | T1_I1 |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V1"='ABC' AND "V3"='SET2')
       filter("V3"='SET2')


The optimizer will decide that with the NOT NULL status of the column the nvl() function can be eliminated and the predicate can be replaced with a simple column comparison. At this point the v3 predicate can now be used to reduce the number of index entries that need to be examined by using a type of skip-scan/iterator approach, but Oracle still has to test the predicate against the index entries it walks through – so the predicate still appears as a filter predicate as well.

You might notice, by the way, that the Plan hash value does not change as the predicate use changes – even though the change in use of predicates could make a huge difference to the performance. (As indicated in the comments at the top of the script, I’ve run this model against 11.2.0.4 – which is the version used in the ODC thread – and 19.3.0.0: the behaviour is the same in both versions, and the Plan hash value doesn’t change from version to version.)

Footnote

The reason why I decided to publish this note is that the original thread on the ODC forums reported the following contradictory details – an index definition and the optimizer’s use of that index as shown in the predicate section of the plan:


Index column name      Column position
---------------------- ----------------
FLEX_VALUE_SET_ID      1
PARENT_FLEX_VALUE      2
RANGE_ATTRIBUTE        3
CHILD_FLEX_VALUE_LOW   4
CHILD_FLEX_VALUE_HIGH  5
ZD_EDITION_NAME        6

---------------------------------------------------------------------------
|* 17 |      INDEX RANGE SCAN             | FND_FLEX_VALUE_NORM_HIER_U1   |
---------------------------------------------------------------------------
  17 - access("FLEX_VALUE_SET_ID"=:B1 AND NVL("ZD_EDITION_NAME",'ORA$BASE')='SET2')  
       filter((NVL("ZD_EDITION_NAME",'ORA$BASE')='SET2'  ..... lots more bits of filter predicate.

Since the expression nvl(zd_edition_name, ‘ORA$BASE’) = ‘SET2’ appears as an access predicate and a filter predicate it must surely be a column in the index. So either this isn’t the definition of the index being used or, somehow, there’s a trick that allows zd_edition_name to appear as a column name in the index when it really means nvl(zd_edition_name,’ORA$BASE’) at run-time. (And if there is I want to know what it is – edition-based redefinition and tricks with virtual columns spring to mind, but I avoid thinking about complicated explanations when a simpler one might be available.)

 

Speaking at Oracle OpenWorld 2019

It’s been a remarkable 9 years since I last went to Oracle OpenWorld, but I will finally get the opportunity to present there again this year (with many thanks to the Oracle ACE Director program for making this possible). Details of my presentation are as follows: Conference: Oracle OpenWorld Session Type: Conference Session Session ID: CON1432 Session Title: […]

sql_patch

This note is a short follow-up to a note I wrote some time ago about validating foreign key constraints where I examined the type of SQL Oracle generates internally to do the validation between parent and child tables.  In that article I suggested (before testing) that you could create an SQL patch for the generated SQL to over-ride the plan taken by Oracle – a plan dictated to some extent by hints (including a “deprecated” ordered hint) embedded in the code. I did say that the strategy might not work for SQL optimised by SYS, but it turned out that it did.

Here’s a little script I ran to test a few variations on the theme:


declare
        v1      varchar2(128);
begin
        v1 :=   dbms_sqldiag.create_sql_patch(
                        sql_id  => 'g2z10tbxyz6b0',
                        name    => 'validate_fk',
                        hint_text => 'ignore_optim_embedded_hints'
--                      hint_text => 'parallel(a@sel$1 8)'      -- worked
--                      hint_text => 'parallel(8)'              -- worked
--                      hint_text => q'{opt_param('_fast_full_scan_enabled' 'false')}'  -- worked
                );
        dbms_output.put_line(v1);
end;
/

I’ve tested this on 12.2.0.1 and 19.3.0.0, but for earlier versions of Oracle, and depending on what patches you’ve applied, you will need to modify the code.

The SQL_ID represents the query for my specific tables, of course, so you will have to do a test run to find the query and SQL_ID for the validation you want to do. This is what the statement for my parent/child pair looked like (cosmetically adjusted):

select /*+ all_rows ordered dynamic_sampling(2) */ 
        A.rowid, :1, :2, :3
from
        "TEST_USER"."CHILD" A , 
        "TEST_USER"."PARENT" B 
where
        ("A"."OBJECT_ID" is not null) 
and     ("B"."OBJECT_ID"(+) = "A"."OBJECT_ID")
and     ("B"."OBJECT_ID" is null)
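
If you don’t already know the SQL_ID for your own parent/child pair, a quick scan of the cursor cache shortly after validating the constraint should find it. A sketch, relying on the fact that the generated statement starts with the hint set shown above:

select  sql_id, sql_text
from    v$sql
where   sql_text like 'select /*+ all_rows ordered dynamic_sampling(2) */%'
;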

The patch that the script creates simply tells Oracle to ignore the embedded hints (in particular I don’t want that ordered hint), but I’ve left a few other options in the text, commenting them out.

Without the patch I got the following plan:


Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  NESTED LOOPS ANTI (cr=399 pr=279 pw=0 time=47801 us starts=1 cost=70 size=22000 card=1000)
    100000     100000     100000   INDEX FAST FULL SCAN CHI_FK_PAR (cr=250 pr=247 pw=0 time=19943 us starts=1 cost=32 size=1700000 card=100000)(object id 73191)
     10000      10000      10000   INDEX UNIQUE SCAN PAR_PK (cr=149 pr=32 pw=0 time=3968 us starts=10000 cost=0 size=49995 card=9999)(object id 73189)

Rerunning the validation test with the patch in place I got the following plan – clearly the patch had had an effect.

Rows (1st) Rows (avg) Rows (max)  Row Source Operation
---------- ---------- ----------  ---------------------------------------------------
         0          0          0  HASH JOIN RIGHT ANTI (cr=246 pr=242 pw=0 time=96212 us starts=1 cost=39 size=22000 card=1000)
     10000      10000      10000   INDEX FAST FULL SCAN PAR_PK (cr=24 pr=23 pw=0 time=1599 us starts=1 cost=4 size=50000 card=10000)(object id 73235)
    100000     100000     100000   INDEX FAST FULL SCAN CHI_FK_PAR (cr=222 pr=219 pw=0 time=27553 us starts=1 cost=32 size=1700000 card=100000)(object id 73237)
(object id 73229)

Don’t worry too much about the fact that in my tiny example, and with a very new, nicely structured, data set the original plan was a little faster. In a production environment creating a hash table from the parent keys and probing it with the child keys may reduce the CPU usage and random I/O quite dramatically.

Bear in mind that the best possible plan may depend on many factors, such as the number of child rows per parent, the degree to which the parent and child keys arrive in sorted (or random) order, and then you have to remember that Oracle gets a little clever with the original anti-join (note that there are only 10,000 probes for 100,000 child rows – there’s an effect similar to the scalar subquery caching going on there), so trying to patch the plan the same way for every parent/child pair may not be the best strategy.

If you want to drop the patch after playing around with this example a call to execute dbms_sqldiag.drop_sql_patch(name=>’validate_fk’) will suffice.

 

AUSOUG Connect 2019 Conference Series

  AUSOUG will again be running their excellent CONNECT 2019 conference series this year at the following great venues: Monday 14th October – Rendezvous Hotel In Melbourne Wednesday 16th October –  Mercure Hotel in Perth As usual, there’s a wonderful lineup of speakers from both Australia and overseas including: Connor McDonald Scott Wesley Guy Harrison […]

Join View

It’s strange how one thing leads to another when you’re trying to check some silly little detail. This morning I wanted to find a note I’d written about the merge command and “stable sets”, and got to a draft about updatable join views that I’d started in 2016 in response to a question on OTN (as it was at the time), which finally led me back to a model I’d written in 2008 showing that the manuals were wrong.

Since the manual – even the 19c manual – is still wrong regarding the “Delete Rule” for updatable (modifiable) join views I thought I’d quickly finish off the draft and post the 2008 script. Here’s what the manual says about deleting from join views (my emphasis on “exactly”):

Rows from a join view can be deleted as long as there is exactly one key-preserved table in the join. The key preserved table can be repeated in the FROM clause. If the view is defined with the WITH CHECK OPTION clause and the key preserved table is repeated, then the rows cannot be deleted from the view.

But here’s a simple piece of code to model a delete from a join view that breaks the rule:


rem
rem     Script:         delete_join.sql 
rem     Dated:          Dec 2008
rem     Author:         J P Lewis
rem

create table source
as
select level n1
from dual
connect by level <= 10
/ 
 
create table search
as
select level n1
from dual
connect by level <= 10
/ 

alter table source modify n1 not null;
alter table search modify n1 not null;

create unique index search_idx on search(n1);
-- create unique index source_idx on source(n1)

I’ve set up a “source” and a “search” table with 10 rows each and the option for creating unique indexes on each table for a column that’s declared non-null. Initially, though, I’ve only created the index on search to see what happens when I run a couple of “join view” deletes using “ANSI” syntax.

prompt  ===============================
prompt  Source referenced first in ANSI
prompt  ===============================

delete from (select * from source s join search s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;
 
prompt  ===============================
prompt  Search referenced first in ANSI
prompt  ===============================

delete from (select * from search s join source s1 on s.n1 = s1.n1);
select count(1) source_count from source;
select count(1) search_count from search;
rollback;

With just one of the two unique indexes in place the order of the tables in the inline view makes no difference to the outcome. Thanks to the unique index on search any row in the inline view corresponds to exactly one row in the source table, while a single row in the search table could end up appearing in many rows in the view – so the delete implicitly has to operate as “delete from source”. So both deletes will result in the source_count being zero, and the search_count remaining at 10.

If we now repeat the experiment but create BOTH unique indexes, both source and search will be key-preserved in the join. According to the manual the delete should produce some sort of error. In fact the delete works in both cases – but the order that the tables appear makes a difference. When source is the first table in the in-line view the source_count drops to zero and the search_count stays at 10; when search is the first table in the in-line view the search_count drops to zero and the source_count stays at 10.
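
For completeness, the second experiment just needs the commented-out index created before the same pair of deletes is repeated. A sketch:

create unique index source_idx on source(n1);

delete from (select * from source s join search s1 on s.n1 = s1.n1);    -- source_count drops to 0
rollback;

delete from (select * from search s join source s1 on s.n1 = s1.n1);    -- search_count drops to 0
rollback;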

I wouldn’t call this totally unreasonable – but it’s something you need to know if you’re going to use the method, and something you need to document very carefully in case someone editing your code at a later date (or deciding that they could add a unique index) doesn’t realise the significance of the table order.

This does lead on to another important test – is it the order in which the tables appear in the from clause that matters, or the order in which they appear in the join order that Oracle uses to optimise the query? (We hope – and expect – that it’s the order as written in the from clause, not the join order as optimised, otherwise the effect of the delete could change from day to day as the optimizer chose different execution plans!) To confirm my expectation I switched to traditional Oracle syntax with hints (still with unique indexes on both tables), writing a query with search as the first table in the from clause, but hinting the inline view to vary the optimised join order.


prompt  ============================================
prompt  Source hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s1, s) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

prompt  ============================================
prompt  Search hinted as leading table in join order 
prompt  ============================================

delete from (
        select 
                /*+ leading(s, s1) */
                * 
        from 
                search s,
                source s1 
        where
                s.n1 = s1.n1
        )
;

select count(1) source_count from source; 
select count(1) search_count from search;
rollback;

In both cases the rows were deleted from search (the first table in the from clause). And, to answer the question you should be asking, I did check the execution plans to make sure that the hints had been effective:


============================================
Source hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SOURCE_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SEARCH_IDX |     1 |     3 |       |
------------------------------------------------------------------

============================================
Search hinted as leading table in join order
============================================

------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | DELETE STATEMENT    |            |    10 |    60 |     1 |
|   1 |  DELETE             | SEARCH     |       |       |       |
|   2 |   NESTED LOOPS      |            |    10 |    60 |     1 |
|   3 |    INDEX FULL SCAN  | SEARCH_IDX |    10 |    30 |     1 |
|*  4 |    INDEX UNIQUE SCAN| SOURCE_IDX |     1 |     3 |       |
------------------------------------------------------------------

Summary

Using updatable join views to handle deletes can be very efficient but the manual’s statement of the “Delete Rule” is incorrect. It is possible to have several key-preserved tables in the view that you’re using, and if that’s the case you need to play safe and ensure that the table you want to delete from is the first table in the from clause. This means taking steps to eliminate the risk of someone editing some code at a later date without realising the importance of the table order.

Update (very shortly after publication)

Iudith Mentzel has pointed out in comment #1 below that the SQL Language Reference Guide and the DBA Administration Guide are not consistent in their descriptions of deleting from a join view, and that the SQL Language Reference Guide correctly states that the delete will be applied to the first mentioned key-preserved table.

 

 

pgbench retry for repeatable read transactions — first (re)tries

Trying a no-patch solution for pgbench running on repeatable read transactions, using a custom script with PL/pgSQL

In a previous post I was running pgBench on YugaByteDB at the serializable isolation level. But serializable is optimistic and requires that transactions be retried when they fail, and pgBench has no retry mode. There was a patch proposed in several commit fests for that, but patch acceptance is a long journey in PostgreSQL:

WIP: Pgbench Errors and serialization/deadlock retries

Rather than patching pgbench.c I’m trying to implement retry logic with the preferred procedural code: PL/pgSQL. This is not easy because there are many limits on transaction management in procedures. So let’s start simple with the Repeatable Read isolation level, trying to get most of the threads to survive to the end of the run.

pgBench simple-update builtin

I create a database for my demo with Repeatable Read default isolation level, and initialize the pgBench schema:

## restart the server
pg_ctl -l $PGDATA/logfile restart
## create the database and the procedure
psql -e <<'PSQL'
-- re-create the demo database and set serializable as default
drop database if exists franck;
create database franck;
alter database franck set default_transaction_isolation='repeatable read';
PSQL
pgbench --initialize --init-steps=dtgvpf franck


Then I run the built-in simple update, from 10 threads:

pgbench --builtin=simple-update --time 30 --jobs=10 --client=10 franck

Here is the disappointing result:

My clients fail quickly as soon as they encounter a serialization error, and the run finishes with only a few threads still working. Those errors are normal in an MVCC database: maximum concurrency is achieved with optimistic locking, and it is up to the application to retry when it encounters a serialization error.

But pgBench has no option for that. To work around this I create a procedure that does the same job as the simple-update builtin, so that I have a procedural language (PL/pgSQL) in which to code those retries.

pgBench simple-update script

Here is what is actually hardcoded for the simple-update builtin:

cat > /tmp/simple-update.sql <<'CAT'
-- simple-update
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta
WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
CAT

I can run it with:

pgbench --file=/tmp/simple-update.sql --time 30 --jobs=10 --client=10 franck

Of course, the result is the same as with the built-in: threads die as soon as they encounter a serialization error.

pgBench simple-update anonymous block

My first idea was to run an anonymous block in order to add some procedural code for retries:

cat > /tmp/simple-update.sql <<'CAT'
-- simple-update
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
DO
$$
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta
WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
$$
CAT

but this is not possible because bind variables are not possible in anonymous blocks:

ERROR: bind message supplies 7 parameters, but prepared statement “” requires 0

pgBench simple-update procedure

Then I create a stored procedure for that procedural code.

Here is the call:

cat > /tmp/simple-update.sql <<'CAT'
-- simple-update
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
call SIMPLE_UPDATE_RETRY(:aid, :bid, :tid, :delta);
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
CAT

I keep the SELECT in the client as I cannot return a resultset from a procedure.

The UPDATE and INSERT are run into the procedure:

psql -e <<'PSQL'
\connect franck
create or replace procedure SIMPLE_UPDATE_RETRY
(p_aid integer, p_bid integer, p_tid integer, p_delta integer) AS $$
BEGIN
UPDATE pgbench_accounts SET abalance = abalance + p_delta
WHERE aid = p_aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
END;
$$ language plpgsql;
PSQL

Of course, I’ll get the same error as there’s no retry logic yet:

pgbench --file=/tmp/simple-update.sql --time 30 --jobs=10 --client=10 franck

ERROR: could not serialize access due to concurrent update

pgBench simple-update procedure with retry

Now that I have a procedure I can add the retry logic:

  • a loop to be able to retry
  • in the loop, a block with exception
  • when no error is encountered, exit the loop
  • when error is encountered, rollback and continue the loop
  • in case there are too many retries, finally raise the error
psql -e <<'PSQL'
\connect franck
create or replace procedure SIMPLE_UPDATE_RETRY
(p_aid integer, p_bid integer, p_tid integer, p_delta integer) AS $$
declare
retries integer:=0;
BEGIN
loop
begin
UPDATE pgbench_accounts SET abalance = abalance + p_delta
WHERE aid = p_aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
-- exits the loop as we got no error after all DML
exit;
exception
when serialization_failure then
-- need to rollback to run in a new snapshot
rollback;
-- give up after 10 retries
if retries >10 then
raise notice 'Give Up after % retries. tid=%',retries,p_tid;
raise;
end if;
-- continue the retry loop
end;
retries=retries+1;
end loop;
commit;
if retries > 2 then
raise notice 'Required % retries (tid=%)',retries,p_tid;
end if;
END;
$$ language plpgsql;
PSQL

Note that the “rollback” is required. The exception block does a rollback to savepoint, but then it continues to run in the same snapshot. Without starting another transaction the same serialization error will be encountered.

Here is one run without any serialization error:

However, with less luck I can encounter this:

The serialization errors now occur on COMMIT. That’s another big surprise when coming from Oracle. Oracle has weaker isolation levels, but the goal is to avoid errors at commit time, which is critical for OLTP and even more so with distributed transactions. With PostgreSQL, the commit itself can raise an error. Unfortunately, my PL/pgSQL procedure cannot prevent that because I cannot commit within an exception block. That’s another limitation of transaction control in procedures.

protocol=prepared

I used the default “simple” protocol here. Obviously, I want to use “prepared” because I don’t want to parse on each execution: my benchmark goal is to measure execution rate, not parsing. But running what I have here (a call to a procedure with a rollback in the exception handler) crashes with the prepared statement protocol:

And with PostgreSQL it seems that when one server process crashes, the others fear memory corruption and prefer to abort. I even got some cases where instance recovery was required. That’s annoying and not easy to troubleshoot. The core dump says segmentation fault on pquery.c:1241 when reading pstmt->utilityStmt.
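
For reference, the crashing run was simply the earlier command line with the protocol option added, something like:

pgbench --protocol=prepared --file=/tmp/simple-update.sql --time 30 --jobs=10 --client=10 franck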

In summary… small solutions and big questions

The retry logic is mandatory when benchmarking on repeatable read or serializable isolation level. Pgbench is there to do benchmarks for PostgreSQL flavors. I want to also use it to benchmark other databases with postgres-compatible APIs, like CockroachDB or YugaByteDB. Those databases require a serializable isolation level, at least for some features. But I faced many limitations:

  • I cannot use anonymous blocks with bind variables ➛ stored procedure
  • I cannot return the select result from a stored procedure (and cannot do transaction management in functions) ➛ SELECT is done out of the transaction
  • cannot commit in exception blocks ➛ remains the possibility of non-retried error

If you have any remark about this, please tell me (Twitter: @FranckPachot). Of course I’ll investigate further on the core dump and other possibilities. In my opinion, those retries must be done in the server and not in a client loop. It makes no sense to add additional round-trips and context switches for this. Pgbench is there to do TPC-B-like benchmarks where each transaction should be at most one call to the database.

DDL invalidates your SQL, right?

I stumbled upon this post by optimizer guru Nigel Bayliss last week, so please have a read of that first before proceeding. But I wanted to show a simple demo of how management of cursors continues to improve with each version of the database.

Throughout the many years and versions of using Oracle, a common mantra has been: if you perform DDL on a table, then any SQL cursors that reference that table will become invalidated and re-parsed on next execution. That makes a good deal of sense because if you (say) drop a column from a table, then a “SELECT * FROM MY_TABLE WHERE ID = 123” is a significantly different proposition than it was before the drop of the column. The asterisk (‘*’) now resolves to something different, and the SQL might not even be valid anymore if the column that was dropped was the ID column. A re-parse is mandatory.

But not all DDLs are equal, or perhaps a better phrasing would be “not all DDLs are as severe as others”, and since parsing is an expensive operation, any time we can avoid having to do it is a good thing. Some DDLs are so “innocuous” that we know a re-parse is not required. Here is an example of that in action in 18c.

I’ll create a table, issue a simple query on it, and check its details in V$SQL. I’m using the cool “set feedback on sql_id” facility in SQL*Plus 18c to immediately get the SQL_ID.


SQL> create table t as select * from all_objects;

Table created.

SQL> set feedback on sql_id
SQL> select count(*) from t;

  COUNT(*)
----------
     76921

1 row selected.

SQL_ID: cyzznbykb509s

SQL> select
  2       sql_text
  3      ,sql_fulltext
  4      ,sql_id
  5      ,executions
  6      ,parse_calls
  7      ,invalidations
  8      ,ddl_no_invalidate
  9      ,is_rolling_invalid
 10      ,is_rolling_refresh_invalid
 11   from v$sql where sql_id = 'cyzznbykb509s'
 12  @pr

==============================
SQL_TEXT                      : select count(*) from t
SQL_FULLTEXT                  : select count(*) from t
SQL_ID                        : cyzznbykb509s
EXECUTIONS                    : 1
PARSE_CALLS                   : 1
INVALIDATIONS                 : 0
DDL_NO_INVALIDATE             : N
IS_ROLLING_INVALID            : N
IS_ROLLING_REFRESH_INVALID    : N

PL/SQL procedure successfully completed.

Now I’ll perform a DDL on the table that does not change the structure, the data or the security privileges on the table


SQL> alter table t read only;

Table altered.

In versions of yesteryear, this would invalidate any such cursors even though nothing about the table structure has changed. But with 18c, the DDL_NO_INVALIDATE column tells us that even though a DDL was performed, we did not need to invalidate the cursor.


SQL> select
  2       sql_text
  3      ,sql_fulltext
  4      ,sql_id
  5      ,executions
  6      ,parse_calls
  7      ,invalidations
  8      ,ddl_no_invalidate
  9      ,is_rolling_invalid
 10      ,is_rolling_refresh_invalid
 11   from v$sql where sql_id = 'cyzznbykb509s'
 12  @pr
==============================
SQL_TEXT                      : select count(*) from t
SQL_FULLTEXT                  : select count(*) from t
SQL_ID                        : cyzznbykb509s
EXECUTIONS                    : 1
PARSE_CALLS                   : 1
INVALIDATIONS                 : 0
DDL_NO_INVALIDATE             : Y
IS_ROLLING_INVALID            : N
IS_ROLLING_REFRESH_INVALID    : N

PL/SQL procedure successfully completed.

SQL>

Note that this column in V$SQL is also present in most versions of 12c, but does not appear to be populated reliably until you get to 18c.