Oakies Blog Aggregator

Why you cannot use #Oracle’s SQL Developer to connect to #Exasol

Many of our customers are using Oracle together with SQL Developer, so this question comes up regularly: Can we also use SQL Developer for Exasol?

The short answer is: unfortunately not.

I tried to make that work myself, with no success. Then I found this on Stack Overflow:

Jeff Smith: “No, that’s not supported. SQL Developer’s 3rd party JDBC connectivity is provided for one use case – migrations to Oracle Database.
There’s no support on that for Exasol DB, so there’s no connectivity support provided.
If you want a generic jdbc db client, that’s not Oracle SQL Developer.” [Highlighted by me]


Jeff Smith is not just someone from the internet: besides having a high reputation for being helpful in public forums, he’s also Oracle’s Product Manager for SQL Developer.

So that means SQL Developer doesn’t connect to Exasol because it’s not supposed to. The reason behind it is not so much technical as “political”.

We at Exasol can’t do anything about it. As an Oracle customer who wants this to work, you could request it from Oracle. But don’t hold your breath until they allow it.

Can’t Unnest

In an echo of a very old “conditional SQL” posting, a recent posting on the ODC general database discussion forum ran into a few classic errors of trouble-shooting. By a lucky coincidence this allowed me to rediscover and publish an old example of parallel execution gone wild before moving on to talk about the fundamental problem exhibited in the latest query.

The ODC thread started with a question along the lines of “why isn’t Oracle using the index I hinted”, with the minor variation that it said “When I hint my SQL with an index hint it runs quickly so I’ve created a profile that applies the hint, but the hint doesn’t get used in production.”

The query was a bit messy and, as is often the case with ODC, the formatting wasn’t particularly readable, so I’ve extracted the where clause from the SQL that was used to generate the profile and reformatted it below. See if you can spot the hint clue that tells you why there might be a big problem using this SQL to generate a profile to use in the production environment:


WHERE   
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' 
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' 
AND     MSG.SRCH_4_FLD_VAL = '123456'   
AND     (
            (    'INVOICENUMBER' = 'INVOICENUMBER' 
             AND MSG.MSG_ID IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR)))
            ) 
         OR (    'INVOICENUMBER' = 'SIEBELORDERID' 
             AND MSG.SRCH_3_FLD_VAL IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('') AS TABLE_OF_VARCHAR)))
            )
        ) 
AND     MSG.MSG_ID = TRK.INV_NUM(+) 
AND     (   TRK.RESEND_DT IS NULL 
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)   
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1   
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )

If the SQL by itself doesn’t give you an important clue, compare it with the Predicate Information from the “good” execution plan that it produced:


Predicate Information (identified by operation id):  
---------------------------------------------------  
   2 - filter(("TRK"."RESEND_DT" IS NULL OR "TRK"."RESEND_DT"=))  
   8 - filter(("MSG"."SRCH_4_FLD_VAL"='123456' AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS'))  
   9 - access("MSG"."MSG_ID"="COLUMN_VALUE" AND "MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
       filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
  10 - access("MSG"."MSG_ID"="TRK"."INV_NUM")  
  13 - access("TRK1"."INV_NUM"=:B1)  

Have you spotted the thing that isn’t there in the predicate information?

What happened to the ‘INVOICENUMBER’ = ‘INVOICENUMBER’ predicate and the ‘INVOICENUMBER’ = ‘SIEBELORDERID’ predicate? They’ve disappeared because the optimizer knows that the first predicate is always true and doesn’t need to be tested at run-time and the second one is always false and doesn’t need to be tested at run-time. Moreover both predicates are part of a conjunct (AND) – so in the second case the entire two-part predicate can be eliminated; so the original where clause can immediately be reduced to:


WHERE   
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE' 
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS' 
AND     MSG.SRCH_4_FLD_VAL = '123456'   
AND     (
                 MSG.MSG_ID IN (
                        SELECT  *   
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST('123456') AS TABLE_OF_VARCHAR)))
        ) 
AND     MSG.MSG_ID = TRK.INV_NUM(+) 
AND     (   TRK.RESEND_DT IS NULL 
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)   
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1   
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )

Looking at this reduced predicate you may note that the IN subquery referencing the fnm_gn_in_string_list() collection could now be unnested and used to drive the final execution plan, and the optimizer will even recognize that it’s a rowsource with at most one row. So here’s the “good” execution plan:


---------------------------------------------------------------------------------------------------------------------------------------------------------------  
| Id  | Operation                               | Name                  | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |  
---------------------------------------------------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                        |                       |      1 |        |      2 |00:00:00.08 |      12 |      7 |       |       |          |  
|   1 |  SORT ORDER BY                          |                       |      1 |      1 |      2 |00:00:00.08 |      12 |      7 |  2048 |  2048 | 2048  (0)|  
|*  2 |   FILTER                                |                       |      1 |        |      2 |00:00:00.08 |      12 |      7 |       |       |          |  
|   3 |    NESTED LOOPS OUTER                   |                       |      1 |      1 |      2 |00:00:00.08 |      10 |      7 |       |       |          |  
|   4 |     NESTED LOOPS                        |                       |      1 |      1 |      2 |00:00:00.06 |       6 |      5 |       |       |          |  
|   5 |      VIEW                               | VW_NSO_1              |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |       |       |          |  
|   6 |       HASH UNIQUE                       |                       |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |  1697K|  1697K|  487K (0)|  
|   7 |        COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |      1 |      1 |00:00:00.01 |       0 |      0 |       |       |          |  
|*  8 |      TABLE ACCESS BY INDEX ROWID        | FNM_VSBL_MSG          |      1 |      1 |      2 |00:00:00.06 |       6 |      5 |       |       |          |  
|*  9 |       INDEX RANGE SCAN                  | XIE2FNM_VSBL_MSG      |      1 |      4 |      4 |00:00:00.04 |       4 |      3 |       |       |          |  
|* 10 |     INDEX RANGE SCAN                    | XPKBCS_INV_RESEND_TRK |      2 |      1 |      2 |00:00:00.01 |       4 |      2 |       |       |          |  
|  11 |    SORT AGGREGATE                       |                       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
|  12 |     FIRST ROW                           |                       |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
|* 13 |      INDEX RANGE SCAN (MIN/MAX)         | XPKBCS_INV_RESEND_TRK |      1 |      1 |      1 |00:00:00.01 |       2 |      0 |       |       |          |  
---------------------------------------------------------------------------------------------------------------------------------------------------------------  

The plan looks great – Oracle predicts a single row driver (operation 5) which can use a very good index (XIE2FNM_VSBL_MSG) in a nested loop, followed by a second nested loop, followed by a filter subquery and a sort of a tiny amount of data. Predictions match actuals all the way down the plan, and the workload is tiny. So what goes wrong in production?

You’ve probably guessed the flaw in this test. Why would anyone include a predicate like ‘INVOICENUMBER’ = ‘INVOICENUMBER’ in production code, or even worse ‘INVOICENUMBER’ = ‘SIEBELORDERID’? The OP has taken a query using bind variables, picked up the actual values that were peeked when the query was executed, and substituted them into the test as literals. This has allowed the optimizer to discard two simple predicates and one subquery, when the production query would need a plan that catered for the possibility that the second subquery would be the one that had to be executed and the first one bypassed. Here’s the corrected where clause using SQL*Plus variables (not the substitution type, the proper type) for the original bind variables:


WHERE
        MSG.MSG_TYP_CD = '210_CUSTOMER_INVOICE'
AND     MSG.MSG_CAPTR_STG_CD = 'PRE_BCS'
AND     MSG.SRCH_4_FLD_VAL = :BindInvoiceTo
AND     (
            (    :BindSearchBy = 'INVOICENUMBER' 
             AND MSG.MSG_ID IN (
                        SELECT  *
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindInvoiceList) AS TABLE_OF_VARCHAR)))
            )
         OR (    :BindSearchBy = 'SIEBELORDERID' 
             AND MSG.SRCH_3_FLD_VAL IN (
                        SELECT  *
                        FROM    TABLE(CAST(FNM_GN_IN_STRING_LIST(:BindSeibelIDList) AS TABLE_OF_VARCHAR)))
            )
        )
AND     MSG.MSG_ID = TRK.INV_NUM(+)
AND     (   TRK.RESEND_DT IS NULL
         OR TRK.RESEND_DT = (
                        SELECT  MAX(TRK1.RESEND_DT)
                        FROM    FNM.BCS_INV_RESEND_TRK TRK1
                        WHERE   TRK1.INV_NUM = TRK.INV_NUM
                )
        )

And this, with the “once good” hint in place to force the use of the XIE2FNM_VSBL_MSG index, is the resulting execution plan:


---------------------------------------------------------------------------------------------------------  
| Id  | Operation                           | Name                  | E-Rows |  OMem |  1Mem | Used-Mem |  
---------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT                    |                       |        |       |       |          |  
|   1 |  SORT ORDER BY                      |                       |      1 | 73728 | 73728 |          |  
|*  2 |   FILTER                            |                       |        |       |       |          |  
|   3 |    NESTED LOOPS OUTER               |                       |      1 |       |       |          |  
|*  4 |     TABLE ACCESS BY INDEX ROWID     | FNM_VSBL_MSG          |      1 |       |       |          |  
|*  5 |      INDEX FULL SCAN                | XIE2FNM_VSBL_MSG      |   4975K|       |       |          |  
|*  6 |     INDEX RANGE SCAN                | XPKBCS_INV_RESEND_TRK |      1 |       |       |          |  
|*  7 |    COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |       |       |          |  
|*  8 |    COLLECTION ITERATOR PICKLER FETCH| FNM_GN_IN_STRING_LIST |      1 |       |       |          |  
|   9 |    SORT AGGREGATE                   |                       |      1 |       |       |          |  
|  10 |     FIRST ROW                       |                       |      1 |       |       |          |  
|* 11 |      INDEX RANGE SCAN (MIN/MAX)     | XPKBCS_INV_RESEND_TRK |      1 |       |       |          |  
---------------------------------------------------------------------------------------------------------  
 
Predicate Information (identified by operation id):  
---------------------------------------------------  
   2 - filter((((:BINDSEARCHBY='INVOICENUMBER' AND  IS NOT NULL) OR  
              (:BINDSEARCHBY='SIEBELORDERID' AND  IS NOT NULL)) AND ("TRK"."RESEND_DT" IS NULL OR  
              "TRK"."RESEND_DT"=)))  
   4 - filter(("MSG"."SRCH_4_FLD_VAL"=:BINDINVOICETO AND "MSG"."MSG_CAPTR_STG_CD"='PRE_BCS'))  
   5 - access("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
       filter("MSG"."MSG_TYP_CD"='210_CUSTOMER_INVOICE')  
   6 - access("MSG"."MSG_ID"="TRK"."INV_NUM")  
   7 - filter(VALUE(KOKBF$)=:B1)  
   8 - filter(VALUE(KOKBF$)=:B1)  
  11 - access("TRK1"."INV_NUM"=:B1)  

The “unnested driving subquery” approach can no longer be used – we now start with the fnm_vsbl_msg table (accessing it using a most inefficient execution path because that’s what the hint does for us, and we can obey the hint), and for each row check which of the two subqueries we need to execute. There is, in fact, no way we can hint this query to operate efficiently [at least, that’s my opinion, I may be wrong].

The story so far

If you’re going to try to use SQL*Plus (or similar) to test a production query with bind variables you can’t just use a sample of literal values in place of the bind variables (though you may get lucky sometimes, of course); instead you should set up some SQL*Plus variables and assign values to them.
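
A minimal SQL*Plus sketch of that setup, using the bind variable names from the corrected where clause above (the datatypes and lengths are guesses – adjust them to match the real usage):


variable BindSearchBy      varchar2(30)
variable BindInvoiceTo     varchar2(30)
variable BindInvoiceList   varchar2(100)
variable BindSeibelIDList  varchar2(100)

exec :BindSearchBy     := 'INVOICENUMBER'
exec :BindInvoiceTo    := '123456'
exec :BindInvoiceList  := '123456'
exec :BindSeibelIDList := ''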

Though I haven’t said it previously in this article, this is an example where a decision that really should have been made by the front-end code has been embedded in the SQL and passed to the database as SQL which cannot be run efficiently. The front-end code should have been written to recognise the choice between invoice numbers and Siebel order ids and send the appropriate query to the database.

Next Steps

Without making a significant change to the front-end wrapper mechanism, is it possible to change the SQL to something the optimizer can handle efficiently? Sometimes the answer is yes; so I’ve created a simpler model to demonstrate the basic problem and supply a solution for cases like this one. The key issue is finding a way of working around the OR clauses that are trying to allow the optimizer to choose between two subqueries but make it impossible for either to be unnested into a small driving data set.

First, some tables:


rem
rem     Script:         or_in_twice.sql
rem     Author:         Jonathan Lewis
rem     Dated:          June 2019
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,371)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

alter table t1 add constraint t1_pk primary key(id);

create table t2
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,372)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create table t3
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        mod(rownum,373)                 n1,
        lpad(rownum,10,'0')             v1,
        lpad('x',100,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;


Now a query – first setting up a variable in SQL*Plus to allow us to emulate a production query with bind variables. Since I’m only going to use Explain Plan the variable won’t be peekable, so there would still be some scope for this plan not matching a production plan, but it’s adequate to demonstrate the structural problem:


variable v1 varchar2(10)
exec :v1 := 'INVOICE'

explain plan for
select
        t1.v1 
from
        t1
where
        (
            :v1 = 'INVOICE' 
        and t1.id in (select id from t2 where n1 = 0)
        )
or      (
            :v1 = 'ORDERID' 
        and t1.id in (select id from t3 where n1 = 0)
        )
;

select * from table(dbms_xplan.display);

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |    10 |   150 |    26   (4)| 00:00:01 |
|*  1 |  FILTER            |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   146K|    26   (4)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |     8 |    26   (4)| 00:00:01 |
|*  4 |   TABLE ACCESS FULL| T3   |     1 |     8 |    26   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter(:V1='INVOICE' AND  EXISTS (SELECT 0 FROM "T2" "T2" WHERE
              "ID"=:B1 AND "N1"=0) OR :V1='ORDERID' AND  EXISTS (SELECT 0 FROM "T3"
              "T3" WHERE "ID"=:B2 AND "N1"=0))
   3 - filter("ID"=:B1 AND "N1"=0)
   4 - filter("ID"=:B1 AND "N1"=0)

As you can see, thanks to the OR that effectively gives Oracle the choice between running the subquery against t3 or the one against t2, Oracle is unable to do any unnesting. (In fact different versions of Oracle allow different levels of sophistication with disjuncts (OR) of subqueries, so this is the kind of example that’s always useful to keep for tests against future versions.)

Since we know that we are going to use one of the data sets supplied in one of the subqueries and have no risk of double-counting or eliminating required duplicates, one strategy we could adopt for this query is to rewrite the two subqueries as a single subquery with a union all – because we know the optimizer can usually handle a single IN subquery very nicely. So let’s try the following:


explain plan for
select
        t1.v1
from
        t1
where
        t1.id in (
                select  id 
                from    t2 
                where   n1 = 0
                and     :v1 = 'INVOICE'
                union all
                select  id 
                from    t3 
                where   n1 = 0
                and     :v1 = 'ORDERID'
        )
;

select * from table(dbms_xplan.display);

-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |    54 |  1512 |    77   (3)| 00:00:01 |
|*  1 |  HASH JOIN             |          |    54 |  1512 |    77   (3)| 00:00:01 |
|   2 |   VIEW                 | VW_NSO_1 |    54 |   702 |    51   (2)| 00:00:01 |
|   3 |    HASH UNIQUE         |          |    54 |   432 |    51   (2)| 00:00:01 |
|   4 |     UNION-ALL          |          |       |       |            |          |
|*  5 |      FILTER            |          |       |       |            |          |
|*  6 |       TABLE ACCESS FULL| T2       |    27 |   216 |    26   (4)| 00:00:01 |
|*  7 |      FILTER            |          |       |       |            |          |
|*  8 |       TABLE ACCESS FULL| T3       |    27 |   216 |    26   (4)| 00:00:01 |
|   9 |   TABLE ACCESS FULL    | T1       | 10000 |   146K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="ID")
   5 - filter(:V1='INVOICE')
   6 - filter("N1"=0)
   7 - filter(:V1='ORDERID')
   8 - filter("N1"=0)


Thanks to the FILTERs at operations 5 and 7 this plan will pick the data from just one of the two subqueries, reduce it to a unique list and then use that as the build table to a hash join. Of course, with different data (or suitable hints) that hash join could become a nested loop using a high precision index.

But there’s an alternative. We manually rewrote the two subqueries as a single union all subquery and as we did so we moved the bind variable comparisons inside their respective subqueries; maybe we don’t need to introduce the union all. What would happen if we simply take the original query and move the “constant” predicates inside their subqueries?


explain plan for
select
        t1.v1
from
        t1
where
        t1.id in (select id from t2 where n1 = 0 and :v1 = 'INVOICE')
or      t1.id in (select id from t3 where n1 = 0 and :v1 = 'ORDERID')
;

select * from table(dbms_xplan.display);


-----------------------------------------------------------------------------------
| Id  | Operation              | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |          |    54 |  1512 |    77   (3)| 00:00:01 |
|*  1 |  HASH JOIN             |          |    54 |  1512 |    77   (3)| 00:00:01 |
|   2 |   VIEW                 | VW_NSO_1 |    54 |   702 |    51   (2)| 00:00:01 |
|   3 |    HASH UNIQUE         |          |    54 |   432 |    51   (2)| 00:00:01 |
|   4 |     UNION-ALL          |          |       |       |            |          |
|*  5 |      FILTER            |          |       |       |            |          |
|*  6 |       TABLE ACCESS FULL| T3       |    27 |   216 |    26   (4)| 00:00:01 |
|*  7 |      FILTER            |          |       |       |            |          |
|*  8 |       TABLE ACCESS FULL| T2       |    27 |   216 |    26   (4)| 00:00:01 |
|   9 |   TABLE ACCESS FULL    | T1       | 10000 |   146K|    26   (4)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="ID")
   5 - filter(:V1='ORDERID')
   6 - filter("N1"=0)
   7 - filter(:V1='INVOICE')
   8 - filter("N1"=0)

In 12.2.0.1 and 18.3.0.0 it gets the same plan as we did with our “single subquery” rewrite – the optimizer is able to construct the union all single subquery (although the ordering of the subqueries has been reversed) and unnest without any other manual intervention. (You may find that earlier versions of Oracle don’t manage to do this, but you might have to go all the way back to 10g.)

Conclusion

Oracle doesn’t like disjuncts (OR) and finds conjuncts (AND) much easier to cope with. Mixing OR and subqueries is a good way to create inefficient execution plans, especially when you try to force the optimizer to handle a decision that should have been taken in the front-end code. The optimizer, however, gets increasingly skilled at handling the mixture as you move through the newer versions; but you may have to find ways to give it a little help if you see it running subqueries as filter subqueries when you’re expecting it to unnest a subquery to produce a small driving data set.

 

A Jupyter notebook on Google Colab to connect to the Oracle Cloud ATP

In a previous post I tested a Jupyter notebook on my laptop to connect to a PostgreSQL database. In this post, there are more ideas:

  • I’m connecting to an Oracle Database, downloading the required client software from the Oracle YUM repository
  • I connect to an Oracle Cloud database accessible from Internet with the Client Credential wallet
  • The notebook can be run on Google Colab (providing the credentials to your database)

The idea: maybe one day I’ll use this to give a presentation where I can show the notebook on the screen while attendees are able to run it on their own laptops. That’s just an idea for the moment, which I got when attending a workshop/presentation on AI given by the EPFL Extension School. All the demos were also on mybinder.org, Google Colab, and CodePen so that we could even try some variations on our laptops while attending the presentation.


Back to the topic and… no need for more text here as everything is on the notebook:

Google Colaboratory

I’ve also exported it as a GitHub Gist so that Medium can display it automatically.

JSON_TABLE() and date/time columns in Oracle 19c

While researching the use of JSON in Oracle 19c I came across some interesting behaviour that wasn’t immediately obvious (to me). With this post I am hoping to save you a couple of minutes scratching your head when working with JSON_TABLE(). This is Oracle 19.3.0 on Linux and I’m connecting to it using SQLcl 19.1.

Some background

As part of my JSON-support-in-Oracle research I had a good look at JSON_TABLE. Although complex at first sight, it is a lot less intimidating if you know how to use XMLTABLE :) My goal for this post is to convert a JSON document to a relational structure.

To convert JSON to (relational) rows and columns, I believe I need to use the JSON_value_column from the list of available options in JSON_TABLE‘s JSON_column_definition. Interestingly, and different from the syntax diagram for XMLTABLE(), I’m allowed to specify a JSON_value_return_type. There are quite a few of them, and I guess the most commonly used ones are going to be varchar2, date and timestamp. This post is about the latter two, since parsing out a text string is pretty simple.

The Test Case

I created a very simple table to test the functionality:

SQL> create table t1 (
   2   id number not null,
   3   json_txt clob not null,
   4   constraint pk_t1 primary key (id),
   5   constraint c_is_json check (json_txt is json)
   6  );

 Table created.

Essentially a table to hold a JSON column plus a check constraint to make sure that I’m not inserting data that isn’t JSON. Here’s the initial JSON document I started with:

SQL> select json_query(json_txt, '$' pretty error on error) 
  2   from t1 where id = 1;

JSON_QUERY(JSON_TXT,'$'PRETTYERRORONERROR)                  
------------------------------------------------------------
{                                                           
  "results" :                                               
  [                                                         
    {                                                       
      "resultID" : 1,                                       
      "resultDate" : "13-JUN-19 08:10:00",                  
      "details" : "none"                                    
    }                                                       
  ]                                                         
}                 

Thanks to my check constraint I know this is valid JSON. I specified the pretty keyword to make the output more readable. When developing using new(-ish) features I prefer Oracle to raise an error so I’m going with ERROR ON ERROR.

These are my NLS settings, by the way; this is a system installed with the default locale set to en_US. I have added these settings for reference and they will play a role later on.

SQL> select * from nls_session_parameters
  2* where regexp_like(parameter,'(DATE|TIME)');
PARAMETER                      VALUE                         
------------------------------ ------------------------------
NLS_DATE_FORMAT                DD-MON-RR                     
NLS_DATE_LANGUAGE              AMERICAN                      
NLS_TIME_FORMAT                HH24.MI.SSXFF                 
NLS_TIMESTAMP_FORMAT           DD-MON-RR HH24.MI.SSXFF       
NLS_TIME_TZ_FORMAT             HH24.MI.SSXFF TZR             
NLS_TIMESTAMP_TZ_FORMAT        DD-MON-RR HH24.MI.SSXFF TZR   

6 rows selected. 

I am now ready to start testing.

First Test

So let’s get started with the testing. My query is shown here:

SELECT
    jt.*
FROM
    t1,
    JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
            COLUMNS (
                resID   NUMBER   PATH '$.resultID',
                resDATE DATE     PATH '$.resultDate'
            )
        )
    AS jt
WHERE
    t1.id = 1;

Not too dissimilar to using XMLTABLE, I need to pass the column containing my JSON document to the function, followed by the path to the data within the JSON document I want to parse. I could have stored multiple results in the array (and I did initially, but considered that a little too confusing), hence the reference to all of them.

The columns() clause (of type JSON_value_column) allows me to define columns, their respective data type and the path relative to the second argument to JSON_TABLE. In my case that’s $.results.resultID and $.results.resultDate. These two are mapped to resID and resDATE in the output. I am limiting the query to my first row in t1.

When I execute this query against the first record in the table, I get an error:

SQL> SELECT
  2      jt.*
  3  FROM
  4      t1,
  5      JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
  6              COLUMNS (
  7                  resID   NUMBER   PATH '$.resultID',
  8                  resDATE DATE     PATH '$.resultDate'
  9              )
 10          )
 11      AS jt
 12  WHERE
 13      t1.id = 1;

Error starting at line : 1 in command -
SELECT
    jt.*
FROM
    t1,
    JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
            COLUMNS (
                resid   NUMBER   PATH '$.resultID',
                resdate DATE     PATH '$.resultDate'
            )
        )
    AS jt
WHERE
    t1.id = 1
Error report -
ORA-01861: literal does not match format string

This is kind of surprising; my suspicion at the time was that something was wrong with the NLS_DATE_FORMAT settings. But whatever I tried, I always got the same error. Searching MOS and other resources on the net turned up blank, which might well be down to my inadequate search efforts. Tracing didn’t really help either.

Second Test

I tried a few other things until, with the help of an educated guess, I got lucky: I removed the time portion from the resultDate and formatted it to match the output you get from selecting sysdate from dual, as shown here:

SQL> select json_query(json_txt, '$.results' pretty error on error) 
  2  from t1 where id = 2; 

JSON_QUERY(JSON_TXT,'$.RESULTS'PRETTYERRORONERROR)          
------------------------------------------------------------
[                                                           
  {                                                         
    "resultID" : 2,                                         
    "resultDate" : "13-JUN-19",                             
    "details" : "none"                                      
  }                                                         
]                                                           

This time around, the error message was different:

SQL> SELECT
  2      jt.*
  3  FROM
  4      t1,
  5      JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
  6              COLUMNS (
  7                  resID   NUMBER    PATH '$.resultID',
  8                  resDATE DATE      PATH '$.resultDate'
  9              )
 10          )
 11      AS jt
 12  WHERE
 13      t1.id = 2;

Error starting at line : 1 in command -
SELECT
    jt.*
FROM
    t1,
    JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
            COLUMNS (
                resID   NUMBER    PATH  '$.resultID',
                resDATE DATE      PATH '$.resultDate'
            )
        )
    AS jt
WHERE
    t1.id = 2
Error report -
ORA-40688: date or time stamp string value does not conform
to ISO 8601 format

Aha! That’s much better than ORA-01861: literal does not match format string. So it would appear as if the date the system expected to find in the JSON document must be ISO 8601 compliant. Ummm, what exactly is an ISO 8601 date format? The Internet knew!
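
If you want to see what such a value looks like, you can generate one from the database itself; here is a quick sketch (this is just one of several valid ISO 8601 renderings, and the format mask is my own choice, not something the JSON functions require you to supply):


select  to_char(systimestamp, 'YYYY-MM-DD"T"HH24:MI:SSTZH:TZM') as iso8601_example
from    dual;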

Third Test

So maybe using an ISO 8601 time format in the JSON document is the answer? Worth a try, the JSON document now looks like this:

SQL> select json_query(json_txt, '$.results' pretty error on error) 
  2  from t1 where id = 3;

JSON_QUERY(JSON_TXT,'$.RESULTS'PRETTYERRORONERROR)          
------------------------------------------------------------
[                                                           
  {                                                         
    "resultID" : 3,                                         
    "resultDate" : "2019-06-13T16:39:00+00:00",             
    "details" : "none"                                      
  }                                                         
]     

It seems that ISO 8601 requires a slightly more complex time expression, much more like a timestamp in Oracle. Let’s see if this works:

SQL> SELECT
  2      jt.*
  3  FROM
  4      t1,
  5      JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
  6              COLUMNS (
  7                  resID   NUMBER    PATH  '$.resultID',
  8                  resDATE DATE      PATH '$.resultDate'
  9              )
 10          )
 11      AS jt
 12  WHERE
 13      t1.id = 3;

     RESID RESDATE  
---------- ---------
         3 13-JUN-19

Hurray, no errors, but result data for a change!

Caveat

But wait, that’s not the whole story. I have stored a time component in the input document, so how do I get that? I’d normally use to_char() for that purpose, so let’s give that a go:

SQL> SELECT
  2      to_char(jt.resDATE, 'dd.mm.yyyy hh24:mi:ss') as formattedDate
  3  FROM
  4      t1,
  5      JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
  6              COLUMNS (
  7                  resID   NUMBER    PATH  '$.resultID',
  8                  resDATE DATE      PATH '$.resultDate'
  9              )
 10          )
 11      AS jt
 12  WHERE
 13      t1.id = 3;

FORMATTEDDATE      
-------------------
13.06.2019 00:00:00

Hmmm, that’s midnight, and not 16:39:00+00:00 …. Unless I have missed something important it seems as if the DATE column type in the columns() clause stripped the input of its time component. Not to worry though, there’s a timestamp type. Maybe that can help?

SQL> SELECT
  2      jt.*
  3  FROM
  4      t1,
  5      JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
  6              COLUMNS (
  7                  resID   NUMBER    PATH '$.resultID',
  8                  resDATE TIMESTAMP PATH '$.resultDate'
  9              )
 10          )
 11      AS jt
 12  WHERE
 13      t1.id = 3;

     RESID RESDATE                     
---------- ----------------------------
         3 13-JUN-19 16.39.00.000000000

Voila! I’ve got a date and time.

Summary

From my very limited testing it appears as if the input date in the JSON document has to be provided in ISO 8601 format, otherwise Oracle raises an error. Unlike in SQL*Loader’s control file there appears to be no option to tell Oracle about the formatting. Changing the NLS_DATE_FORMAT (not shown here) affected only the output of the query; it didn’t appear to be applied when parsing the input.
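
If you can’t change the producer of the JSON document, one possible workaround (not part of the tests above, just a sketch – the resDATE_txt column name is mine) is to extract the value as a plain string and convert it explicitly with a format mask that matches the stored text, for example for the first document with its “13-JUN-19 08:10:00” value:


SELECT
    jt.resID,
    to_date(jt.resDATE_txt, 'DD-MON-RR HH24:MI:SS') as resDATE
FROM
    t1,
    JSON_TABLE ( t1.json_txt, '$.results[*]' ERROR ON ERROR
            COLUMNS (
                resID       NUMBER        PATH '$.resultID',
                resDATE_txt VARCHAR2(30)  PATH '$.resultDate'
            )
        )
    AS jt
WHERE
    t1.id = 1;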

Free online courses to learn about #Exasol

Why should you bother? Because Exasol is the fastest analytical database in the world, outperforming any competitor. Therefore, expertise in Exasol might soon be very valuable in your company too.
Free training helps us to spread the knowledge in a scalable way, empowering customers across the globe to get the best out of Exasol and supporting our rapid growth.

You can register here. The free online courses are branded as “Exacademy”:


The Database Administration course there is my baby.

Trouble-shooting

Here’s an answer I’ve just offered on the ODC database forum to a fairly generic type of problem.

The question was about finding out why a “program” that used to take only 10 minutes to complete is currently taking significantly longer. The system is running Standard Edition, and the program runs once per day. There’s some emphasis on the desirability of taking action while the program is still running with the following as the most recent statement of the requirements:

We have a program which run daily 10minutes and suddenly one day,it is running for more than 10minutes…in this case,we are asked to look into the backend session to check what exactly the session is doing.I understand we have to check the events,last sql etc..but we need to get the work done by that session in terms of buffergets or physical reads(in case of standard edition)

1)In this case,we cannot enable trace to the session as it already started running and 10 minutes passed way.If we enable trace at this point,will it gives the required info?

2)To check the statistics of this specific session,what is the best way to proceed and this is my initial question.One of my friend pointed out to check v$sess_io and he is not sure whether it will return cumulative values because this view has only sid and not having serial#..and sid is not unique

So, answering the questions as stated, with a little bit of padding:

1) In this case, we cannot enable trace to the session as it already started running and 10 minutes passed way.If we enable trace at this point,will it gives the required info?

If the session has started and has been running for 10 minutes it’s still possible to force tracing into the session and, depending on what the program does, you may be lucky enough to get enough information in the trace/tkprof file to help you. The “most-approved” way of doing this for a session is probably through a call to dbms_monitor.session_trace_enable(), but if that’s a package that Standard Edition is not licensed to use then there’s dbms_system.set_sql_trace_in_session().
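
A sketch of the two calls (the sid and serial# values are placeholders you would look up in v$session first):


-- preferred interface
execute dbms_monitor.session_trace_enable(session_id => 123, serial_num => 45678, waits => true, binds => false)

-- older alternative (positional parameters: sid, serial#, sql_trace)
execute dbms_system.set_sql_trace_in_session(123, 45678, true)

-- and, when you are done, the matching "disable" calls
execute dbms_monitor.session_trace_disable(session_id => 123, serial_num => 45678)
execute dbms_system.set_sql_trace_in_session(123, 45678, false)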

If this doesn’t help, and if the session is still live and running, you could also look at v$open_cursor for that SID to see what SQL statements (sql_id, child_address, last_active_time and first 60 characters of the text) are still open for that session, then query v$sql for more details about those statements (using sql_id and child_address). The stats you find in those statements are accumulated across all executions by all sessions from the moment the cursor went into the library cache, but if this is a program that runs once per day then it’s possible that the critical statement will only be run by that one session, and the cursor from the previous day will have aged out of the library cache so that what you see is just today’s run.
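
A sketch of the two lookups (again, 123 is a placeholder SID):


select  sql_id, sql_text
from    v$open_cursor
where   sid = 123;

select  sql_id, child_number, executions, buffer_gets, disk_reads,
        round(elapsed_time/1e6,2)  elapsed_seconds
from    v$sql
where   sql_id in (
                select  sql_id
                from    v$open_cursor
                where   sid = 123
        )
order by
        elapsed_time desc;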

Since you’re on Standard Edition and don’t have access to the AWR you should have installed Statspack – which gives you nearly everything that you can get from the AWR reports (the most significant difference being the absence of the v$active_session_history – but there are open-source emulators that you can install as a fairly good substitute for that functionality). If there is one statement in your program that does a lot of work then it’s possible that it might be one of the top N statements in a Statspack snapshot.
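
Taking and reporting on Statspack snapshots around the program’s run window is straightforward – a sketch, run as the owner of the Statspack schema (usually PERFSTAT):


execute statspack.snap

-- ... wait for the program to run, then ...
execute statspack.snap

-- then generate a report for the interval between the two snapshot ids
@?/rdbms/admin/spreport.sql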

If this program is a known modular piece of code, could you alter the mechanism that calls it to include calls to enable session tracing at the start of the program (and disable it, perhaps, at the end of the program)? This might be by modifying the code directly, or by modifying the wrapper that drives the program, or by adding a logon trigger if there’s a mechanism that would allow Oracle to recognise the specific session that runs this particular program; alternatively, if something sets an identifiable (and unambiguous) module and/or action as part of calling the program, you could use the dbms_monitor package to enable tracing for (e.g.) a particular module and action on a permanent basis.
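
For the module/action approach, a sketch (the module and action names here are placeholders – use whatever the program actually sets):


begin
        dbms_monitor.serv_mod_act_trace_enable(
                service_name => 'SYS$USERS',
                module_name  => 'MY_BATCH_MODULE',
                action_name  => 'DAILY_RUN',
                waits        => true,
                binds        => false
        );
end;
/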

2) To check the statistics of this specific session,what is the best way to proceed and this is my initial question.One of my friend pointed out to check v$sess_io and he is not sure whether it will return cumulative values because this view has only sid and not having serial#..and sid is not unique

An answer: the stats are for one SID and SERIAL#, whether you’re looking at v$sess_io, v$sesstat, v$session_event, v$sess_time_model or any other v$sesXXX views that I can’t recall off-hand. In passing, if you can add a wrapper to the calling code, capturing session activity stats (v$sesstat), wait time (v$session_event) and time model summary (v$sess_time_model) is a very useful debugging aid.
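
A sketch of those three captures for a single session (123 is a placeholder SID):


select  sn.name, st.value
from    v$sesstat st, v$statname sn
where   st.statistic# = sn.statistic#
and     st.sid = 123
and     st.value != 0;

select  event, total_waits, time_waited_micro
from    v$session_event
where   sid = 123;

select  stat_name, value
from    v$sess_time_model
where   sid = 123;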

And an explanation: the “session” array is a fixed size array, and the SID is the array subscript of the entry your session acquired at logon. Since the array is fixed size Oracle has to keep re-using the array entries so each time it re-uses an array entry it increments the serial# so that (sid, serial#) becomes the unique identifier across history[1]. As it acquires the array entry it also resets/re-initializes all the other v$sesXXX arrays/linked list/structures.

The one thing to watch out for when you try to capture any of the session numbers is that you don’t query these v$sesXXX things twice to find a difference unless you also capture the serial# at the same time, so that you can be sure that the first session didn’t log off and a second session log on and reuse the same SID between your two snapshots. (It’s quite possible that this will be obvious even if you haven’t captured the serial#, because you may spot that some of the values that should only increase with time have decreased.)
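
A cheap safeguard is to record the serial# alongside each snapshot, for example:


select  sid, serial#, username, program
from    v$session
where   sid = 123;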

 

Footnote

[1] I think there was a time when restarting an instance would reset the serial# to 1. The serial# has to wrap eventually, and the wrap used to be at 65536 because it was stored as a two-byte number – which doesn’t seem to have changed. Nowadays the serial# seems to be maintained across instance restart (I wonder how that works with carefully timed instance crashes), and given the amount of history that Oracle could maintain in the AWR I suppose there could be some scope for connecting two pieces of history that were really from two different sessions.

 

Parallel Fun – 2

I started writing this note in March 2015 with the following introductory comment:

A little while ago I wrote a few notes about a very resource-intensive parallel query. One of the points I made about it was that it was easy to model, and then interesting to run on later versions of Oracle. So today I’m going to treat you to a few of the observations and notes I made after modelling the problem; and here’s the SQL to create the underlying objects:

Unfortunately I failed to do anything more with the model I had created until a few days ago (June 2019 – in case I stall again) when a related question came up on the ODC database forum. This time I’m ready to go a little further – so I’ll start with a bait-and-switch approach. Here are the first few lines (omitting the SQL) of an SQL Monitor report from an instance of 18.3 – is this a power-crazed machine or what?


Global Information
------------------------------
 Status              :  DONE (ALL ROWS)
 Instance ID         :  1
 Session             :  TEST_USER (169:11324)
 SQL ID              :  73y5quma4jnw4
 SQL Execution ID    :  16777216
 Execution Started   :  06/13/2019 22:06:32
 First Refresh Time  :  06/13/2019 22:06:32
 Last Refresh Time   :  06/13/2019 22:07:03
 Duration            :  31s
 Module/Action       :  MyModule/MyAction
 Service             :  SYS$USERS
 Program             :  sqlplus@linux183.localdomain (TNS V1-V3)
 Fetch Calls         :  591

Global Stats
=========================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Fetch | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) | Calls |  Gets  | Reqs | Bytes |
=========================================================================================
|      14 |    3.18 |     0.00 |        0.05 |       11 |   591 |  25978 |   62 |  13MB |
=========================================================================================

Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================

It didn’t take long to run the query, only about 31 seconds. But the thing to notice in the report is that while the DOP is reported as 3, the number of “Servers Allocated” is a massive 6,730. So the big question – before I show you more of the report, explain what’s happening, and supply the code to build the model – is: how many PX processes did I actually start?
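
One quick way to get an independent answer while the query is running is to look at v$px_process, which lists the PX server processes that currently exist and the sessions using them – a sketch:


select  server_name, status, sid, serial#
from    v$px_process
order by
        server_name;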

Here’s a little more of the output:


Parallel Execution Details (DOP=3 , Servers Allocated=6730)
==========================================================================================================================================================
|      Name      | Type  | Group# | Server# | Elapsed |   Cpu   |    IO    | Concurrency |  Other   | Buffer | Read | Read  |        Wait Events         |
|                |       |        |         | Time(s) | Time(s) | Waits(s) |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes |         (sample #)         |
==========================================================================================================================================================
| PX Coordinator | QC    |        |         |      14 |    3.13 |          |        0.05 |       11 |  23727 |      |     . | PX Deq: Join ACK (5)       |
|                |       |        |         |         |         |          |             |          |        |      |       | PX Deq: Signal ACK EXT (2) |
|                |       |        |         |         |         |          |             |          |        |      |       | sql_id: 6405a2hc50bt4 (1)  |
| p004           | Set 1 |      1 |       1 |    0.00 |    0.00 |          |             |          |    180 |      |     . | library cache: mutex X (1) |
|                |       |        |         |         |         |          |             |          |        |      |       |                            |
| p005           | Set 1 |      1 |       2 |    0.00 |    0.00 |          |             |          |    100 |      |     . |                            |
| p006           | Set 1 |      1 |       3 |    0.00 |    0.00 |          |             |          |     90 |      |     . |                            |
| p000           | Set 1 |      2 |       1 |    0.01 |    0.01 |          |             |          |        |      |     . |                            |
| p001           | Set 1 |      2 |       2 |    0.02 |    0.02 |          |             |          |        |      |     . |                            |
| p002           | Set 2 |      2 |       1 |    0.01 |    0.01 |     0.00 |             |          |    944 |   32 |   7MB |                            |
| p003           | Set 2 |      2 |       2 |    0.01 |    0.01 |     0.00 |             |          |    937 |   30 |   7MB |                            |
==========================================================================================================================================================

Despite “allocating” 6,730 servers Oracle is only admitting to having used 7 of them – so let’s take a closer look at how they’re used. There are two groups, and we have one set of 3 slaves in group 1, and two sets of two slaves in group 2. (It looks to me as if the Group# and Type columns should be the other way around given the hierarchy of group / type / server#.) We can understand a little more of what these numbers mean if we look at the execution plan – particularly the special columns relating to Data Flow Operations (DFOs) and “DFO trees”.


SQL Plan Monitoring Details (Plan Hash Value=3398913290)
========================================================================================================================================================================
| Id |          Operation           |   Name   |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Read | Read  |  Mem  | Activity |      Activity Detail       |
|    |                              |          | (Estim) |      | Active(s) | Active |       | (Actual) | Reqs | Bytes | (Max) |   (%)    |        (# samples)         |
========================================================================================================================================================================
|  0 | SELECT STATEMENT             |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     2.70 | Cpu (1)                    |
|  1 |   FILTER                     |          |         |      |        32 |     +0 |     1 |     8846 |      |       |     . |     5.41 | PX Deq: Signal ACK EXT (2) |
|  2 |    PX COORDINATOR            |          |         |      |        32 |     +0 |     5 |     8846 |      |       |     . |          |                            |
|  3 |     PX SEND QC (RANDOM)      | :TQ20002 |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  4 |      HASH JOIN BUFFERED      |          |    9146 |  128 |        29 |     +2 |     2 |     8846 |      |       |   9MB |          |                            |
|  5 |       PX RECEIVE             |          |    8846 |   11 |        14 |     +2 |     2 |     8846 |      |       |     . |          |                            |
|  6 |        PX SEND HYBRID HASH   | :TQ20000 |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  7 |         STATISTICS COLLECTOR |          |         |      |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  8 |          PX BLOCK ITERATOR   |          |    8846 |   11 |         1 |     +0 |     2 |     8846 |      |       |     . |          |                            |
|  9 |           TABLE ACCESS FULL  | T2       |    8846 |   11 |         1 |     +0 |    23 |     8846 |   24 |   1MB |     . |          |                            |
| 10 |       PX RECEIVE             |          |   50000 |  116 |        14 |     +2 |     2 |     2509 |      |       |     . |          |                            |
| 11 |        PX SEND HYBRID HASH   | :TQ20001 |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 12 |         PX BLOCK ITERATOR    |          |   50000 |  116 |         1 |     +0 |     2 |     2509 |      |       |     . |          |                            |
| 13 |          TABLE ACCESS FULL   | T1       |   50000 |  116 |         1 |     +0 |    26 |     2509 |   38 |  12MB |     . |          |                            |
| 14 |    PX COORDINATOR            |          |         |      |        31 |     +1 |  8978 |     2252 |      |       |     . |    13.51 | PX Deq: Join ACK (5)       |
| 15 |     PX SEND QC (RANDOM)      | :TQ10000 |       1 |   77 |        32 |     +0 |  6667 |     3692 |      |       |     . |          |                            |
| 16 |      PX BLOCK ITERATOR       |          |       1 |   77 |        32 |     +0 |  6667 |    92478 |      |       |     . |     2.70 | Cpu (1)                    |
| 17 |       TABLE ACCESS FULL      | T3       |       1 |   77 |        32 |     +0 | 53118 |    92478 |   32 |   8MB |     . |    67.57 | Cpu (25)                   |
========================================================================================================================================================================

The “Name” column shows us that we have two DFO trees (:TQ2nnnn and :TQ1nnnn) – this is why we see two “groups” in the PX server detail, and why those groups can have different degrees of parallelism.

Looking at the general shape of the plan you can see that operation 1 is a FILTER operation with two child operations, one at operation 2, the other at operation 14. So we probably have a filter subquery in place, operated as DFO tree #1, while the main query is operated as DFO tree #2. This means the main query is running at DOP = 2 (it’s a hash join with hash distribution so it needs two sets of slave processes – so all the details agree with what we’ve learned about Group# 2 above); and the subquery is operating at DOP = 3 – and it’s using only one set of slave processes.

There is a little anomaly in the number of Execs of operation 14 – at some point I will examine this more closely, but it might simply be a reporting error that has added the number of Execs of its child operations to its own Execs, it might be something to do with counting in Exec calls by its parent, or it might be a side effect of scalar subquery caching. I’ll worry about it when I have a good reason to do so. What I want to look at is the Execs of operations 15/16, the PX Block Iterator / PX Send QC. There are 6,667 reports of a PX slave executing, and that matches up quite nicely with the 6,730 reported “Servers Allocated” – so it would appear that Oracle says it’s allocating a server whenever it uses a server. But does it really “allocate” (and, presumably, de-allocate)?

Here’s how you find out – you run the query again, taking various snapshots and looking for cross-references. I’ve got some results from v$pq_tqstat and v$pq_slave for the run that produced the SQL Monitor report above, and some of the QC session stats and enqueue stats for a subsequent run. This is what we see:


select  process, count(*) 
from    v$pq_tqstat 
group by 
        process 
order by 
        process
;


PROCESS                    COUNT(*)
------------------------ ----------
P000                              3
P001                              3
P002                              2
P003                              2
P004                           2225
P005                           2214
P006                           2218
QC                             2243


SQL> select slave_name, sessions from V$pq_slave order by slave_name;

SLAV   SESSIONS
---- ----------
P000          1
P001          1
P002          1
P003          1
P004       2242
P005       2242
P006       2242

Key Session Stats
=================
Name                                                                         Value                                                                          
----                                                                         -----                                                                          
opened cursors cumulative                                                    6,955                                                                          
user calls                                                                  20,631                                                                          
recursive calls                                                             20,895                                                                          
enqueue requests                                                            22,699                                                                          
enqueue conversions                                                         13,610                                                                          
enqueue releases                                                            15,894                                                                          
parse count (total)                                                          6,857                                                                          
execute count                                                                6,966                                                                          
DFO trees parallelized                                                           2
Parallel operations not downgraded                                           2,268

Key Enqueue Stats
=================
Type Short name                   Requests       Waits     Success      Failed    Wait m/s                                                                  
---- ----------                   --------       -----     -------      ------    --------                                                                  
DA   Slave Process Array             2,272          13       2,272           0          43                                                                  
PS   contention                     27,160       1,586      27,080           7         415                                                                  
SE   contention                      6,784           0       6,785           0           0                                                                  

TYPE                 DESCRIPTION
-------------------- ------------------------------------------------------------------------
PS                   Parallel Execution Server Process reservation and synchronization
DA                   Slave Process Spawn reservation and synchronization
SE                   Lock used by transparent session migration
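
The “Key Session Stats” and “Key Enqueue Stats” above are just the interesting deltas from a couple of dynamic performance views, captured before and after the run. A minimal sketch of the underlying queries (session statistics from the QC’s own session, enqueue statistics system-wide, with the before/after differencing left to you) would be something like:


select
        sn.name, ms.value
from
        v$mystat ms, v$statname sn
where
        sn.statistic# = ms.statistic#
and     ms.value != 0
order by
        sn.name
;

select
        eq_type, req_reason, total_req#, total_wait#,
        succ_req#, failed_req#, cum_wait_time
from
        v$enqueue_statistics
where
        eq_type in ('PS','DA','SE')
order by
        eq_type
;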

Oracle really did start and stop something like 6,700 PX sessions (constantly re-using the same small set of PX slave processes) in the course of executing the filter subquery over and over again. This is definitely a performance threat – we keep acquiring and releasing PX slaves, we keep creating new sessions (yes, really), and we keep searching for cursors in the library cache. All these activities are highly contentious. If you start running multiple queries that do this sort of thing you find increasing amounts of time being spent on latch contention, PX slave allocation, mutex waits, and all the other problems you get with sessions that log on, do virtually nothing, then log off in rapid succession.

So how do you write SQL that does this type of thing? Here’s my data model (you may want to limit the number of rows in the tables):


create table t1 as
select * from all_source;

create table t2 as
select * from all_source where mod(line,20) = 1;

create table t3 as
select * from all_source;

And here’s all you have to do to start creating problems – I’ve added explicit hints to force parallelism (particularly for the subquery); in practice it’s more likely that parallelism would have been introduced accidentally by table or index definitions, or by an “alter session” to “force parallel”:


set feedback only

select
        /*+ 
                parallel(t1 2) 
                parallel(t2 2)
                leading(t1 t2)
                use_hash(t2)
                swap_join_inputs(t2)
                pq_distribute(t2 hash hash)
                cardinality(t1,50000)
        */
        t1.owner,
        t1.name,
        t1.type
from
        t1
join
        t2
on      t2.owner = t1.owner
and     t2.name = t1.name
and     t2.type = t1.type
where
        t1.line = 1
and     (
           mod(length(t1.text), 10) = 0
        or exists (
                select --+ parallel(t3 3) 
                        null
                from    t3
                where   t3.owner = t1.owner
                and     t3.name = t1.name
                and     t3.type = t1.type
                and     t3.line >= t1.line
                )
        )
;

set feedback on

I’ve written notes in the past about SQL that forces the optimizer to run subqueries as filter subqueries instead of unnesting them – this is just an example of that type of query, pushed into parallelism. It’s not the only way (see comment #1 from Dmitry Remizov below) to end up with scalar subqueries being executed many times as separate DFO trees, even though Oracle has enhanced the optimizer several times over the years in ways that bypass the threat – but the problem can still appear, and it’s important to notice in development that you’ve got a query that Oracle can’t work around.

 

One Year at Microsoft

Hard to believe it’s been one year, but it was June 2018 when I joined the unstoppable company known as Microsoft.

All-in Analytics

I joined with the expectation that I would leave much of what I had specialized in behind me: Oracle, along with other non-Microsoft database platforms, Linux, optimization and DevOps. I was excited to start my journey in business intelligence. Power BI was already starting to take over the world. Having only arrived on the BI scene in 2015, I’d noticed the pattern: it was taking up a larger percentage of speaker sessions and of the content on the web. Users were incredibly interested in doing more with their data: not just having somewhere to work with it relationally, but also, as the data grew, dealing with the challenge of big data.

It was an exciting change, no doubt.

A year in, a lot has changed, but not as much as I thought. I’m still excited to learn more about Power BI, Analysis Services, Databricks and other analytics features, and I’m still coming up to speed. It’s not easy to take it all in, as so much changes so fast; the Power BI desktop software alone gets a new release every month. Features buzz past my head and I learn fast, but my role has also surpassed my expectations. As a data platform architect, the role was to perform the same tasks as my peers, but soon after I joined we realized I had arrived with different skills, and that this was not a deficit but a benefit.

All-in…Everything

Where we once all acted as individual armies of one and worked in relative isolation from our team members, we’ve become more interactive and divided along specialties. I’m now focused on many of the customers with cross-platform database migrations and the automation quests we once wouldn’t have taken on, while my teammates, Hope Foley and Dustin Ryan, focus on the deeper analytics and AI demands. We’re not as territory driven, although we do have territories assigned to us. The sales and project roles around us are starting to recognize who to reach out to when a specialty is required (or we refer them on to the right person on our team…)

So here I am, writing a ton of BASH automation scripts, architecting massive Oracle VM environments on Azure, helping partners automate their deployments with customers, teaching Linux and working with other database platforms on Azure. I still spend about 30% of my engagements on analytics, but I sure didn’t envision that I would be spending as much time as I am using my skills from my previous roles entwined with Azure. It’s all the warm and fuzzy wrapped up in the new, which is damn good fun.

This couldn’t happen without good leadership, and we have that at Microsoft. Our manager, Denny Ramsey, is very good about letting me run, well, amok – and as many folks know, my amok is a good thing. I try to find ways to be more productive and valuable to the group. I’m also pretty open about where my deficits are. I would like to be further along in my analytics skills, but I also know that something has to give: what I had to learn in Azure alone took up most of this year. I’m definitely not bored…

Why does my REST Services menu not show up in SQL Developer?

Oracle SQL Developer has excellent support for Oracle REST Data Services (ORDS); a lot of the functionality is just a mouse click away. With so many people talking about RESTful APIs I wanted to see what they are like. However, when I first tried to use SQL Developer to administer ORDS in the database I was surprised not to find the menu item to do so. This post might be stating the (insert colourful adjective) obvious, but it took me a little time to work it out, and I’m hoping it saves you 5 minutes.

What’s the problem?

When right-clicking my connection node in the Connections tree I should be shown a menu named “REST Services” – which I wasn’t, as shown in the figure below.

[Figure: SQL Developer connection context menu with the “REST Services” item missing]

And no, of course I did not read the documentation beforehand.

And how do you solve it?

After a little research it turns out that said menu appears only after ORDS has been installed. As part of the installation, ORDS connects to the database and deploys code, which triggers the appearance of the “REST Services” menu in SQL Developer the next time you connect. Here’s what I think is the relevant output from the installation:

$ java -jar ords.war install simple

...

Installing Oracle REST Data Services version 19.1.0.r0921545
... Log file written to /home/oracle/ords_install_core_2019-06-13_092412_00708.log
... Verified database prerequisites
... Created Oracle REST Data Services proxy user
... Created Oracle REST Data Services schema
... Granted privileges to Oracle REST Data Services
... Created Oracle REST Data Services database objects
... Log file written to /home/oracle/ords_install_datamodel_2019-06-13_092420_00287.log
... Log file written to /home/oracle/ords_install_apex_2019-06-13_092421_00372.log
Completed installation for Oracle REST Data Services version 19.1.0.r0921545. Elapsed time: 00:00:09.825

...

Once the deployment is complete, the “REST Services” menu appears (you might have to re-connect), as you can see in the following figure:

[Figure: SQL Developer connection context menu with the “REST Services” item present]
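
As an aside, if you are not sure whether ORDS has already been installed in a particular database, a quick sanity check is to look for the accounts the installer creates. This is just a sketch, assuming the default schema names ORDS_METADATA and ORDS_PUBLIC_USER:


select username, account_status, created
from   dba_users
where  username in ('ORDS_METADATA', 'ORDS_PUBLIC_USER');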

Hope this helps!

SMON_SCN_TIME and ORA-08181? Digging deeper

In recent versions of the Oracle database, we’ve had the ability to convert between a System Change Number (SCN) and the approximate time to which that SCN pertains. These functions are unsurprisingly called SCN_TO_TIMESTAMP and TIMESTAMP_TO_SCN. The only potential misnomer here is that even though the functions are called “timestamp” and return a datatype of timestamp, on most platforms you are going to notice that the granularity doesn’t run down into fractions of a second.


SQL> select scn_to_timestamp(14816563713652) from dual;

SCN_TO_TIMESTAMP(14816563713652)
---------------------------------------------------------
08-JUN-19 02.30.59.000000000 AM

This all looks great until you start poking around too far into the past, and you end up in territory like this:


SQL> select scn_to_timestamp(14816563693489) from dual;
select scn_to_timestamp(14816563693489) from dual
       *
ERROR at line 1:
ORA-08181: specified number is not a valid system change number
ORA-06512: at "SYS.SCN_TO_TIMESTAMP", line 1

If I poke around in V$LOG_HISTORY, I can pretty easily confirm that SCN 14816563693489 was indeed a valid SCN for this database at some stage in the past, so the fact that we can encounter ORA-08181 suggests that there is a finite structure that holds the mapping between SCNs and the times to which those SCNs pertain. And indeed there is: the table is called SYS.SMON_SCN_TIME.
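
For completeness, the V$LOG_HISTORY check I have in mind is just a sketch along these lines, using the “invalid” SCN from the error above:


select thread#, sequence#, first_change#, next_change#, first_time
from   v$log_history
where  14816563693489 between first_change# and next_change#;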


SQL> desc sys.smon_scn_time
 Name                                      Null?    Type
 ----------------------------------------- -------- --------------------
 THREAD                                             NUMBER
 TIME_MP                                            NUMBER
 TIME_DP                                            DATE
 SCN_WRP                                            NUMBER
 SCN_BAS                                            NUMBER
 NUM_MAPPINGS                                       NUMBER
 TIM_SCN_MAP                                        RAW(1200)
 SCN                                                NUMBER
 ORIG_THREAD                                        NUMBER

and in most instances, it will hold around a week’s worth of SCN to timestamp mapping information.


SQL> select min(time_dp), max(time_dp) from sys.smon_scn_time;

MIN(TIME_ MAX(TIME_
--------- ---------
07-JUN-19 13-JUN-19

SQL> select count(*) from  sys.smon_scn_time;

  COUNT(*)
----------
      1603

When SMON_SCN_TIME first came into existence, the granularity of data in this table was an entry approximately every 5 minutes. Hence you could map an SCN to, at best, a 5 minute window. In more recent releases, this has been improved to a granularity of 3 seconds. But the table does not appear to reflect that evolution if you look at some sample data.


SQL> select scn, time_dp
  2  from sys.smon_scn_time s
  3  where rownum <= 10;

               SCN TIME_DP
------------------ -------------------
    14816565366476 11/06/2019 08:25:11
    14816565366679 11/06/2019 08:30:38
    14816565366808 11/06/2019 08:35:11
    14816565367164 11/06/2019 08:40:44
    14816565367291 11/06/2019 08:45:12
    14816565367475 11/06/2019 08:50:32
    14816565029366 10/06/2019 01:28:13
    14816565029605 10/06/2019 01:33:40
    14816565032515 10/06/2019 01:38:13
    14816565032779 10/06/2019 01:43:40

It still has a row every 5 minutes to handle backward compatibility, and to drill down to the 3 second level, you are expected to use the functions SCN_TO_TIMESTAMP and TIMESTAMP_TO_SCN already mentioned.

But what if you wanted to store this SCN to timestamp mapping for longer? Activating flashback data archive is one way of achieving this, but what if you just wanted to store the mapping forever without doing this? At first, it seems to be a simple task. I can simply get the lowest SCN from SMON_SCN_TIME:


SQL> select min(scn) scn from   sys.SMON_SCN_TIME;

               SCN
------------------
    14816563714099
    

and then use a familiar CONNECT BY technique to roll through the SCNs from that point onward


SQL> select 14816563714099+rownum scn,  
  2         scn_to_timestamp(14816563714099+rownum) ts
  3  from ( select 1 from dual connect by level <= 50 );

               SCN TS
------------------ -----------------------------------
    14816563714100 08-JUN-19 02.41.53.000000000 AM
    14816563714101 08-JUN-19 02.41.56.000000000 AM
    14816563714102 08-JUN-19 02.41.59.000000000 AM
    14816563714103 08-JUN-19 02.41.59.000000000 AM
    14816563714104 08-JUN-19 02.41.59.000000000 AM
    14816563714105 08-JUN-19 02.41.59.000000000 AM
    14816563714106 08-JUN-19 02.41.59.000000000 AM
    14816563714107 08-JUN-19 02.41.59.000000000 AM
    14816563714108 08-JUN-19 02.41.59.000000000 AM
    14816563714109 08-JUN-19 02.41.59.000000000 AM
    14816563714110 08-JUN-19 02.41.59.000000000 AM
    14816563714111 08-JUN-19 02.42.05.000000000 AM
    14816563714112 08-JUN-19 02.42.08.000000000 AM
    14816563714113 08-JUN-19 02.42.11.000000000 AM
    14816563714114 08-JUN-19 02.42.14.000000000 AM
    14816563714115 08-JUN-19 02.42.17.000000000 AM
    14816563714116 08-JUN-19 02.42.20.000000000 AM
    14816563714117 08-JUN-19 02.42.23.000000000 AM
    14816563714118 08-JUN-19 02.42.26.000000000 AM
    14816563714119 08-JUN-19 02.42.29.000000000 AM
    14816563714120 08-JUN-19 02.42.32.000000000 AM
    14816563714121 08-JUN-19 02.42.35.000000000 AM
    14816563714122 08-JUN-19 02.42.38.000000000 AM
    14816563714123 08-JUN-19 02.42.41.000000000 AM
    14816563714124 08-JUN-19 02.42.44.000000000 AM
    14816563714125 08-JUN-19 02.42.47.000000000 AM
    14816563714126 08-JUN-19 02.42.50.000000000 AM
    14816563714127 08-JUN-19 02.42.53.000000000 AM
    14816563714128 08-JUN-19 02.42.56.000000000 AM
    14816563714129 08-JUN-19 02.42.59.000000000 AM
    14816563714130 08-JUN-19 02.42.59.000000000 AM
    14816563714131 08-JUN-19 02.42.59.000000000 AM
    14816563714132 08-JUN-19 02.42.59.000000000 AM
    14816563714133 08-JUN-19 02.42.59.000000000 AM
    ...
    ...
  

But if we are going to store this data, we can start to see some things that can be improved.

Firstly, a lot of rows have the same timestamp. As mentioned, there is only a granularity of 3 seconds, and we can certainly burn through a lot of SCNs in a 3 second period. I can use some simple analytics to see if a timestamp matches the preceding one and filter the row out if that is the case.


SQL> with
  2  t as
  3    ( select min(scn) lo from   sys.smon_scn_time ),
  4  all_scns as
  5    ( select lo+rownum scn, scn_to_timestamp(lo+rownum) ts
  6      from  t
  7      connect by level <= 100 ),
  8  delta_scns as
  9    ( select a.*, row_number() over ( partition by ts order by scn ) as rn
 10      from all_scns a )
 11  select * from delta_scns
 12  where rn = 1;

               SCN TS                                               RN
------------------ ---------------------------------------- ----------
    14816563714100 08-JUN-19 02.41.53.000000000 AM                   1
    14816563714101 08-JUN-19 02.41.56.000000000 AM                   1
    14816563714102 08-JUN-19 02.41.59.000000000 AM                   1
    14816563714111 08-JUN-19 02.42.05.000000000 AM                   1
    14816563714112 08-JUN-19 02.42.08.000000000 AM                   1
    14816563714113 08-JUN-19 02.42.11.000000000 AM                   1
    14816563714114 08-JUN-19 02.42.14.000000000 AM                   1
    14816563714115 08-JUN-19 02.42.17.000000000 AM                   1
    14816563714116 08-JUN-19 02.42.20.000000000 AM                   1
    14816563714117 08-JUN-19 02.42.23.000000000 AM                   1
    14816563714118 08-JUN-19 02.42.26.000000000 AM                   1
    14816563714119 08-JUN-19 02.42.29.000000000 AM                   1
    14816563714120 08-JUN-19 02.42.32.000000000 AM                   1
    14816563714121 08-JUN-19 02.42.35.000000000 AM                   1
    14816563714122 08-JUN-19 02.42.38.000000000 AM                   1
    14816563714123 08-JUN-19 02.42.41.000000000 AM                   1
    14816563714124 08-JUN-19 02.42.44.000000000 AM                   1
    14816563714125 08-JUN-19 02.42.47.000000000 AM                   1
    14816563714126 08-JUN-19 02.42.50.000000000 AM                   1
    14816563714127 08-JUN-19 02.42.53.000000000 AM                   1
    14816563714128 08-JUN-19 02.42.56.000000000 AM                   1
    14816563714129 08-JUN-19 02.42.59.000000000 AM                   1
    ...
    ...
 

Secondly, we need to stop at a particular point. I can easily get the highest SCN from the table


SQL> select max(scn) scn from   sys.smon_scn_time;

               SCN
------------------
    14816565838367
 

but that is when things start to get problematic. Even for just this small set of data (just a few days), on my almost entirely idle database (it’s my laptop):


SQL> select max(scn) - 14816563714100 from   sys.smon_scn_time;

MAX(SCN)-14816563714100
-----------------------
                2124267
    

you can see that I’ve burned through over 2 million SCNs. When I put that back into my query


SQL> with
  2  t as
  3    ( select min(scn) lo, max(scn) hi from   sys.smon_scn_time ),
  4  all_scns as
  5    ( select lo+rownum scn, scn_to_timestamp(lo+rownum) ts
  6      from  t
  7      connect by level <= hi-lo  ),
  8  delta_scns as
  9    ( select a.*, row_number() over ( partition by ts order by scn ) as rn
 10      from all_scns a )
 11  select * from delta_scns
 12  where rn = 1
 

it took a looonnnggg time to come back. This is understandable – it is millions of calls to extract a timestamp, followed by some heavy duty analytics to remove duplicates. It has become a very expensive operation to perform.

So that got me thinking – how does the database do it? I started with – how does the database get down to 3 second granularity when there is still only a row every 5 minutes in SMON_SCN_TIME? This is where the TIM_SCN_MAP column comes into play.


SQL> select scn, num_mappings, tim_scn_map
  2  from sys.smon_scn_time
  3  where scn = 14816565797824

               SCN NUM_MAPPINGS TIM_SCN_MAP
------------------ ------------ -----------------------------------------------------------------------------
    14816565797824          100 2F90015DC12324C0790D00003290015DC22324C0790D00003590015DC32324C0790D0000...

Scanning along the raw data and looking for patterns, I could see that “790D0000” was repeated every 24 characters (or 12 bytes). A simple check confirmed that the number of repetitions appears to equate to the NUM_MAPPINGS column, so it is reasonable to assume that the raw data is a set of time mappings. Given that the maximum value for NUM_MAPPINGS is 100, this also fits the hypothesis, because 100 mappings at 3 seconds granularity yields 5 minutes, which is how often we get a row in SMON_SCN_TIME.


SQL> select scn, num_mappings, length(tim_scn_map)/24
  2  from   sys.smon_scn_time
  3  /

               SCN NUM_MAPPINGS LENGTH(TIM_SCN_MAP)/24
------------------ ------------ ----------------------
    14816565366476          100                    100
    14816565366679           83                     83
    14816565366808          100                    100
    14816565367164           82                     82
    14816565367291          100                    100
    14816565367475           86                     86
    14816565029366          100                    100
 ...
 

So 24 characters (12 bytes) is probably an occurrence of a SCN to timestamp mapping. The task now is to dig that information out of the raw encoding. I’ll grab those first 12 bytes from the row above (2F90015DC12324C0790D0000) and see if we can align them to other values in the table. The first thing I noticed is that “790D0000” rarely changes across the whole table, so I converted that to decimal to see if I could match it to anything. We have to swap the bytes first, and then I got:


SQL> select to_number('0D79','xxxx') from dual;

TO_NUMBER('0D79','XXXX')
------------------------
                    3449

That matched the SCN_WRP column in SMON_SCN_TIME. Grabbing the next 4 bytes gives


SQL> select to_number('C02423C1','xxxxxxxx') from dual;

TO_NUMBER('C02423C1','XXXXXXXX')
--------------------------------
                      3223593921

That matched the SCN_BAS column in SMON_SCN_TIME. So SCN_WRP and SCN_BAS are enough to form a complete SCN number. Which would mean that the remaining 4 bytes should be the time information.


SQL> select to_number('5D01902F','xxxxxxxx') from dual;

TO_NUMBER('5D01902F','XXXXXXXX')
--------------------------------
                      1560383535
   

Obviously that does not look like a date or a time, but this is an internal format that is seen in various places around the Oracle database, for example in the redo logs. To explain where it comes from, it is better to approach it from the other direction (starting with the time).

Let’s say I wanted a unique number for a clock time of just hours and minutes (e.g. 09:37). I could do something like:

unique val = hours * 100 + minutes

That would work because I know that minutes are capped at 59, so there is no chance of an overlap when using a multiplier of 100. So here is how that concept can be taken to its full extent to get a unique number for a full date and time; I’ll pick a date/time of 2019-6-12 23:52:15 as an example, with a quick SQL cross-check of the arithmetic after the list.

  • Take the year, 2019.
  • We don’t need anything before Oracle was “born”, so subtract 1976 = 43
  • For the months, multiply by 13 (anything more than 12 months) = 559
  • Because we’re IT nerds, we’ll start months at zero (0=January), so we add 5 for June = 564
  • For the day in a month, multiply by 32 (anything more than 31) = 18048
  • Because we’re IT nerds, we’ll start day of month at zero (=1st), so we add 11 for the 12th = 18059
  • For the hours, multiply by 24 (because hours never exceed 23) = 433416
  • No need for nerdy stuff here, because hours are already normalised at 0-23, so we add 23 = 433439
  • For the minutes, multiply by 60 (minutes are 0 to 59) = 26006340
  • Then add the minutes = 26006392
  • Same for the seconds, multiply by 60 = 1560383520
  • and finally we add the seconds = 1560383535
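
Before trusting the arithmetic, here is that cross-check; the nested expression simply mirrors the multiply-then-add steps above (this query is my own sketch, not something Oracle supplies):


select
        ( ( ( ( ( extract(year  from d) - 1976 ) * 13
              +   extract(month from d) - 1    ) * 32
              +   extract(day   from d) - 1    ) * 24
              +   to_number(to_char(d,'hh24')) ) * 60
              +   to_number(to_char(d,'mi'))   ) * 60
              +   to_number(to_char(d,'ss'))     internal_time   -- expect 1560383535
from    ( select to_date('2019-06-12 23:52:15','yyyy-mm-dd hh24:mi:ss') d from dual );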

And voila! You can see that it matches the decimal expansion of the 4 bytes of 5D01902F above. Now that we know how the time to number conversion is done, a little PL/SQL lets the reverse be done (number to time):


SQL> declare
  2    t int := 1560383535;
  3    secs int;
  4    mins int;
  5    hrs int;
  6    dy int;
  7    mon int;
  8    yr int;
  9  begin
 10   secs := mod(t,60);
 11   t := t - secs;
 12   t := floor(t / 60);
 13   mins := mod(t,60);
 14   t := t - mins;
 15   t := floor(t / 60);
 16   hrs := mod(t,24);
 17   t := t - hrs;
 18   t := floor(t / 24);
 19   dy := mod(t,32);
 20   t := t - dy;
 21   t := floor(t / 32);
 22   dy := dy + 1;
 23   mon := mod(t,13);
 24   t := t - mon;
 25   mon := mon + 1;
 26   t := floor(t / 13);
 27   yr := t + 1976;
 28   dbms_output.put_line(yr||'-'||mon||'-'||dy||' '||hrs||':'||mins||':'||secs);
 29  end;
 30  /
2019-6-12 23:52:15

To make all of this simpler and more modular, I’ll build a couple of functions to bring all the bits and pieces we’ve learnt together. First, a function to take a non-byte-swapped hex string and return an SCN:


SQL> create or replace
  2  function raw_to_scn(p_scn varchar2) return number is
  3  begin
  4    return to_number(
  5                  substr(p_scn,7,2)||
  6                  substr(p_scn,5,2)||
  7                  substr(p_scn,3,2)||
  8                  substr(p_scn,1,2),'xxxxxxxx');
  9  end;
 10  /

Function created.

and then a function that will extend the anonymous block above to take a non-byte swapped hex string and return a timestamp (well, a date in reality).


SQL> create or replace
  2  function raw_to_date(p_time varchar2) return date is
  3    t int := to_number(
  4                  substr(p_time,7,2)||
  5                  substr(p_time,5,2)||
  6                  substr(p_time,3,2)||
  7                  substr(p_time,1,2),'xxxxxxxx');
  8    secs int;
  9    mins int;
 10    hrs int;
 11    dy int;
 12    mon int;
 13    yr int;
 14  begin
 15   secs := mod(t,60);
 16   t := t - secs;
 17   t := floor(t / 60);
 18   mins := mod(t,60);
 19   t := t - mins;
 20   t := floor(t / 60);
 21   hrs := mod(t,24);
 22   t := t - hrs;
 23   t := floor(t / 24);
 24   dy := mod(t,32);
 25   t := t - dy;
 26   t := floor(t / 32);
 27   dy := dy + 1;
 28   mon := mod(t,13);
 29   t := t - mon;
 30   mon := mon + 1;
 31   t := floor(t / 13);
 32   yr := t + 1976;
 33   return to_date(yr||'-'||mon||'-'||dy||' '||hrs||':'||mins||':'||secs,'yyyy-mm-dd hh24:mi:ss');
 34  end;
 35  /

Function created.

With these in place, I can create a couple of types to serve as return datatypes for a pipelined function


SQL> create or replace
  2  type scn_dte as object
  3    ( scn int, dte date );
  4  /

Type created.

SQL>
SQL> create or replace
  2  type scn_dte_nt as table of scn_dte
  3  /

Type created.

and here is a function that will take the important components from SMON_SCN_TIME, namely SCN_BAS, SCN_WRP, NUM_MAPPINGS and the raw data in TIM_SCN_MAP, and spit out the 3-second interval data rather than just 1 row per 5 minutes.


SQL> create or replace
  2  function full_smon_scn_timestamp(p_start_scn int default 0) return scn_dte_nt pipelined as
  3    l_map varchar2(2400);
  4  begin
  5    for i in ( select * from sys.smon_scn_time
  6               where scn > p_start_scn
  7               order by scn
  8             )
  9    loop
 10      pipe row ( scn_dte(i.scn_wrp * 4294967296 + i.scn_bas,i.time_dp));
 11      l_map := i.tim_scn_map;
 12      for j in 1 .. i.num_mappings
 13      loop
 14         pipe row (
 15           scn_dte(
 16             i.scn_wrp * 4294967296 + raw_to_scn(substr(l_map,9,8)),
 17             raw_to_date(substr(l_map,1,8))
 18             )
 19          );
 20         l_map := substr(l_map,25);
 21      end loop;
 22    end loop;
 23  end;
 24  /

Function created.

Let’s give that function a run to see what comes out:


SQL> select * from full_smon_scn_timestamp()
  2  where rownum <= 20;

               SCN DTE
------------------ -------------------
    14816563726597 07/06/2019 21:50:43
    14816563726598 07/06/2019 21:50:46
    14816563726599 07/06/2019 21:50:49
    14816563726600 07/06/2019 21:50:52
    14816563726601 07/06/2019 21:50:55
    14816563726602 07/06/2019 21:50:58
    14816563726603 07/06/2019 21:51:01
    14816563726604 07/06/2019 21:51:04
    14816563726605 07/06/2019 21:51:07
    14816563726606 07/06/2019 21:51:10
    14816563726607 07/06/2019 21:51:13
    14816563726608 07/06/2019 21:51:16
    14816563726609 07/06/2019 21:51:19
    14816563726610 07/06/2019 21:51:22
    14816563726611 07/06/2019 21:51:25
    14816563726612 07/06/2019 21:51:28
    14816563726633 07/06/2019 21:51:34
    14816563726634 07/06/2019 21:51:37
    14816563726635 07/06/2019 21:51:40
    14816563726636 07/06/2019 21:51:43

Each row in SMON_SCN_TIME contains a starting value for the SCN plus all of the mappings, so the total number of 3-second intervals is:


SQL> select sum(num_mappings+1) from sys.smon_scn_time;

SUM(NUM_MAPPINGS+1)
-------------------
             145262
  

and I can compare that to the total number of rows that will be returned from my function


SQL> select count(*) from full_smon_scn_timestamp();

  COUNT(*)
----------
    145262
   

So there you have a way to grab all of those 3-second granularity entries in SMON_SCN_TIME without having to call SCN_TO_TIMESTAMP millions of times. You could run this function (say) daily, passing in a starting SCN equal to the highest value you have retrieved so far, to build up your own custom SMON_SCN_TIME variant that you can keep for as long as you want.
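
For example, a minimal sketch of that idea (the table name MY_SCN_TIME and the anonymous block are my own illustration, not anything shipped with the database) might be:


create table my_scn_time ( scn number, dte date );

declare
  l_start_scn number;
begin
  -- only fetch mappings beyond the highest SCN already stored
  select nvl(max(scn),0) into l_start_scn from my_scn_time;

  insert into my_scn_time ( scn, dte )
  select scn, dte
  from   full_smon_scn_timestamp( l_start_scn );

  commit;
end;
/

Because the function filters SMON_SCN_TIME on scn > p_start_scn, each re-run of the block should pick up only the rows added since the previous run.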

Of course, poking around in raw fields is totally unsupported, so don’t come crying to me if you decide to use this for anything more than exploration and one day it all blows up in your face.