Oakies Blog Aggregator

The Oracle database museum: running old versions of the Oracle database

All Oracle database professionals know the current versions of the Oracle database (12.2, 18, 19 and 20 at the moment of writing), and we also know that the pace Oracle Corporation keeps is so high that a lot of companies have a hard time keeping up with the current versions. A prominent example is Oracle Corporation itself: for its E-Business Suite software, Oracle extended support for database versions 12.1.0.2 and 11.2.0.4, for E-Business Suite licenses only. But this blog isn’t about bitching about the pace of Oracle support and versions getting desupported.

What I do regularly encounter is that, for all kinds of reasons, a database version is not updated. Most of the time the versions encountered are 12.1.0.2 (the long term supported version of the 12.1 release of the database) and 11.2.0.4 (the long term supported version of the 11.2 release of the database), and, more and more seldom, 11.2.0.3. If things truly have been left without updating you might encounter 11.2.0.2, and god forbid you still have 11.2.0.1 – that version had a lot of issues.

Now what if you encounter even older versions? Younger Oracle database consultants have probably never even seen anything older than 11.2.0.1. But what if you need to work with a truly old version? Or what if you are simply interested in such an old version, to see what it looks like and what the state of the database was back then?

For that, I created an automatic installation script to install either:
– Release 10.2 versions: 10.2.0.1, 10.2.0.2, 10.2.0.3, 10.2.0.4, 10.2.0.5.
– Release 9.2 versions: 9.2.0.4, 9.2.0.5*, 9.2.0.6*, 9.2.0.7, 9.2.0.8.
(* = the patch install is fully scripted, but linking Oracle throws an error)

Yes, this is extremely old, and if you must work with it, there has been neglect and a lack of attention on a quite massive scale. There are also licensing implications that do not work in your favour there.

There is a huge caveat too: the installation media for these Oracle database versions is not available for download anywhere as far as I know, and some of the patches are restricted downloads on My Oracle Support too. Since it’s Oracle proprietary software, the only way to obtain it is via Oracle.

That aside, if you must or want to use these ancient versions, and you have obtained the required files, you can use:
https://gitlab.com/FritsHoogland/ol48_oracle92 for installing Oracle 9.2 on Oracle Linux 4.8 or
https://gitlab.com/FritsHoogland/ol511_oracle102 for installing Oracle 10.2 on Oracle Linux 5.11

Clone the repository, put the required files in the files directory, edit the Vagrantfile to your liking and then build the database server by typing ‘vagrant up’.
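
In shell terms the workflow looks roughly like this for the 9.2 build (a sketch; the media file names and paths obviously depend on what you obtained from Oracle, and the 10.2 repository works the same way):

git clone https://gitlab.com/FritsHoogland/ol48_oracle92.git
cd ol48_oracle92
# copy the installation media and patches you obtained from Oracle into the files directory
cp /path/to/oracle-9.2-media/* files/
# adjust settings such as memory to your liking
vi Vagrantfile
# build and provision the database server
vagrant up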

In case you’re wondering how the operating system images are built: this is done using ‘packer’, and I have a repository where you can see how these are built too: https://gitlab.com/FritsHoogland/packer-oracle-linux-museum

In case you’re wondering: there are even older versions; the first public version of the Oracle database on Linux, as far as I know, is Oracle 8.0.5. However, the Linux version to use with releases like 8.1.7, RHEL/CentOS 2.1, is so old that it doesn’t play nicely with VirtualBox and packer, so in all reasonability, Oracle 9.2 on Oracle Linux 4.8 is the earliest combination that can be used without severe issues.

Fresh install: Autonomous Health Framework for single instance Oracle 19c

I have been deploying Trace File Analyzer (TFA) to every system I’m looking after, whenever possible. Going back in the history of this weblog, I created quite a few posts about how you can do so, too. When I wrote those posts back in 2018, TFA was available on its own.

In the meantime TFA has been merged into another, even more powerful tool: Autonomous Health Framework (AHF). Its documentation appears a little scattered over Oracle’s website, and since this post is about Oracle 19c I decided to use the documentation as embedded in the 19c database documentation bookshelf. If I missed a more current version, please let me know via my twitter handle @MartinDBA.

This post covers a fresh installation of AHF 20.1.2 – the current version at the time of writing – on Oracle Linux 7.8/UEK 5, with LANG set to en_US.UTF-8. I have a single instance 19c database, patched to 19.7.0, running on the same host. This is purely an RDBMS deployment; the host does not contain any trace of Grid Infrastructure. I’ll write two more posts covering the installation on Oracle Restart and Real Application Clusters. I’m only covering Oracle 19c as it’s the long-term support release of the 12c Release 2 family.

Warning

AHF, or rather its subcomponents, can easily consume plenty of space. I create my systems with a separate logical volume for /opt, where I’m installing AHF. This way I can’t run out of space on my root file system, a situation to be avoided at all cost. That’s as bad as running out of space on the logical volumes containing the Oracle binaries and ADR… So please make sure you have enough space for AHF and its components, and separate them from other critical system components!
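
For illustration, carving out a dedicated logical volume for /opt might look like this (a sketch only – the volume group name vg01 and the 20G size are made-up examples, not a recommendation):

# create a 20G logical volume for /opt (volume group and size are examples)
lvcreate -L 20G -n lv_opt vg01
mkfs.xfs /dev/vg01/lv_opt
echo '/dev/vg01/lv_opt /opt xfs defaults 0 0' >> /etc/fstab
mount /opt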

Installation

Before you can install AHF, you need to first download it from My Oracle Support (MOS). AHF requires perl-Data-Dumper and perl-Digest-MD5 to be present on your system.

Downloading AHF

This is pretty straightforward: simply navigate to MOS note 2550798.1 and download the version for your platform. At the time of writing, 20.1.2 was the current release for Linux.

Avoiding problems during the installation

Before you start the installation, make absolutely sure that perl’s Data::Dumper and Digest::MD5 modules are installed, or you’ll run into nasty trouble during the installation.

[root@server5 ~]# yum install -y perl-Digest-MD5 perl-Data-Dumper
Loaded plugins: ulninfo
Resolving Dependencies
--> Running transaction check
---> Package perl-Data-Dumper.x86_64 0:2.145-3.el7 will be installed
---> Package perl-Digest-MD5.x86_64 0:2.52-3.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================
 Package                        Arch                 Version                      Repository                Size
=================================================================================================================
Installing:
 perl-Data-Dumper               x86_64               2.145-3.el7                  ol7_latest                47 k
 perl-Digest-MD5                x86_64               2.52-3.el7                   ol7_latest                29 k

Transaction Summary
=================================================================================================================
Install  2 Packages

Total download size: 76 k
Installed size: 151 k
Downloading packages:
(1/2): perl-Data-Dumper-2.145-3.el7.x86_64.rpm                                            |  47 kB  00:00:00     
(2/2): perl-Digest-MD5-2.52-3.el7.x86_64.rpm                                              |  29 kB  00:00:00     
-----------------------------------------------------------------------------------------------------------------
Total                                                                            426 kB/s |  76 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : perl-Digest-MD5-2.52-3.el7.x86_64                                                             1/2 
  Installing : perl-Data-Dumper-2.145-3.el7.x86_64                                                           2/2 
  Verifying  : perl-Data-Dumper-2.145-3.el7.x86_64                                                           1/2 
  Verifying  : perl-Digest-MD5-2.52-3.el7.x86_64                                                             2/2 

Installed:
  perl-Data-Dumper.x86_64 0:2.145-3.el7                    perl-Digest-MD5.x86_64 0:2.52-3.el7                   

Complete!
[root@server5 ~]# 

Install AHF

Once you have finished downloading AHF, copy the zipfile to the machine where you want it to be installed. After unzipping it, you will find the installer (ahf_setup) and a README file. Please make sure you check the README before invoking the installer.

The installation can be automated, using Ansible for example. This is the list of options offered:

[root@server5 ahf]# ./ahf_setup -h

AHF Installer for Platform Linux Architecture x86_64

   Usage for ./ahf_setup

   ./ahf_setup [[-ahf_loc ][-data_dir ][-nodes ][-extract[orachk|exachk|-notfasetup] [-force]][-local][-silent][-tmp_loc ][-debug [-level <1-6>]]]

        -ahf_loc          -    Install into the directory supplied. (Default /opt/oracle.ahf)
        -data_dir         -    AHF Data Directory where all the collections, metadata, etc. will be stored 
        -nodes            -    Comma separated Remote Node List 
        -extract          -    Extract only files from Installer. (Default for non-root users)
        -notfasetup       -    Do not Configure TFA when used with -extract
        -force            -    Force the mentioned compliance type with -extract
        -local            -    Only install on the local node
        -silent           -    Do not ask any install questions 
        -tmp_loc          -    Temporary location directory for AHF to extract the install archive to (must exist) (Default /tmp)
        -perlhome         -    Custom location of perl binaries
        -debug            -    Debug AHF Install Script 
        -level            -    AHF Instal Debug Level 1-6 (Default 4 with option -debug) 
                               [FATAL - 1, ERROR - 2, WARNING - 3, INFO - 4, DEBUG - 5, TRACE - 6] 

   Note : 
   1. Without parameters AHF will take you through an interview process for installation 
   2. If -silent option is used, then ensure that you use the -data_dir option, otherwise, the installer script will fail 
   3. The -force option is applicable only when -extract orachk|exachk option is passed else it is ignored. 

Alternatively you can install AHF interactively, I’ll leave that as an exercise to the reader.

I tend to install AHF in /opt; please see above for a quick warning about space usage! All my systems use LVM for /opt, a configuration setting I’m enforcing via the kickstart file.

Oracle recommends installing AHF as root, and I’ll comply with the suggestion:

[root@server4 ahf]# ./ahf_setup -ahf_loc /opt/oracle.ahf -data_dir /opt -silent

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_24137_2020_04_24-10_04_50.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.2 Build Date: 202004031134

AHF Location : /opt/oracle.ahf

AHF Data Directory : /opt/oracle.ahf/data

Extracting AHF to /opt/oracle.ahf

Configuring TFA Services

Discovering Nodes and Oracle Resources
Successfully generated certificates. 

Starting TFA Services
Created symlink from /etc/systemd/system/multi-user.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.
Created symlink from /etc/systemd/system/graphical.target.wants/oracle-tfa.service to /etc/systemd/system/oracle-tfa.service.

.-----------------------------------------------------------------------------.
| Host    | Status of TFA | PID   | Port  | Version    | Build ID             |
+---------+---------------+-------+-------+------------+----------------------+
| server4 | RUNNING       | 25035 | 11631 | 20.1.2.0.0 | 20120020200403113404 |
'---------+---------------+-------+-------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.-----------------------------------------------------.
|             Summary of AHF Configuration            |
+-----------------+-----------------------------------+
| Parameter       | Value                             |
+-----------------+-----------------------------------+
| AHF Location    | /opt/oracle.ahf                   |
| TFA Location    | /opt/oracle.ahf/tfa               |
| Orachk Location | /opt/oracle.ahf/orachk            |
| Data Directory  | /opt/oracle.ahf/data              |
| Repository      | /opt/oracle.ahf/data/repository   |
| Diag Directory  | /opt/oracle.ahf/data/server4/diag |
'-----------------+-----------------------------------'


Starting orachk daemon from AHF ...

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_24137_2020_04_24-10_04_50.log to /opt/oracle.ahf/data/server4/diag/ahf/ 

AHF can send notification emails to an address you provide when running the interactive installer. I haven’t seen an option to provide an email address as part of the silent installation, but it’s not a problem as you can always update the configuration later.
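
For example, something along the following lines should set the notification address after the fact (hedged: check the tfactl documentation for the exact set options in your AHF version; the address is of course a placeholder):

[root@server4 ~]# tfactl set notificationAddress=dba@example.com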

If you are running AHF 20.1.2, OSWatcher won’t start automatically and you need to fix your configuration manually, as described in a separate post.

Once this has completed, you should check if TFA is running.

[oracle@server4 ~]$ tfactl print status

.-----------------------------------------------------------------------------------------------.
| Host    | Status of TFA | PID  | Port  | Version    | Build ID             | Inventory Status |
+---------+---------------+------+-------+------------+----------------------+------------------+
| server4 | RUNNING       | 1986 | 21511 | 20.1.2.0.0 | 20120020200403113404 | COMPLETE         |
'---------+---------------+------+-------+------------+----------------------+------------------'
[oracle@server4 ~]$ 

It looks as if TFA is indeed ready for business. And I have all the support tools at my disposal:

[oracle@server4 ~]$ tfactl toolstatus

.------------------------------------------------------------------.
|                   TOOLS STATUS - HOST : server4                  |
+----------------------+--------------+--------------+-------------+
| Tool Type            | Tool         | Version      | Status      |
+----------------------+--------------+--------------+-------------+
| Development Tools    | orachk       |   19.3.0.0.0 | DEPLOYED    |
|                      | oratop       |       14.1.2 | DEPLOYED    |
+----------------------+--------------+--------------+-------------+
| Support Tools Bundle | darda        | 2.10.0.R6036 | DEPLOYED    |
|                      | oswbb        |        8.3.2 | RUNNING     |
|                      | prw          | 12.1.13.11.4 | NOT RUNNING |
+----------------------+--------------+--------------+-------------+
| TFA Utilities        | alertsummary |   19.3.0.0.0 | DEPLOYED    |
|                      | calog        |   19.3.0.0.0 | DEPLOYED    |
|                      | dbcheck      |   18.3.0.0.0 | DEPLOYED    |
|                      | dbglevel     |   19.3.0.0.0 | DEPLOYED    |
|                      | grep         |   19.3.0.0.0 | DEPLOYED    |
|                      | history      |   19.3.0.0.0 | DEPLOYED    |
|                      | ls           |   19.3.0.0.0 | DEPLOYED    |
|                      | managelogs   |   19.3.0.0.0 | DEPLOYED    |
|                      | menu         |   19.3.0.0.0 | DEPLOYED    |
|                      | param        |   19.3.0.0.0 | DEPLOYED    |
|                      | ps           |   19.3.0.0.0 | DEPLOYED    |
|                      | pstack       |   19.3.0.0.0 | DEPLOYED    |
|                      | summary      |   19.3.0.0.0 | DEPLOYED    |
|                      | tail         |   19.3.0.0.0 | DEPLOYED    |
|                      | triage       |   19.3.0.0.0 | DEPLOYED    |
|                      | vi           |   19.3.0.0.0 | DEPLOYED    |
'----------------------+--------------+--------------+-------------'

Note :-
  DEPLOYED    : Installed and Available - To be configured or run interactively.
  NOT RUNNING : Configured and Available - Currently turned off interactively.
  RUNNING     : Configured and Available. 

I guess that’s it for single instance deployments.

Didn’t you say TFA was present in 12.2 and later?

I did, and it was, at least with 12.2.0.1 single instance. So I was a little bit surprised to see a change in behaviour during the database installation. After invoking runInstaller you need to run two scripts as root: one sets the permissions on the inventory, the second one – root.sh – performs database-specific tasks. TFA can potentially be installed when invoking root.sh:

[oracle@server5 ~]$ cat /u01/app/oracle/product/19.0.0/dbhome_1/root.sh
#!/bin/sh
unset WAS_ROOTMACRO_CALL_MADE
. /u01/app/oracle/product/19.0.0/dbhome_1/install/utl/rootmacro.sh "$@"
. /u01/app/oracle/product/19.0.0/dbhome_1/install/utl/rootinstall.sh
/u01/app/oracle/product/19.0.0/dbhome_1/suptools/tfa/release/tfa_home/install/roottfa.sh
/u01/app/oracle/product/19.0.0/dbhome_1/install/root_schagent.sh

#
# Root Actions related to network
#
/u01/app/oracle/product/19.0.0/dbhome_1/network/install/sqlnet/setowner.sh 

#
# Invoke standalone rootadd_rdbms.sh
#
/u01/app/oracle/product/19.0.0/dbhome_1/rdbms/install/rootadd_rdbms.sh

/u01/app/oracle/product/19.0.0/dbhome_1/rdbms/install/rootadd_filemap.sh 

I didn’t build this environment, Ansible did. All Ansible calls to Oracle software invoke the tools in silent mode. Invoking root.sh in silent mode merely emits a message at the end, asking you to check the logfile for more information. So I did, and this is what it read:

Performing root user operation.

The following environment variables are set as:
    ORACLE_OWNER= oracle
    ORACLE_HOME=  /u01/app/oracle/product/19.0.0/dbhome_1
   Copying dbhome to /usr/local/bin ...
   Copying oraenv to /usr/local/bin ...
   Copying coraenv to /usr/local/bin ...


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Oracle Trace File Analyzer (TFA) is available at : /u01/app/oracle/product/19.0.0/dbhome_1/bin/tfactl  

It looks like TFA is available after the database software is deployed. I wanted to see if it could interfere with AHF (at this point AHF wasn’t deployed):

[oracle@server5 ~]$ /u01/app/oracle/product/19.0.0/dbhome_1/bin/tfactl status
TFA-00519 Oracle Trace File Analyzer (TFA) is not installed. 

So TFA doesn’t actually seem to be present in an Oracle 19c single instance installation; it will only become usable once AHF has been deployed.

Execution Plans

In a recent blog note I made the point that there is a very simple rule (“first child first”) for reading execution plans if the query (as written or after transformation by the optimizer) consists of a single “query block”. However, if you have a plan that is reporting multiple query blocks you have to be careful that you identify the boundaries of the individual query blocks and manage to link them together correctly.

In this note I’m going to work through an example that appeared on the Oracle Developer Forum a couple of years ago where someone produced a query that may have fooled them into misreading the execution plan. It’s a very good example of a production plan that is sufficiently complex to be interesting and make a couple of useful points but still produces a plan that is short enough to pick apart in just a couple of pages of text.

The initial question was about an oddity (actually a bug) in the optimizer’s cardinality estimate for one of the plan operations, but my answer to the question produced a follow-up question which said:

“Okay so it’s an error, was wondering why for one of the subqueries it’s using a hash join (7,8) and the other a nested loop.”

Here’s the query that produced this question. Before jumping straight to the plan it’s a good idea to take a close look at the query, take note of any optimizer strategies you might see, any features of the query that might make a big difference to the optimizer strategies, and any details that might mean the plan doesn’t do what you might expect.

Since the Oracle version can have a big impact on what the optimizer can do, I’ll point out that this was running on 12.1.0.2.


select  /*+ gather_plan_statistics */ 
        * 
from 
        dwh_prod.ip_spells a
where 
        not exists (
                select  1
                from
                        dwh_prod.ip_episodes e
                inner join 
                        dwh_prod.ip_diagnosis d 
                on
                        d.episode_id = e.episode_id
                where
                        a.spell_id = e.spell_id
                and     (
                           substr(d.diagnosis_code,1,1) = 'C'
                        or substr(d.diagnosis_code,1,3) between 'D37' and 'D48'
                        or substr(d.diagnosis_code,1,1)  = 'V'
                        or d.diagnosis_code = 'Z511'
                )
        )
and     exists (
                select  1
                from
                        dwh_prod.ip_episodes e
                left join 
                        dwh_prod.ip_diagnosis d 
                on
                        d.episode_id = e.episode_id
                where
                        a.spell_id = e.spell_id
                and     (
                            e.episode_order = '1'
                        and substr(d.diagnosis_code,1,1) <> 'O'
                        and d.diagnosis_sequence = 1
                        and e.main_specialty_code not in ('501','560','610')
                        )
                )
and     substr(a.admission_method_code,1,1) = '2'
and     a.admission_dttm >= to_date('01-jan-2011', 'dd-mon-yyyy')
and     administrative_category_code = '01'

Points to note:

  • It’s a fairly simple query – one table, with two correlated subqueries.
  • The correlating predicate for the first (not exists) subquery is at line 15
  • The correlating predicate for the second (exists) subquery is at line 32
  • Both subqueries are joins between two tables, and the two tables are the same in both cases.
  • The first subquery is an “inner join” (line 10), the second subquery is a “left join” (line 27)
  • There’s an error in the code at line 36!
    • the predicate “d.diagnosis_sequence = 1” will eliminate any ip_episode (e) rows that have been preserved by the outer join
    • so the optimizer will automatically treat the outer join as an inner join
  • We might see either subquery running as a filter subquery
  • We might see either subquery unnested into an inline view – with, or without, “pushed predicate”
  • We might then see complex view merging turn the “exists” subquery into a semi-join and/or the “not exists” subquery into an anti-join

That last comment is something that makes it easy to jump to conclusions while reading the plan and head off in the wrong direction. So let’s take a careful look at the execution plan – which is an actual run-time plan reporting the rowsource execution statistics:


SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format=>'ALLSTATS LAST'));

Plan hash value: 2721384176
----------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name               | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |
|*  1 |  FILTER                               |                    |      1 |        |     50 |00:00:06.13 |     283K|  94413 |       |       |          |
|*  2 |   HASH JOIN ANTI                      |                    |      1 |    156 |     51 |00:00:05.70 |     282K|  94247 |   143M|  7865K|  143M (0)|
|   3 |    PARTITION RANGE ALL                |                    |      1 |  15592 |    425K|00:00:00.65 |   79553 |      0 |       |       |          |
|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          |     24 |  15592 |    425K|00:00:00.58 |   79553 |      0 |       |       |          |
|   5 |    VIEW                               | VW_SQ_1            |      1 |    530K|    464K|00:00:04.74 |     203K|  94247 |       |       |          |
|*  6 |     HASH JOIN                         |                    |      1 |    530K|    464K|00:00:04.65 |     203K|  94247 |    26M|  3954K|   34M (0)|
|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       |      1 |    528K|    464K|00:00:03.12 |     109K|      0 |       |       |          |
|   8 |      TABLE ACCESS FULL                | IP_EPISODES        |      1 |   2491K|   2495K|00:00:00.44 |   94253 |  94247 |       |       |          |
|   9 |   NESTED LOOPS SEMI                   |                    |     51 |      1 |     50 |00:00:00.36 |     514 |    166 |       |       |          |
|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        |     51 |      1 |     51 |00:00:00.15 |     229 |     92 |       |       |          |
|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  |     51 |      1 |     76 |00:00:00.13 |     153 |     51 |       |       |          |
|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       |     51 |   2688K|     50 |00:00:00.21 |     285 |     74 |       |       |          |
|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS |     51 |      5 |    174 |00:00:00.21 |     153 |     74 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( IS NOT NULL)
   2 - access(A.SPELL_ID=ITEM_1)
   4 - filter((SUBSTR(A.ADMISSION_METHOD_CODE,1,1)='2' AND A.ADMISSION_DTTM>=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND ADMINISTRATIVE_CATEGORY_CODE='01'))
   6 - access(D.EPISODE_ID=E.EPISODE_ID)
   7 - filter((SUBSTR(D.DIAGNOSIS_CODE,1,1)='C' OR SUBSTR(D.DIAGNOSIS_CODE,1,1)='V' OR (SUBSTR(D.DIAGNOSIS_CODE,1,3)>='D37' AND
              SUBSTR(D.DIAGNOSIS_CODE,1,3)<='D48') OR D.DIAGNOSIS_CODE='Z511'))
  10 - filter((E.EPISODE_ORDER=1 AND E.MAIN_SPECIALTY_CODE<>'501' AND E.MAIN_SPECIALTY_CODE<>'560' AND
              E.MAIN_SPECIALTY_CODE<>'610'))
  11 - access(E.SPELL_ID=:B1)
  12 - filter((D.DIAGNOSIS_SEQUENCE=1 AND SUBSTR(D.DIAGNOSIS_CODE,1,1)<>'O'))
  13 - access(D.EPISODE_ID=E.EPISODE_ID)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - this is an adaptive plan
   - 6 Sql Plan Directives used for this statement

Take note, by the way, that we’ve been told the plan was “adaptive” and that 6 SQL Plan Directives were used – which would (probably) have been telling the optimizer to use dynamic sampling of the data in various ways to get to the best possible plan.

Given that the query starts as three separate query blocks and uses two of the tables twice it would have been nice to see the Query Block / Object Alias information for this query as that would have made it just a little easier to connect the plan to the query.
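
For reference, that section is produced by adding the (documented) ALIAS keyword to the dbms_xplan format string, in the same style as the report above:

SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format=>'ALLSTATS LAST ALIAS'));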

At first sight we can spot that we’ve got a hash join anti at operation 2 and a nested loop semi at operation 9, and given my earlier comments about how the optimizer can treat subqueries it would be easy to jump to the conclusion that the anti-join represents the “not exists” subquery and the semi-join the “exists” subquery, and work onwards from there. However operation 1 is a filter, and when we check the Predicate Information for operation 1 it holds the slightly cryptic content “filter( IS NOT NULL)”. This is an example of how filter subqueries “go missing” from the predicate information when you pull the execution plan from memory. Operation 1 is a multi-child filter with operation 2 and operation 9 as its child operations; for each row it receives from operation 2 it calls operation 9. If we could see the full predicate information for operation 1 we would see that it was the existence subquery. The nested loop semi isn’t the result of Oracle executing the existence subquery as a semi-join; it’s a semi-join that has appeared for some other reason – one that we will pursue in a little while.

Let’s examine operation 2 (and its descendants) in detail. It’s a hash join, so its first child will be used as the build table and its second child will be used as the probe table. The first child is a full tablescan (operation 4) reading every partition (operation 3) of the range-partitioned IP_SPELLS table, which is the one table in the main body of the query. The second child is a view called VW_SQ_1 – an example of an internally named view that can appear when Oracle unnests, but doesn’t merge, a subquery – and it represents the body of the “not exists” subquery. Oracle has used a hash join with IP_DIAGNOSIS as the build table and IP_EPISODES as the probe table.

To make things a little easier to comprehend I’ve created tables and indexes that let me emulate this query and plan, using the alias format option to report the query block names and fully qualified aliases that were finally used. Then I’ve edited the output to put the query block names and aliases beside the operations they refer to (rather than leaving them in a block under the body of the plan), and I’ve inserted line-breaks to help highlight the separate query blocks. This is the result:

----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name               | Query Block  / Alias                |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                    |                                     |
|*  1 |  FILTER                               |                    | SEL$2B969259                        |
|*  2 |   HASH JOIN ANTI                      |                    |                                     |
|   3 |    PARTITION RANGE ALL                |                    |                                     |
|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          | SEL$2B969259 / A@SEL$4              |

|   5 |    VIEW                               | VW_SQ_1            | SEL$8D33959D / VW_SQ_1@SEL$F49409E0 |
|*  6 |     HASH JOIN                         |                    | SEL$8D33959D                        |
|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       | SEL$8D33959D / D@SEL$1              |
|   8 |      TABLE ACCESS FULL                | IP_EPISODES        | SEL$8D33959D / E@SEL$1              |

|   9 |   NESTED LOOPS SEMI                   |                    | SEL$2B0A33EA                        |
|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        | SEL$2B0A33EA / E@SEL$3              |
|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  | SEL$2B0A33EA / E@SEL$3              |
|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       | SEL$2B0A33EA / D@SEL$3              |
|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS | SEL$2B0A33EA / D@SEL$3              |
----------------------------------------------------------------------------------------------------------

As I’ve said before, if an execution plan is too complex to read in one go you can pick virtually any line of the plan and examine that line and its descendants in isolation as a way of getting started. Now I want to point out that if you can see the final query blocks this easily, then any operation that starts a query block is a useful starting point for reading part of the plan, because each query block has been optimized separately, and once you’ve understood how that block operates you can (mentally) replace it with a one-line “here’s a rowsource”. So we might separate this plan into pieces as follows:


----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name               | Query Block  / Alias                |
----------------------------------------------------------------------------------------------------------
|   5 |    VIEW                               | VW_SQ_1            | SEL$8D33959D / VW_SQ_1@SEL$F49409E0 |
|*  6 |     HASH JOIN                         |                    | SEL$8D33959D                        |
|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       | SEL$8D33959D / D@SEL$1              |
|   8 |      TABLE ACCESS FULL                | IP_EPISODES        | SEL$8D33959D / E@SEL$1              |
----------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name               | Query Block  / Alias                |
----------------------------------------------------------------------------------------------------------
|   9 |   NESTED LOOPS SEMI                   |                    | SEL$2B0A33EA                        |
|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        | SEL$2B0A33EA / E@SEL$3              |
|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  | SEL$2B0A33EA / E@SEL$3              |
|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       | SEL$2B0A33EA / D@SEL$3              |
|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS | SEL$2B0A33EA / D@SEL$3              |
----------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name               | Query Block  / Alias                |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                    |                                     |
|*  1 |  FILTER                               |                    | SEL$2B969259                        |
|*  2 |   HASH JOIN ANTI                      |                    |                                     |
|   3 |    PARTITION RANGE ALL                |                    |                                     |
|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          | SEL$2B969259 / A@SEL$4              |
|   5 |    Rowsource "not exists"             | VW_SQ_1            | SEL$8D33959D / VW_SQ_1@SEL$F49409E0 |
|   9 |   Rowsource "exists"                  |                    | SEL$2B0A33EA                        |
----------------------------------------------------------------------------------------------------------

  • The first sub-plan is the unnested “not exists” subquery – which is a very simple plan to analyze.
  • The second sub-plan is the “exists” subquery – which is a very simple plan to analyze.
  • The third sub-plan (with the substitutes for the other two query blocks) says: do a hash (anti) join between IP_SPELLS and the “not exists” rowsource, and for each row that is produced create and check the “exists” rowsource to see if it produces a match, and forward the row if it does.

In this particular case the shape of the two subsidiary query blocks, and the pattern that we use to stitch the pieces together, makes it look as if “first child first” applies across the whole plan. This, I think, is a key reason why people have had difficulty interpreting complex plans in the past. If every plan you’ve previously examined looks as if “first child first” always works, you’re going to be baffled by a plan where the combination of query blocks seems to break that rule.

Putting the first pieces together, this is what happens as the FILTER operation calls its first child:

  • Operation 1 (FILTER) calls operation 2 (hash join anti) which calls operation 3 (partition range all) which calls operation 4 (table access full of IP_SPELLS)
  • Operation 4 is the first operation to generate a rowsource, which it passes up to operation 3.
  • Operation 3 immediately passes the rowsource on up to operation 2 (making operation 3 the second operation to generate a rowsource)
  • Operation 2 creates its build table then calls operation 5 (VIEW – its second rowsource) to supply the probe table.
  • Operation 5 calls operation 6 (hash join) which calls its first child, operation 7 (table access full of IP_DIAGNOSIS)
  • Operation 7 is the third operation to generate a rowsource, which it passes up to operation 6 to use as its build table.
  • Operation 6 creates its build table from the rowsource then calls operation 8 (table access full of IP_EPISODES)
  • Operation 8 is the fourth operation to generate a rowsource, which it passes up to operation 6 as its probe table
  • Operation 6 probes its build table as rows arrive from operation 8, and passes join results up to operation 5, making operation 6 the fifth operation to generate a rowsource
  • Operation 5 immediately passes the rows up to its parent (operation 2), making it the sixth operation to generate a rowsource.
  • Operation 2 is an “anti” hash join, so as each row arrives from operation 5 it probes its build table and, if it finds a match, marks the matched row in the build table as one to be discarded. When operation 2 has exhausted the input from operation 5 it can pass any unmarked rows up to its parent (operation 1), making it the seventh operation to generate a rowsource.

So we’re now at operation 1 with an incoming rowsource which is all the rows in IP_SPELLS where a certain type of match in the IP_DIAGNOSIS and IP_EPISODE tables does not exist. This is how the order looks so far if we strip the plan to a minimum and edit in an “order” column:

----------------------------------------------------------------------------
| Id  | Operation                             | Name               | Order |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                    |       |
|*  1 |  FILTER                               |                    |       |
|*  2 |   HASH JOIN ANTI                      |                    |     7 |
|   3 |    PARTITION RANGE ALL                |                    |     2 |
|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          |     1 |
|   5 |    VIEW                               | VW_SQ_1            |     6 |
|*  6 |     HASH JOIN                         |                    |     5 |
|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       |     3 |
|   8 |      TABLE ACCESS FULL                | IP_EPISODES        |     4 |
----------------------------------------------------------------------------

So now we come to the part of the plan that matches the “exists” subquery. For each row that operation 1 (FILTER) receives from its first child, it calls operation 9 to run the “exists” subquery, which is a correlated subquery with a join between IP_EPISODES and IP_DIAGNOSIS. So why does Oracle use a nested loop join in the subquery rather than a hash join, and why, in particular, does that nested loop turn into a semi-join? What’s the difference between the “not exists” and the “exists”?

Imagine that for every row in IP_EPISODES there are 10 matching rows in IP_DIAGNOSIS.  When we check for “not exists” correlated to IP_EPISODES we will have to join to every single matching IP_DIAGNOSIS row because it might be the 10th which fails the test because it matches some extra filter predicate on IP_DIAGNOSIS. So the join is going to be a “high volume” join (hence a hash join is likely to become more appropriate).  On the other hand when we test for “exists” we may get lucky and only get as far as the first IP_DIAGNOSIS for an IP_EPISODES to prove existence – so it makes more sense to adopt a nested loop so that we can stop on the first match – and that, in fact, is why the nested loop in this case is a nested loop semi, it’s a code path specially designed to stop early.  (In fact, it’s also capable of using the same caching mechanism as scalar subquery caching so it can be even more efficient than just “stop on first match” – it can even stop before trying because “I’ve done that one before”).

So for each row that operation 1 (FILTER) receives from its first child it calls its second child (operation 9) to see if it will return a  row.

  • Operation 9 (nested loop semi) calls operation 10 (table access by rowid) which calls operation 11 (index range scan)
  • Operation 11 will be the eighth operation to return a rowsource, containing rowids, to operation 10
  • Operation 10 will be the ninth operation to return a rowsource, containing row data, to operation 9
  • For each row it receives operation 9 will call its second child (operation 12 (table access by rowid)) which calls operation 13 (index range scan)
  • Operation 13 will be the tenth operation to return a rowsource, containing rowids, to operation 12
  • Operation 12 will be the eleventh operation to return a rowsource, containing row data, to operation 9
  • Operation 9 will stop on the first row it receives and pass it up to operation 1, making operation 9 the twelfth operation to return a rowsource
  • Operation 1, if it receives a row from operation 9, will pass the current row to the client interface, making it the thirteenth operation to return a rowsource.

So the final ordering is:

----------------------------------------------------------------------------
| Id  | Operation                             | Name               | Order |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                    |       |
|*  1 |  FILTER                               |                    |    13 |
|*  2 |   HASH JOIN ANTI                      |                    |     7 |
|   3 |    PARTITION RANGE ALL                |                    |     2 |
|*  4 |     TABLE ACCESS FULL                 | IP_SPELLS          |     1 |
|   5 |    VIEW                               | VW_SQ_1            |     6 |
|*  6 |     HASH JOIN                         |                    |     5 |
|*  7 |      TABLE ACCESS FULL                | IP_DIAGNOSIS       |     3 |
|   8 |      TABLE ACCESS FULL                | IP_EPISODES        |     4 |
|   9 |   NESTED LOOPS SEMI                   |                    |    12 |
|* 10 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_EPISODES        |     9 |
|* 11 |     INDEX RANGE SCAN                  | IDX02_IP_EPISODES  |     8 |
|* 12 |    TABLE ACCESS BY INDEX ROWID BATCHED| IP_DIAGNOSIS       |    11 |
|* 13 |     INDEX RANGE SCAN                  | IDX01_IP_DIAGNOSIS |    10 |
----------------------------------------------------------------------------

Numbers

As a closing point it’s worth picking up a couple of numbers.

One number that raised the original question is the E-Rows of 2688K at operation 12 – it’s obviously wrong given that it’s supposed to be the number of rows you get from the table after getting 5 rowids from the index range scan at operation 13. Odd little glitches like this seem to appear from time to time as new internal features are tweaked and odd bits of code aren’t made totally consistent. It’s a pity that it’s there, but if an odd cardinality glitch hasn’t been echoed into the cost (which could result in a change of execution plan) it’s not really important. Since I created a model of this query to get at the query block names I’ve also run the model against 19.3, and the error is still present in that version.

You will note that the E-rows at operation 2 is 156 – when you compare it with the 15,592 for operation 3 you can see that it’s a classic 1% guess (in this case for the effects of a “not exists” subquery).

It’s also significant that the E-Rows for operation 3 is only 15,592 when the A-Rows is 425K: the error is fairly large, and if there’s a performance problem with this query this estimate might be the first point of investigation. Maybe we need a histogram on administrative_category_code, or need to do something to help Oracle with the (1% guess from) substr(admission_method_code,1,1) – but we may have some trouble there because you can’t mix and match virtual columns and column groups in 12.1.0.2.
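
For illustration only – my sketch, not part of the original note, and the virtual column name is made up – the two options might look like this:

-- a histogram on the skewed column
begin
        dbms_stats.gather_table_stats(
                ownname    => 'DWH_PROD',
                tabname    => 'IP_SPELLS',
                method_opt => 'for columns administrative_category_code size 254'
        );
end;
/

-- a virtual column on the substr() expression, so statistics can be gathered for it
alter table dwh_prod.ip_spells add (
        admission_method_first_char
        generated always as (substr(admission_method_code,1,1)) virtual
);

After adding the virtual column you would gather statistics again so the new column has statistics before expecting the estimates to improve.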


Video : View Expansion : DBMS_UTILITY.EXPAND_SQL_TEXT

Today’s video demonstrates how to expand SQL references to views using the DBMS_UTILITY.EXPAND_SQL_TEXT procedure. This functionality was introduced in Oracle 12.1.

The video was based on this article.
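
If you prefer text over video, the basic call is simple; here is a minimal sketch (the view name is made up):

set serveroutput on

declare
        l_expanded clob;
begin
        dbms_utility.expand_sql_text(
                input_sql_text  => 'select * from my_view',
                output_sql_text => l_expanded
        );
        dbms_output.put_line(l_expanded);
end;
/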

The stars of today’s video are Sanjay Kumar, Baljeet Bhasin and Jagjeet Singh, who I met and chatted to at a previous OpenWorld.

APEX 20.1 : Vagrant and Docker Builds (and Some Comments)

About 2 days ago we saw the announcement of the release of APEX 20.1.


I normally set myself quite an aggressive timetable to get new APEX releases to production. So much of what we do lags behind the curve, to the point where I just want to see it burn, so when I get the opportunity to force the issue, like I do with APEX, I do.

In my typical fashion I move all my builds to the latest release and kind-of forget any prior release ever existed. As a result, you will see all my Vagrant and Docker builds have been updated to use APEX 20.1, along with updates of OpenJDK and Tomcat.

The basic installation is the same as always (here), so there is no drama there.

Next week I’ll start upgrading a few of our development environments, and check everything looks OK. Assuming I hit no problems, it will be full steam ahead. Fingers crossed.

The new features are always a big draw for most people, but I tend to start by looking at the deprecated features and desupported features.

  • When I saw APEX_UTIL.STRING_TO_TABLE and APEX_UTIL.TABLE_TO_STRING had been deprecated I was about to lose my shit, then I saw they have just been relocated to the APEX_STRING package. Composure regained.
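
To illustrate the relocation, a quick sketch – check the APEX_STRING documentation for the exact overloads in your version:

set serveroutput on

declare
        l_values apex_t_varchar2;
begin
        -- apex_string.split replaces apex_util.string_to_table
        l_values := apex_string.split('one:two:three', ':');
        -- apex_string.join replaces apex_util.table_to_string
        dbms_output.put_line(apex_string.join(l_values, ':'));
end;
/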


What’s new with Oracle database 19.7 versus 19.6

This blogpost takes a look at the technical differences between Oracle database 19 PSU 6 (January 2020) and PSU 7 (April 2020). This gives technical specialists an idea of the differences, and the ability to assess whether the PSU impacts anything.
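
The lists below come from comparing the two installations directly, not from the documentation. For readers who want to reproduce the function-name comparison, it boils down to extracting the symbol names from each version’s oracle executable and diffing them – a rough sketch (paths and file names are examples):

# extract defined symbol names from each version's oracle executable
nm /u01/app/oracle/product/19.6/db/bin/oracle | awk 'NF==3 {print $3}' | sort -u > symbols_196.txt
nm /u01/app/oracle/product/19.7/db/bin/oracle | awk 'NF==3 {print $3}' | sort -u > symbols_197.txt
# names unique to 19.6 (removed in 19.7) and unique to 19.7 (added)
comm -23 symbols_196.txt symbols_197.txt > removed.txt
comm -13 symbols_196.txt symbols_197.txt > added.txt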

Functions


code symbol names unique in version 19.6 versus 19.7 (top 5)

keb                                                keb                                                kernel event MMON/MMNL infrastructure                                                     4
qes                                                qes                                                query execute services                                                                    5
qos                                                qos                                                query optimizer statistics                                                                6
kpd                                                (kp)d                                              kernel programmatic interface ??                                                         10
kub                                                (ku)b                                              kernel utility ??                                                                        11

code symbol names unique in version 19.7 versus 19.6 (top 5)

ksf                                                ksf                                                kernel service  functions                                                                 9
kpd                                                (kp)d                                              kernel programmatic interface ??                                                         11
kqr                                                kqr                                                kernel query dictionary/row cache                                                        12
kzr                                                kzr                                                kernel security virtual private database                                                 20
kub                                                (ku)b                                              kernel utility ??                                                                        34

kub: most of the kub functions that have been removed start with kubsprq: kernel utility big data sql parquet. Most of the kub functions that have been added start with kubscrf, so probably also big data sql related, but I currently have no idea what ‘crf’ means. Suggestions are welcome. Also, 4 kubsprq functions were added.
kpd: the kpd functions that are gone are actually kpdb, which likely means kernel programmatic interface pluggable database. In fact, there are two ‘layers’ which are unknown to me: kpdboc and kpdbocws. The kpd functions that have been added seem to have letters added to indicate extra layers beyond kpdb, but only two of them are kpdboc; the rest are quite random.
qos: the qos functions removed are actually qosd, query optimizer statistics sql plan directive. No functions starting with ‘qosd’ have been added in return.
qes: most of the qes functions actually start with qesdpi, which according to my current insights means query execute services adaptive features internationalisation (timezones). Of course this is speculation.
keb: the keb (kernel event MMON/MMNL infrastructure) functions seem to have something to do with ‘auto tasks’. There are some keb functions added too, but not in the top 5 of added function groupings.
kzr: the kzr functions are actually kzra, kernel security virtual private database privileges. I don’t know what work was done here, but it seems something changed.
kqr: these are quite normal dictionary cache functions, with some letters added. This looks like work done to enhance dictionary cache processing.
ksf: most of the ksf functions are actually ksfs, kernel service OFS filesystem, and a few are ksfd, kernel service functions disk IO.

Parameters

parameters unique in version 19.6 versus 19.7

NAME
--------------------------------------------------
_eighth_spare_parameter
_fifteenth_spare_parameter
_hang_bool_spare1
_ninth_spare_parameter
_one-hundred-and-fifty-ninth_spare_parameter
_one-hundred-and-sixtieth_spare_parameter
_seventh_spare_parameter
_twelfth_spare_parameter

parameters unique in version 19.7 versus 19.6

NAME
--------------------------------------------------
_asm_max_kfsg_free_heap_perc
_asm_max_kfsg_free_heap_size
_bug30186319_lm_hb_lgio_timeout
_bug30352623_asm_enable_vfreloc_on_rm
_hang_mark_some_idle_waits_terminal
_kswsas_pdb_auto_close_timeout
_link_ts_force_online
_pdb_datapatch_violation_restricted

A number of spare parameters have been used to create undocumented parameters. We see two ASM related parameters, just like with versions 18 and 12.2, and also two “bug parameters”.
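
The raw material for this comparison can be queried from x$ksppi as SYS in each version; the undocumented parameters are the ones starting with an underscore:

select ksppinm as parameter_name
from   x$ksppi
where  ksppinm like '\_%' escape '\'
order  by ksppinm;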

Waitevents

waitevents unique in version 19.6 versus 19.7

NAME
----------------------------------------------------------------------------------------------------
first spare wait event

waitevents unique in version 19.7 versus 19.6

NAME
----------------------------------------------------------------------------------------------------
CRS get service attributes

It seems that with version 19.7 a spare wait event was turned into an actual wait event. It’s unknown to me what this wait event times.
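
The wait event names come from v$event_name, so verifying the new event in 19.7 is a one-liner:

select name, wait_class
from   v$event_name
where  name = 'CRS get service attributes';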

DBA/CDB tables columns

dba tables columns unique to 19.6 versus 19.7

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
DBA_APP_VERSIONS										     APP_ROOT_CLONE_NAME
DBA_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER

cdb tables column unique to 19.7 versus 19.6

NAME												     COLUMN_NAME
---------------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------
CDB_WORKLOAD_CAPTURES										     ENCRYPTION_VERIFIER

Columns have been removed from two different DBA views in version 19.7, and one of them came back as a CDB view column. The weird thing here is not a column moving from a DBA view to its CDB counterpart – that is something I saw between 18.9 and 18.10 – but that only one of the two columns that went missing from the DBA views came back in a CDB view.
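
For completeness, a way to check where such a column lives in a given version (my query, not from the original comparison):

select table_name, column_name
from   dba_tab_columns
where  column_name = 'ENCRYPTION_VERIFIER'
and    table_name like '%WORKLOAD\_CAPTURES' escape '\'
order  by table_name;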

Friday Philosophy: The Intersecting Worlds Around Oracle

Some of you may have noticed something about the Oracle community: certain aspects of human nature and certain outside interests are unusually common within it. An abiding love of the works of Douglas Adams (if you have never read “The Hitch Hiker’s Guide To The Galaxy” you should question whether you are right for this community – and if you have read it/seen the series/watched the film and disliked it, I’m afraid you have to leave now); Lego was probably an important part of your childhood (and quite possibly your adulthood, though some “project” this fixation on to their kids). And a lot of the most talented people, especially presenters, are called “Martin” or similar :-}.

There are two other groups of people that are large within the Oracle community and that I fit into.

  1. Oracle people who have a thing about cats. A positive thing, not those weird people who don’t like cats. It seems to me a lot of people in the Oracle community are happy to serve our feline overlords. This can polarise the community though, so introduce the topic of cats carefully. If the other person mentions how evil or unfriendly cats are, put them on The List Of The Damned and move on to something else.
  2. Making bread, especially of the sourdough variety. This is a growing passion I’ve noticed (quite literally, given the careful tending of starter mixtures and also the expanding waistlines). It seems to be especially common with technical Oracle people. More often than not, when I get together with a flange of Oracle Professionals (or is it a whoop or a herd?), the topic of baking bread will come up. Unlike technical topics, such as what is the fastest way to get a count of all the rows in a table, baking topics are rarely contentious and rarely lead to fights. If you want to put spelt wheat in your mix, that’s just fine.

Mrs Widlake and I were talking about this last night (one of the problems with all this social isolation business is that Mrs Widlake is being forced to spend a lot of time with me – after 27 years of marriage idle conversation was already a challenge for us and now with over a month together all the time, we are getting desperate for topics). She asked how many of my Oracle friends liked both cats AND baking bread?

It struck me that it seemed to be very, very few. Unusually few. I think this is something that needs to be investigated. This pattern would suggest that bread makers are cat haters. But in my non-Oracle world this is not the case. The best people are, of course, ailurophiles, and many of my feline-fixated friends are also bakers of bread. Just not in the Oracle world.

What makes Oracle people so weird?

Does anyone have any ideas? And have you noticed any other common areas of interest (excluding computers of course, that’s just obvious)?

A few that spring to mind are:

  • Terry Pratchett and the Discworld
  • Running
  • Weird science
  • XKCD
  • The Far Side
  • Star bloody Wars.

Let me know. Or don’t.

And for all of you who don’t like cats…

AWS Certified Database Specialty (DBS-C01)

Here is my feedback after preparing for and passing the AWS Database Specialty certification. There are tips about the exam, but also some thoughts that came to mind during the preparation, when I had to shift my mindset from a multi-purpose database system to purpose-built database services.

Exam Availability

This exam was in beta between last December and January and then was planned for production starting April 6, 2020. I initially planned to take the exam on that first day, but because of the COVID situation, and then family reasons, I had to re-schedule twice. PearsonVUE is really good for that: free cancellation and re-scheduling, and the ability to take the exam at home. This last point was a bit stressful for me, so here is my little feedback about taking the exam from home:

  • Wi-Fi: an ethernet cable is always more reliable. When working remotely it sometimes happens that I have to re-connect or re-start the router (or ask the kids to do it, as they work from home as well and know how important internet access is), but you can’t leave the room or talk during the exam.
  • Room: it must be closed, and nobody can enter for 3 hours. A Post-It on the door is a nice reminder for the kids. I also asked them to be quiet (and not to play Fortnite, because I want full bandwidth). When working I can put on a headset to concentrate, but that’s not allowed for the exam.
  • Clean desk: no paper, no second monitor,… that is not a problem. The problem is that I work in a room that is messy. For online conferences the webcam is framed carefully to hide this, but for the exam you have to take pictures of the room. No worries though, they are not there to judge your home or the stack of laundry to iron that is just behind.

Conversion Errors

I’ve been meaning to write this note for at least three years and was prompted to write up my draft notes this morning as a follow-up to yesterday’s note on the perils of applying a to_date() function to a date column. But then I took a look at the most recent questions on the Oracle Developer Forum and discovered that Tim Hall (@oraclebase) had (inevitably) already done the necessary write-up, so I’ve just left a brief note here (more for my own benefit than anything else) of the highlights with a link to his page.

Key features available in 12.2 to avoid conversion errors are:

  • The validate_conversion() function, which returns 1 or 0 depending on whether or not an expression can be converted successfully to a specific type using a particular format; it returns null if the expression evaluates to null.
  • An extension to the generic conversion functions (e.g. to_date()) that allows a “default” value to be used in place of the supplied value if the attempted conversion raises a conversion error.

Code Sample – report rows with a valid date (in French) that is earlier than the start of this year’s (English) tax year:


rem
rem     Script:         validate_conversion.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Mar 2017
rem     Purpose:
rem
rem     Last tested
rem             12.2.0.1
rem

create table t1 (v1 varchar2(42));
insert into t1 values('15-June-2016');
insert into t1 values('15-Juin-2016');
commit;

prompt  ==================================================================
prompt  Single predicate test - only rows that validate as dates in French
prompt  ==================================================================

select  *
from    t1
where   validate_conversion(v1 as date, 'dd-month-yyyy' , 'nls_date_language=french') = 1
/

prompt  ===========================================================
prompt  Valid French dates that are before 6th April 2017 (English)
prompt  ===========================================================

select
        *
from    t1
where
        validate_conversion(v1 as date, 'dd-month-yyyy' , 'nls_date_language=french') = 1
and     to_date(v1, 'dd-month-yyyy' , 'nls_date_language=french') < to_date('06-Apr-2017','dd-mon-yyyy')
/

prompt  =========================================
prompt  Repeat the above with reversed predicates
prompt  =========================================

select
        *
from    t1
where
        to_date(v1, 'dd-month-yyyy' , 'nls_date_language=french') < to_date('06-Apr-2017','dd-mon-yyyy')
and     validate_conversion(v1 as date, 'dd-month-yyyy' , 'nls_date_language=french') = 1
/

prompt  ====================================================
prompt  Repeat the above but force the order of evaluation
prompt  This will raise error ORA-01843: not a valid month
prompt  ====================================================

select
        /*+ ordered_predicates */
        *
from    t1
where
        to_date(v1, 'dd-month-yyyy' , 'nls_date_language=french') < to_date('06-Apr-2017','dd-mon-yyyy')
and     validate_conversion(v1 as date, 'dd-month-yyyy' , 'nls_date_language=french') = 1
/

prompt  ========================================================================
prompt  Handle the requirement with the 12.2 extended "to_date()" functionality
prompt  ========================================================================

select
        *
from
        t1
where
        to_date(
                v1 default '06-Avril-2017' on conversion error,
                'dd-month-yyyy',
                'nls_date_language=french'
        ) < to_date(
                '06-Apr-2017',
                'dd-mon-yyyy',
                'nls_date_language=English'
        )
/

Is there any guarantee that the validate_conversion() function will always be evaluated before another predicate that attempts the actual conversion? I doubt it: there’s no indication in the manuals and no general guarantee from Oracle about the order of evaluation of predicates, so I wouldn’t risk it. Maybe the only safe use would be a CASE expression (which short-circuits) with a “when successful then” branch for the evaluation of the actual conversion. You’d need to take a little extra care to remember to handle nulls correctly.
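As an illustration of that idea (a minimal sketch, not part of the original script), the conversion can be wrapped in a CASE expression so that to_date() should only be evaluated for rows that pass validate_conversion(); rows where the CASE expression returns null simply fail the comparison:

-- CASE returns the converted date only for rows that validate, otherwise null;
-- null < date is not true, so non-validating rows are filtered out
select  *
from    t1
where   case
                when validate_conversion(v1 as date, 'dd-month-yyyy', 'nls_date_language=french') = 1
                then to_date(v1, 'dd-month-yyyy', 'nls_date_language=french')
        end < to_date('06-Apr-2017', 'dd-mon-yyyy', 'nls_date_language=English')
/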

Maybe replace the table with a non-mergeable inline view that eliminates the failures? Possibly not: the optimizer might still do a simple filter pushdown.
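For completeness, here is a minimal sketch of that idea too (again, not from the original note): a no_merge hint keeps the validating query block as an inline view, with the conversion applied only in the outer query. As noted above, a simple filter pushdown could still defeat the intent, so you would have to check the actual execution plan.

-- sketch only: validation inside a non-merged inline view, conversion outside
select  *
from    (
        select  /*+ no_merge */
                v1
        from    t1
        where   validate_conversion(v1 as date, 'dd-month-yyyy', 'nls_date_language=french') = 1
        )
where   to_date(v1, 'dd-month-yyyy', 'nls_date_language=french') < to_date('06-Apr-2017', 'dd-mon-yyyy')
/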


SQL Plus … the sweet spot

Need to get a bunch of rows out of the database? Most people are aware of the ARRAYSIZE parameter to improve fetch performance back to the client, but many people are not aware of the newer ROWPREFETCH parameter.

Let’s take a look at each to see how quickly we can drag data back to our SQL Plus client. I’ll start with a table called TX which is approximately 1500MB in size and has just over 10 million rows.

First let’s look at the benefits you get from ramping up ARRAYSIZE. I’ll start with the default size of 15, and quickly you’ll see why that is not a good starting point if you are fetching lots of rows. (Important emphasis on the “if” there)


SQL> set feedback only
SQL> set rowprefetch 1
SQL> set timing on
SQL> set arraysize 15
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:45.37
SQL> set arraysize 50
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:15.20
SQL> set arraysize 100
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:08.77
SQL> set arraysize 200
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:05.59
SQL> set arraysize 500
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:03.58
SQL> set arraysize 1000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:03.42

SQL> set arraysize 2000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.88
SQL>
SQL> set arraysize 3000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.86

SQL> set arraysize 5000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.93

It looks like the sweet spot is an array size of around 2000-3000 for this test. Notice that once it went out to 5000 the performance dropped a little. It’s possible (unverified) that we’re losing a little time just reserving/utilising client memory for all those rows.

Now we’ll introduce ROWPREFETCH into the mix.


SQL> set rowprefetch 500
SQL> set arraysize 500
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:03.04
SQL> set arraysize 1000
SQL> set rowprefetch 1000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.81

SQL> set arraysize 2000
SQL> set rowprefetch 2000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.86

SQL> set arraysize 3000
SQL> set rowprefetch 3000
SQL> select object_name from tx;

10129860 rows selected.

Elapsed: 00:00:02.84

We can see that it doesn’t add a dramatic improvement to the performance, but it has indeed lowered the sweet spot to an arraysize of 1000.

In any case, dragging 10 million rows out of the database in just a few seconds is impressive. But don’t forget – it is easy to manipulate a benchmark. In this case I’m only selecting a single column, and that column is near the front of each row, hence less row “slicing and dicing” is needed by the database engine. Query performance is not only about reading data off disk; there is also a CPU cost to extracting the rows from the blocks, and the columns from the rows, as well as constructing that information to be passed back to the client. Here are the same queries, but accessing all of the (20) columns in the table TX:


SQL> set arraysize 100
SQL> set rowprefetch 500
SQL> select * from tx;

10129860 rows selected.

Elapsed: 00:00:32.16
SQL>
SQL>
SQL> set arraysize 1000
SQL> set rowprefetch 5000
SQL> select * from tx;

10129860 rows selected.

Elapsed: 00:00:29.21
SQL>
SQL> set arraysize 2000
SQL> set rowprefetch 5000
SQL> select * from tx;

10129860 rows selected.

Elapsed: 00:00:29.33

And also remember, the columns you select also dictate some of the mechanisms via which these rows are delivered to the client. For more on that, see the video I did on “row shipping” below.