Oakies Blog Aggregator

Oracle 19c Automatic Indexing: My First Auto Index (Absolute Beginners)

I am SOOOO struggling with this nightmare block editor but here goes. Please excuse any formatting issues below: I thought it was time to show the new Oracle 19c Automatic Indexing feature in action and what better way than to go through how I created my first ever Automatic Index. To start, I create a […]

ORA-14300 partitioning key maps to a partition outside maximum permitted number of partitions and NULLs

While researching interval partitioning in Oracle 19c I came across a phenomenon I had already solved a long time ago but then forgot about. This time I decided to write about it so I don’t have to spend 5 minutes rediscovering the wheel. As always, if you are following along, make sure you are appropriately licensed for this feature.

The Situation

Consider the following example:

SQL*Plus: Release 19.0.0.0.0 - Production on Fri Aug 16 09:32:17 2019
Version 19.4.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.

Last Successful login time: Fri Aug 16 2019 09:07:55 +02:00

Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.4.0.0.0

SQL> create table t (id, vc, d) 
  2  partition by range (id) interval (5000)
  3  ( 
  4   partition p1 values less than (5000)
  5  ) as
  6  select object_id, object_name, created from dba_objects;
create table t (id, vc, d)
*
ERROR at line 1:
ORA-14300: partitioning key maps to a partition outside maximum permitted
number of partitions

SQL> !oerr ora 14300
14300, 00000, "partitioning key maps to a partition outside maximum permitted number of partitions"
// *Cause:  The row inserted had a partitioning key that maps to a partition number greater than 1048575
// *Action  Ensure that the partitioning key falls within 1048575 partitions or subpartitions.

I knew about the limit of 1048575 partitions, but that shouldn’t be the case here:

SQL> select max(object_id), max(object_id) / 5000 from dba_objects;

MAX(OBJECT_ID) MAX(OBJECT_ID)/5000
-------------- -------------------
         75743             15.1486

There shouldn’t be more than 16 partitions …

(Aside: not having a not-null constraint on a partition key might be an indication of a sub-optimal design choice.)

A quick search revealed an article on AskTom mentioning NULLs. I hadn’t thought about NULL values in dba_objects.object_id, but apparently there are:

SQL> select object_id, object_name, object_type from dba_objects where object_id is null;

 OBJECT_ID OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ -----------------------
           SYS_HUB                        DATABASE LINK

Today I learned that database links don’t have an object_id, nor do they have a data_object_id.

Summary

Don’t try inserting NULLs as a partitioning key if that partition key uses interval partitioning. The error message, although strictly speaking correct, led me astray for a moment:

SQL> create table t (id, vc, d) 
  2  partition by range (id) interval (5000)
  3  (
  4   partition p1 values less than (5000)
  5  )
  6  as
  7  select object_id, object_name, created from dba_objects 
  8  WHERE OBJECT_ID IS NOT NULL;

Table created.

Problem solved! Another, probably better, approach would be to add a not null constraint in the first place:

SQL> create table t (id not null, vc, d)
  2  partition by range (id) interval (5000)
  3  (
  4   partition p1 values less than (5000)
  5  )
  6  as select object_id, object_name, created from dba_objects;

ERROR at line 6:
ORA-01400: cannot insert NULL into ("MARTIN"."T"."ID")

The intention of my post was to help users who get an ORA-14300 although they don’t exceed the partition limit. It’s not a post about good design ;)

Interlude

This actually works for non-interval partitioned tables, but ONLY if you have a catch-all partition:
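
The table t1 used in the next example isn’t shown here; as a rough sketch (my reconstruction rather than the author’s exact DDL), it could have been created like this, with a MAXVALUE partition to catch the NULL keys:

-- non-interval range partitioning with a catch-all partition: NULL keys map to PMAX
create table t1 (id, vc, d)
partition by range (id)
(
 partition p1   values less than (5000),
 partition pmax values less than (maxvalue)
) as
select object_id, object_name, created from dba_objects;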

SQL> select dbms_rowid.rowid_object(rowid) from t1 where id is null;

DBMS_ROWID.ROWID_OBJECT(ROWID)
------------------------------
                         75564

SQL> select object_name, subobject_name, object_id, data_object_id 
  2  from dba_objects where data_object_id = 75564
  3  /

OBJECT_NAME          SUBOBJECT_NAME        OBJECT_ID DATA_OBJECT_ID
-------------------- -------------------- ---------- --------------
T1                   PMAX                      75564          75564

SQL> select high_value from user_tab_partitions where table_name = 'T1' and partition_name = 'PMAX'
  2  /

HIGH_VALUE
------------------------------
MAXVALUE

Hope this helps!

Oracle Connection Manager (CMAN) quick reporting scripts

Here are a few scripts I use to parse the Connection Manager “show services” output.

List services registered by instance

CMCTL can show the services in a long list, but I want something quick like this, with one line per service, and one column per endpoint that registers to CMAN:

[Screenshot: script output, slightly obfuscated]

The following script:

  • gets all CMANs running on the current host (with pgrep tnslsnr)
  • runs CMCTL with the right environment variables
  • runs “administer” and “show services”
  • parses the output with AWK to put instances into columns
  • resolves IP addresses by calling “host” and removes the domain name
ps --no-headers -o pid,args -p$(pgrep -f "tnslsnr .* -mode proxy") |
while IFS=" " read pid tnslsnr args
do
# get environment variables
awk '
BEGIN{RS="\0"}
/^ORACLE_HOME|^TNS_ADMIN|^LD_LIBRARY_PATH/{printf "%s ",$0}
END{print cmctl,here}
' cmctl="$(dirname $tnslsnr)/cmctl" here="<<-'CMCTL'" /proc/$pid/environ
# name is probably the first arg without '-' nor '='
name=$(echo "$args"|awk 'BEGIN{RS=" "}/^[^-][^=]*$/{print;exit}')
echo "administer $name"
echo "show services"
echo "CMCTL"
done | sh | awk '
function hostname(h){a=h;if (h~/^[0-9.]+$/){"host "h| getline a;close("host "h);sub(/^.* /,"",a);sub(/[.]$/,"",a)}return a}
/Service ".*" has .* instance/{
gsub(qq," ")
sub(/[.].*/,"") # remove domain
service=$2
all_service[service]=1
stats="-"
}
/Instance ".*", status READY, has .* handler.* for this service/{
gsub(qq," ")
instance=$2
all_instance[instance]=0
stats="-"
}
/established:.* refused:.* state:.*/{
sub(/^ */,"")
sub(/.DEDICATED./,"D")
sub(/established:/,"")
sub(/refused:/,"/")
sub(/state:/,"")
sub(/ready/,"R")
stats=$0
}
/ADDRESS.*HOST=.*PORT=/{
port=$0;sub(/.*PORT=/,"",port);sub(/[)].*/,"",port)
host=$0;sub(/.*HOST=/,"",host);sub(/[)].*/,"",host)
if (host ~ /^[xxx0-9.]+$/) {
"host "host| getline host_host
sub(/^.* /,"",host_host)
sub(/[.]$/,"",host_host)
host=host_host
}
host=hostname(host)
sub(/[.].*/,"",host) # remove domain
all_instance_host[instance]=host
all_instance_port[instance]=port
all_instance_instance[instance]=instance
all_instance_stats[instance]=stats
all_service_instance[service,instance]=instance
if (length(host) > all_instance[instance] ) {
all_instance[instance]= length(host)
}
if (length(port) > all_instance[instance] ) {
all_instance[instance]= length(port)
}
if (length(instance) > all_instance[instance] ) {
all_instance[instance]= length(instance)
}
}
END{
# host
printf "1%39s ","host:"
for (instance in all_instance){
printf "%-"all_instance[instance]"s ", all_instance_host[instance]
}
printf "\n"
# port
printf "2%39s ","port:"
for (instance in all_instance){
printf "%-"all_instance[instance]"s ", all_instance_port[instance]
}
printf "\n"
# instance
printf "3%39s ","instance:"
for (instance in all_instance){
printf "%-"all_instance[instance]"s ", all_instance_instance[instance]
}
printf "\n"
# stats
printf "4%39s ","established/refused:"
for (instance in all_instance){
printf "%-"all_instance[instance]"s ", all_instance_stats[instance]
}
printf "\n"
# header
printf "5%39s ","---------------------------------------"
for (instance in all_instance){
printf "%-"all_instance[instance]"s ", substr("----------------------------------------",1,all_instance[instance])
}
printf "\n"
# services
for (service in all_service){
printf "%-40s ",service
for (instance in all_instance){
if (all_service_instance[service,instance]!="") {
printf "%-"all_instance[instance]"s ", all_service_instance[service,instance]
} else {
printf "%-"all_instance[instance]"s ", ""
}
}
printf "\n"
}
}' qq='"'| sort

Of course, it can probably be improved, and better solutions may already exist; writing this was simply faster for me than looking for one. Feedback on Twitter, please.

List connections

This one formats the output of “show connections detail” like this:

In this example the last line shows that the host xxxc543 has a connection through CMAN gateway 1, which is a proxy for database instance xxxxx5254, service xxxbi. The connection has been established for 19 hours 57 minutes and has been idle for 2:37 minutes. It has received 67KB from the client, 100% of which has been sent to the instance, which returned 294KB, 100% of which has been transferred to the client.

Here is the script:

ps --no-headers -o pid,args -p$(pgrep -f "tnslsnr .* -mode proxy") |
while IFS=" " read pid tnslsnr args
do
# get environment variables
awk '
BEGIN{RS="\0"}
/^ORACLE_HOME|^TNS_ADMIN|^LD_LIBRARY_PATH/{printf "%s ",$0}
END{print cmctl,here}
' cmctl="$(dirname $tnslsnr)/cmctl" here="<<-'CMCTL'" /proc/$pid/environ
# name is probably the first arg without '-' nor '='
name=$(echo "$args"|awk 'BEGIN{RS=" "}/^[^-][^=]*$/{print;exit}')
echo "administer $name"
echo "show connections detail"
echo "show connections count"
echo "CMCTL"
done | sh | awk '
function time(s){if (s==4294967295){return "running "};d=int(s/86400);s=s-(d*86400);h=int(s/3600);s=s-(h*3600);m=int(s/60);s=s-(m*60);return sprintf("%dd+%02d:%02d:%02d",d,h,m,s)}
function size(b){u="";if(b>1024){b=b/1024;u="K"};if(b>1024){b=b/1024;u="M"};if(b>1024){b=b/1024;u="G"};return int(b)""u}
function hostname(h){a=h;if (h~/^[0-9.]+$/){"host "h| getline a;close("host "h);sub(/^.* /,"",a);sub(/[.]$/,"",a)}return a}
function pct(a,b){return int(100*a/b)"%"}
/Number of connections:/{print} #{print>"/dev/stderr"}
/Connection ID/{id=$NF}
/Gateway ID/{gw=$NF}
/Source/{src=$NF}/Destination/{dst=$NF}
/Service/{srv=$NF}/State/{sta=$NF}
/Idle time/{idl=$NF}/Connected time/{con=$NF}
/Bytes Received *.IN./{rci=$NF}
/Bytes Received *.OUT./{rco=$NF}
/Bytes Sent *.IN./{sni=$NF}/Bytes Sent *.OUT./{sno=$NF}
/^$/ || /The command completed successfully/{
if (con>0) printf "%-12s %60s %-40s %15s (idle: %9s) %9s>(%3d%%) %9s<(%3d%%) id:%6d gw:%3d\n", sta,hostname(src)"->"hostname(dst),srv,time(con),time(idl),size(rci),pct(sno,rci),size(rco),pct(sni,rco),id,gw
con=0}' | sort -k3,2

Both scripts call the Linux command “host” to resolve IP addresses to hostnames, with the following function:

function hostname(h){a=h;if (h~/^[0-9.]+$/){"host "h| getline a;close("host "h);sub(/^.* /,"",a);sub(/[.]$/,"",a)}return a}

Go to the log directory

Here is a similar script to find the log directory and cd to it:

cd $(
ps --no-headers -o pid,args -p$(pgrep -f "tnslsnr .* -mode proxy") |
while IFS=" " read pid tnslsnr args
do
# get environment variables
awk '
BEGIN{RS="\0"}
/^ORACLE_HOME|^TNS_ADMIN|^LD_LIBRARY_PATH/{printf "%s ",$0}
END{print cmctl,here}
' cmctl="$(dirname $tnslsnr)/cmctl" here="<<-'CMCTL'" /proc/$pid/environ
# name is probably the first arg without '-' nor '='
name=$(echo "$args"|awk 'BEGIN{RS=" "}/^[^-][^=]*$/{print;exit}')
echo "administer $name"
echo "show status"
echo "CMCTL"
done | sh | awk '/Instance Log directory/{print $NF}'
)

There, I run something like the following to aggregate some information:

{
# 11g log:
for i in $(hostname -s)_{1..9}* ; do [ -f $i ] && fuser $i && cat $i ; done
# 12c log:
[ -f log.xml ] && awk '/^ /{sub(/^ /,"");print}' log.xml
} | awk -F '*' '
toupper($1)$2==" service_update "{
update_count[$3]=update_count[$3]+1
next
}
$4==" establish "{
service_count[$5]=service_count[$5]+1
client_count[$2]=client_count[$2]+1
next
}
#{print >"/dev/stderr"}
END{
for (i in update_count){
printf "%6d update from\t%-s\n",update_count[i],i
}
for (i in service_count){
printf "%6d establish to\t%-s\n",service_count[i],i
}
for (i in client_count){
printf "%6d establish from\t%-s\n",client_count[i],i
}
}
' start=$(TZ=GMT+24 date +%d-%b-%Y)| sort -n

No more stale statistics in 19c

There is an odd contradiction that we all encounter for most databases, especially if they are predominantly used during the business day. Here is how that contradiction comes to be – it is in the way that we obtain and use optimizer  statistics on those databases. The contradiction runs like this:

  • To minimize service disruption, we gather statistics at a quiet time, for example, in the middle of the night
  • We then use those statistics during the business day whilst user activity is at its highest.
  • Highest user activity will typically mean the highest frequency of data changes.
  • Hence the statistics are at their peak accuracy when no-one is using them to optimize queries, and they are at their least accurate when everyone is using them to optimize queries!

We can demonstrate this easily with the following script run in 18c.


SQL> select banner from v$version where rownum = 1 ;

BANNER
--------------------------------------------------------------------------------
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production

1 row selected.

SQL> create table t (
  2    id number(10),
  3    num number(10),
  4    constraint t_pk primary key ( id )
  5  );

Table created.

SQL> insert into t select rownum,rownum from dual connect by rownum<=10000;

10000 rows created.

SQL> exec dbms_stats.gather_table_stats(user,'t')

PL/SQL procedure successfully completed.

SQL> select stale_stats from user_tab_statistics
  2  where  table_name = 'T';

STALE_S
-------
NO

So far I’ve mimicked a table that has been populated with (say) a day’s worth of data (10,000 rows), and I’ve gathered statistics at the end of the business day so that my statistics reflect the current data in the table.



SQL> select table_name, num_rows, blocks
  2  from   user_tab_statistics
  3  where  table_name = 'T';

TABLE_NAME NUM_ROWS     BLOCKS
---------- -------- ----------
T             10000         20

1 row selected.

SQL> select table_name, column_name, low_value, high_value, num_distinct
  2  from   user_tab_col_statistics
  3  where  table_name = 'T';

TABLE_NAME COLUMN_NAME          LOW_VALUE       HIGH_VALUE      NUM_DISTINCT
---------- -------------------- --------------- --------------- ------------
T          ID                   C102            C302                   10000
T          NUM                  C102            C302                   10000

2 rows selected.

Hence, if I was to optimize and run a query now, I can expect an excellent estimate from the optimizer. The query below returns a COUNT of 1000 rows, and we can see from the PLAN_TABLE output, that the estimated rows was also 1000. So a perfect estimate by the optimizer!



SQL> select max(num),count(*)
  2  from t where id > 9000;

  MAX(NUM)   COUNT(*)
---------- ----------
     10000       1000

1 row selected.

SQL> select *
  2  from table(dbms_xplan.display_cursor(format=>'typical'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------

---------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |       |       |
|   1 |  SORT AGGREGATE                      |      |     1 |     8 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |  1000 |  8000 |
|*  3 |    INDEX RANGE SCAN                  | T_PK |  1000 |       |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID">9000)


20 rows selected.

But of course, the next business day rolls on, and by mid-morning we may have added 1000 more rows to our table, bringing the total number of rows to 11000.


SQL> insert into t select rownum+10000,rownum+10000 from dual connect by rownum<=1000;

1000 rows created.

SQL> commit;

Commit complete.

Even with an explicit flush of the database monitoring information, we have not yet reached the threshold at which this table’s statistics would be considered stale, so even a GATHER_STALE operation would have no effect if we ran one. The statistics still reflect the data from last night.
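
(For context, and not part of the original post: the “stale” threshold being referred to here is the STALE_PERCENT statistics preference, which defaults to 10% of the rows changing, and these 1000 inserts on a 10,000-row table have not exceeded it. You can check the preference with dbms_stats.get_prefs.)

SQL> -- my addition: show the staleness threshold preference for table T
SQL> select dbms_stats.get_prefs('STALE_PERCENT',user,'T') from dual;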


SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

SQL> select stale_stats from user_tab_statistics
  2  where  table_name = 'T';

STALE_S
-------
NO

SQL> select table_name, num_rows, blocks
  2  from   user_tab_statistics
  3  where  table_name = 'T';

TABLE_NAME NUM_ROWS     BLOCKS
---------- -------- ----------
T             10000         20

1 row selected.

SQL> select table_name, column_name, low_value, high_value
  2  from   user_tab_col_statistics
  3  where  table_name = 'T';

TABLE_NAME COLUMN_NAME          LOW_VALUE       HIGH_VALUE
---------- -------------------- --------------- ---------------
T          ID                   C102            C302
T          NUM                  C102            C302

2 rows selected.

When I run the same query again (and don’t forget, this would now be during the business day, when it is critical that I get a good optimizer plan), you can see that the result is now 2000 rows, but the plan estimate is left languishing at its original 1000 rows because the statistics no longer reflect the current state of the table.


SQL> select max(num),count(*)
  2  from t where id > 9000;

  MAX(NUM)   COUNT(*)
---------- ----------
     11000       2000

1 row selected.

SQL> select *
  2  from table(dbms_xplan.display_cursor(format=>'typical'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------
SQL_ID  8juuu5warw2z6, child number 0
-------------------------------------
select max(num),count(*) from t where id > 9000

Plan hash value: 2053823973

---------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes |
---------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |       |       |
|   1 |  SORT AGGREGATE                      |      |     1 |     8 |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |  1000 |  8000 |
|*  3 |    INDEX RANGE SCAN                  | T_PK |  1000 |       |
---------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID">9000)


20 rows selected.

Having great statistics that are not used, and poor statistics that are used all the time seems a nonsensical way to do things, but there really hasn’t been any alternative unless you were prepared to consume precious server resources to collect statistics whilst your users are performing their transactions. I have known customers to do this in the past, but of course, it takes careful monitoring and management to ensure that the slicing up of the resource “pie” is done appropriately to keep the right balance between business needs and background database tasks.

19c brings a nifty solution to this contradiction by having the database automatically maintain details about how data is changing in a table via a feature called Real Time Statistics. DML activities can be tracked, and that information can then be looped back as input into future query parsing. Here’s the same example from above, this time performed in 19c.

The same table is seeded with the same initial 10000 rows as before. I have included a new column in my data dictionary queries called NOTES. You will see the significance of that shortly.



SQL> select banner from v$version where rownum = 1;

BANNER
--------------------------------------------------------------------------
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production

1 row selected.

SQL>
SQL> create table t (
  2    id number(10),
  3    num number(10),
  4    constraint t_pk primary key ( id )
  5  );

Table created.

SQL> insert into t select rownum,rownum from dual connect by rownum<=10000;

10000 rows created.

SQL> exec dbms_stats.gather_table_stats(user,'t')

PL/SQL procedure successfully completed.

SQL> select stale_stats from user_tab_statistics
  2  where  table_name = 'T';

STALE_S
-------
NO

1 row selected.

SQL> select notes, table_name, num_rows, blocks
  2  from   user_tab_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME NUM_ROWS     BLOCKS
-------------------------------- ---------- -------- ----------
                                 T             10000         20

1 row selected.

SQL> select notes, table_name, column_name, low_value, high_value, num_distinct
  2  from   user_tab_col_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME COLUMN_NAME          LOW_VALUE       HIGH_VALUE      NUM_DISTINCT
-------------------------------- ---------- -------------------- --------------- --------------- ------------
                                 T          ID                   C102            C302                   10000
                                 T          NUM                  C102            C302                   10000

And since this is a simple query, the optimizer performs exactly as per 18c for its estimate. Once again, at this point, where the data and statistics are in alignment, the estimate is perfect.


                                 
SQL> select max(num),count(*)
  2  from t where id > 9000;

  MAX(NUM)   COUNT(*)
---------- ----------
     10000       1000

1 row selected.                                 


SQL> select *
  2  from table(dbms_xplan.display_cursor(format=>'typical'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------

----------------------------------------------------------------------------------
| Id  | Operation                            | Name | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |      |       |       |     5 (100)|
|   1 |  SORT AGGREGATE                      |      |     1 |     8 |            |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| T    |  1000 |  8000 |     5   (0)|
|*  3 |    INDEX RANGE SCAN                  | T_PK |  1000 |       |     3   (0)|
----------------------------------------------------------------------------------

The business day commences, and a fresh 1000 rows are added to the database.



SQL> insert into t select rownum+10000,rownum+10000 from dual connect by rownum<=1000;

1000 rows created.

SQL> commit;

Commit complete.

And just like before, this is not enough for the statistics on the table to be considered stale by the database.



SQL> exec dbms_stats.flush_database_monitoring_info;

PL/SQL procedure successfully completed.

SQL>
SQL> select stale_stats from user_tab_statistics
  2  where  table_name = 'T';

STALE_S
-------
NO

But, here is where the 19c enhancements come into play. When we query the data dictionary, we now have TWO rows of statistics for this single table.



SQL> select notes, table_name, num_rows, blocks
  2  from   user_tab_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME NUM_ROWS     BLOCKS
-------------------------------- ---------- -------- ----------
                                 T             10000         20
STATS_ON_CONVENTIONAL_DML        T             11000         20

2 rows selected.

The first row is the original statistics we gathered. The second row is the new statistics that have been derived from the normal DML activities that have occurred on the database. This is not a full GATHER operation; it simply takes advantage of simple arithmetic, e.g. if I have 10,000 rows and then insert 50 and delete 20, I must now have 10,030 rows. The real-time statistics can also keep track of simple column-level statistics, such as whether the low and high water mark values have changed.
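
(An aside that is not part of the original demo: the raw DML counts that feed this arithmetic are visible in the *_TAB_MODIFICATIONS views once the monitoring information has been flushed, so you can see the 1000 inserts being tracked for T.)

SQL> -- my addition: inspect the tracked DML that the derived statistics are based on
SQL> select table_name, inserts, updates, deletes
  2  from   user_tab_modifications
  3  where  table_name = 'T';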



SQL> select notes, table_name, column_name, low_value, high_value
  2  from   user_tab_col_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME COLUMN_NAME          LOW_VALUE       HIGH_VALUE
-------------------------------- ---------- -------------------- --------------- ----------
                                 T          ID                   C102            C302
                                 T          NUM                  C102            C302
STATS_ON_CONVENTIONAL_DML        T          ID                   C102            C302094A
STATS_ON_CONVENTIONAL_DML        T          NUM                  C102            C302094A

The real test is whether these statistics can be beneficial to our sample query. I’ll run that through again.



SQL> select max(num),count(*)
  2  from t where id > 9000;

  MAX(NUM)   COUNT(*)
---------- ----------
     11000       2000

1 row selected.

SQL> select *
  2  from table(dbms_xplan.display_cursor(format=>'typical'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 2966233522

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |     7 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| T    |  1895 | 15160 |     7   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID">9000)

Note
-----
   - dynamic statistics used: statistics for conventional DML

Notice the estimate of 1895 is now much closer to the true value of 2000. You can expect a little variation here from the true value because, as I mentioned before, real-time statistics is not about performing a full gather operation on the table – that would be far too resource hungry for every single DML. So some elements of the statistics (for example, the number of distinct values, or the histogram distribution) are not feasible to keep up to date every time a DML is performed. But even the improved row counts and column extrema have yielded a benefit for the execution plan. When the normal nightly gather job comes along, these DML statistics are no longer relevant and will be expunged:


SQL> exec dbms_stats.gather_table_stats(user,'t')

PL/SQL procedure successfully completed.

SQL> select notes, table_name, num_rows, blocks
  2  from   user_tab_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME NUM_ROWS     BLOCKS
-------------------------------- ---------- -------- ----------
                                 T             11000         20

1 row selected.

SQL> select notes, table_name, column_name, low_value, high_value
  2  from   user_tab_col_statistics
  3  where  table_name = 'T';

NOTES                            TABLE_NAME COLUMN_NAME          LOW_VALUE       HIGH_VALUE
-------------------------------- ---------- -------------------- --------------- ----------
                                 T          ID                   C102            C3020B
                                 T          NUM                  C102            C3020B

2 rows selected.

There are other enhancements in 19c to keep the statistics in the dictionary closer to the true data in database tables which you can read about here.

Full disclosure: Some 19c new features are available on certain Oracle Database platforms only. Check the Oracle Database Licensing Guide for more information.

2019 Public Appearances (What In The World)

I’ll be presenting at a number of Oracle events over the remainder of the year. Details as follows: Oracle Open World – San Francisco (16-19 September 2019)

gather_system_stats

What happens when you execute dbms_stats.gather_system_stats() with the ‘Exadata’ option?
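
For reference, since this extract doesn’t show the call itself, the gather being discussed is simply the ‘EXADATA’ gathering mode:

SQL> exec dbms_stats.gather_system_stats('EXADATA')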

Here’s what my system stats look like (12.2.0.1 test results) after doing so. (The code to generate the two different versions is at the end of the note).


System Stats
============
Status: COMPLETED
Timed: 13-Aug-2019 15:00:00 - 13-Aug-2019 15:00:00
--------------------------------------------------
CPUSPEED        :
CPUSPEEDNW      :          918
IOSEEKTIM       :           10
IOTFRSPEED      :      204,800
MAXTHR          :
MBRC            :          128
MREADTIM        :
SLAVETHR        :
SREADTIM        :

PL/SQL procedure successfully completed.

MBRC       :          128
MREADTIM   :
SREADTIM   :
CPUSPEED   :
CPUSPEEDNW :          918
IOSEEKTIM  :           10
IOTFRSPEED :      204,800
MAXTHR     :
SLAVETHR   :

PL/SQL procedure successfully completed.

All the call does is set MBRC, IOSEEKTIM, and IOTFRSPEED to fixed values; the only value really gathered is CPUSPEEDNW. The parameters showing blanks are deliberately set to null by the procedure – before I called gather_system_stats() every parameter had a value. You could also check the SQL trace file (with bind capture enabled) to see the statements that deliberately set those parameters to null if you want more proof.

What are the consequences of this call (assuming you haven’t also done something with the calibrate_io() procedure)? Essentially Oracle now has information that says a single (8KB) block read request will take marginally over 10 milliseconds, and a multiblock read request of 1MB will take just over 15 milliseconds: in other words “tablescans are great, don’t use indexes unless they’re really precisely targeted”. To give you a quantitative feel for the numbers: given the choice between doing a tablescan of 1GB to pick up 1,500 randomly scattered rows and using a perfect index, the optimizer would choose the index.

To explain the time calculations: Oracle has set an I/O seek time of 10 ms, and a transfer rate of 204,800 bytes per ms (200 MB/s), with the guideline that a “typical” multiblock read is going to achieve 128 blocks. So the optimizer believes a single block read will take 10 + 8192/204800 ms = 10.04ms, while a multiblock read request for 1MB will take 10 + 1048576/204800 ms = 15.12 ms.
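
To sanity-check that arithmetic (my addition, assuming the 8KB block size mentioned above and the gathered values IOSEEKTIM=10, IOTFRSPEED=204,800 and MBRC=128):

SQL> -- single block read = seek + 8KB transfer; 1MB multiblock read = seek + 128 blocks transfer
SQL> select 10 + 8192/204800       sreadtim_ms,
  2         10 + (128*8192)/204800 mreadtim_ms
  3  from   dual;

SREADTIM_MS MREADTIM_MS
----------- -----------
      10.04       15.12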

It’s also important to note that Oracle will use the 128 MBRC value in its calculation of the cost of the tablescan – even if you’ve set the db_file_multiblock_read_count parameter for the session or system to something smaller; and if you have set db_file_multiblock_read_count, that’s the maximum size of multiblock read that the run-time engine will use.

Code

Here are the two procedures I used to report the values above. You will only need the privilege to execute the dbms_stats package for the second one, but you’ll need the privilege to access the SYS table aux_stats$ to use the first. The benefit of the first one is that it can’t go out of date as versions change.


rem
rem     Script:         get_system_stats.sql
rem     Author:         Jonathan Lewis
rem     Dated:          March 2002
rem
rem     Last tested
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

set linesize 180
set trimspool on
set pagesize 60
set serveroutput on

spool get_system_stats

declare
        m_value         number;
        m_status        varchar2(64);
        m_start         date;
        m_stop          date;
begin
        for r1 in (
                select  rownum rn, pname
                from    sys.aux_stats$
                where   sname = 'SYSSTATS_MAIN'
        ) loop
                dbms_stats.get_system_stats(m_status, m_start, m_stop, r1.pname, m_value);
                if r1.rn = 1 then
                        dbms_output.put_line('System Stats');
                        dbms_output.put_line('============');
                        dbms_output.put_line('Status: ' || m_status);
                        dbms_output.put_line(
                                'Timed: ' ||
                                to_char(m_start,'dd-Mon-yyyy hh24:mi:ss') ||
                                ' - ' ||
                                to_char(m_stop ,'dd-Mon-yyyy hh24:mi:ss')
                        );
                        dbms_output.put_line('--------------------------------------------------');
                end if;
                dbms_output.put_line(rpad(r1.pname,15) ||  ' : ' || to_char(m_value,'999,999,999'));
        end loop;
end;
/

declare
        m_value         number;
        m_status        varchar2(64);
        m_start         date;
        m_stop          date;
begin
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'MBRC', m_value);
        dbms_output.put_line('MBRC       : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'MREADTIM', m_value);
        dbms_output.put_line('MREADTIM   : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'SREADTIM', m_value);
        dbms_output.put_line('SREADTIM   : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'CPUSPEED', m_value);
        dbms_output.put_line('CPUSPEED   : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'CPUSPEEDNW', m_value);
        dbms_output.put_line('CPUSPEEDNW : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'IOSEEKTIM', m_value);
        dbms_output.put_line('IOSEEKTIM  : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'IOTFRSPEED', m_value);
        dbms_output.put_line('IOTFRSPEED : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'MAXTHR', m_value);
        dbms_output.put_line('MAXTHR     : ' || to_char(m_value,'999,999,999'));
        dbms_stats.get_system_stats(m_status, m_start, m_stop, 'SLAVETHR', m_value);
        dbms_output.put_line('SLAVETHR   : ' || to_char(m_value,'999,999,999'));
end;
/

spool off

Azure VMs with Oracle- Next Steps

Microsoft has done a great job of documenting how to create a VM with the appropriate VM image and Oracle version, then how to log in, start up the listener and create the database. I just have some enhancements I’d like to make to it, hoping to help it move up one level.

I Bequeath to You

All instructions provided by the Microsoft documentation show how to connect to the database using a bequeath (BEQ) connection. This is done with the following command, using SQL*Plus (or similar):

sqlplus / as sysdba

It bypasses the need for a connection string, only requiring the SID to be set at the environment level:

export ORACLE_SID=

This information, for Oracle DBAs, is often gathered from a running database server executing the following command:

ps -ef | grep pmon

For the standard database user, accessing an Oracle database in a standard capacity, the connection is performed via TNS (Transparent Network Substrate). This requires a running Oracle listener to accept incoming connections, with the connection resolved using information stored in the sqlnet.ora and tnsnames.ora files. There are a couple of additional steps that an Oracle DBA without Azure experience may not know, or that an Azure specialist without Oracle experience might miss, and they can result in a chicken-and-egg challenge; I’m going to document them here to make it easier, and we’ll take that challenge on in this blog post.

Follow the documentation and perform the steps to create the VM, but stop at the “Create the Database” step (don’t log into the VM, su over to the Oracle user, or start the listener). Instead, let’s log into the Azure Portal and go to the VM you’ve just created.

Why do you want to go to the portal instead of starting the listener, (and if you’ve started the listener, why do you want to shut it back down and go to the portal)?

If you do log into the VM and start the listener, you could verify the name of the host that is set by default as part of the VM creation by querying the status of the running listener:

[Screenshot: listener status output showing the default generated host name]

Host=oracledb2.qzvde3m5s4y…..  what??  cx.internal.cloudapp.net?  Aw, hell no.

Not the best of names, and not something you want to keep for any length of time if you want to try to connect with TNS. So, as an Oracle DBA, you may be asking how to fix this.

Configure DNS in Azure

Although there is a way to do this as part of the Azure VM creation, if you’re following the directions it won’t have happened, so let’s log into the portal and address the DNS issue.

  1. Go to the Azure Portal and login.
  2. Go to the Resource group where the VM was deployed to.
  3. Click on the VM from the list.

In the top section of the overview, in the last block on the right-hand side, along with pertinent information about the VM such as the private and public IP addresses, you’ll also see the option to configure the DNS name:

[Screenshot: Azure Portal VM overview showing the DNS name configuration option]

Click on Configure and then you’ll be taken to the page to configure it.

Again, make sure your listener is shut down, as it relies on the updates we’re making to understand its “location”; this will save you from having to run a kill command on the listener PID to force it to shut down.

[Screenshot: DNS name configuration page]

  1. Change the IP address assignment to Static - yes, this will require a reboot of the VM; that’s alright.
  2. Add the short name of the host, in this case, oracledb2.  Note: the long name for our example will be oracledb2.eastus2.cloudapp.azure.com.
  3. Click on Save and restart the VM to update the new settings if required.
  4. Log back in and verify everything.

The portal will now look similar to the following (although you will have a private and public IP address showing, along with the DNS name):

[Screenshot: VM overview after the DNS name has been configured]

Verify the Linux Host

Go back to your ssh terminal, log into the VM and check the host name by typing in the following command:

uname -a

Now let’s configure the /etc/hosts file with the new information.  This must be done as the server admin login you use to ssh into the host.  It has the sudo privileges needed to perform these steps:

sudo vi /etc/hosts

You should see the following entries:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Append the following information on a new line to the file:

Example:

40.17.124.235 oracledb2 oracledb2.eastus2.cloudapp.azure.com

Save the file (ESC, :wq) and su (switch user) to Oracle.

  1. Start the listener.
  2. Note the host name change:

[Screenshot: listener status showing the updated host name]

Much better!!

Now follow the directions in the documentation to create the database with DBCA (the Database Configuration Assistant).

Once you’ve completed all the steps from the documentation, we can then return to complete the TNS configuration.

TNS Setup

Now proceed to set up the TNS connection:

  1. SU over to the Oracle user.
  2. CD to $ORACLE_HOME/network/admin
  3. We’re going to create two new files to support TNS connections.
vi tnsnames.ora

Format of the entry, placing your alias, host short name, and service name in where appropriate:

 =
 (DESCRIPTION =
  (ADDRESS_LIST =
   (ADDRESS = (PROTOCOL = TCP)(HOST = )(PORT = 1521))
  )
  (CONNECT_DATA =
   (SERVICE_NAME = )
  )
 )

Once finished, type in :wq to write to the file and quit.
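
For illustration only (the alias ORCL and service name orcl are hypothetical values of mine; the host short name follows the oracledb2 example used above), a completed entry might look like:

ORCL =
 (DESCRIPTION =
  (ADDRESS_LIST =
   (ADDRESS = (PROTOCOL = TCP)(HOST = oracledb2)(PORT = 1521))
  )
  (CONNECT_DATA =
   (SERVICE_NAME = orcl)
  )
 )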

Let’s say you don’t know what the service name is for your database (again, for those Azure folks who are just starting to acclimate to Oracle…). There are a few different ways to find out. You can use your bequeath login to verify:

sqlplus / as sysdba
SQL> select name from dba_services
  2  where name not in ('SYS$BACKGROUND','SYS$USERS')
  3  and name not like '%XDB';

OR gather the info from the listener:

lsnrctl services

Both of these actions must be performed as the ORACLE user and the first one requires the ORACLE_SID to be set for the syntax used.

Now let’s create the sqlnet.ora file:

vi sqlnet.ora

Format of the entry:

NAMES.DIRECTORY_PATH= (TNSNAMES)

Again, type in :wq to write and quit when completed.

This tells the client to use the tnsnames.ora file for name resolution. There are numerous other naming methods, but I haven’t included them in this post. Let’s test our connection now:

sqlplus @

Enter in the password for the user and you should connect.

  • If you receive a 12154, check that the TNS alias you used matches the name in your tnsnames.ora entry.
  • For a 12170, check the services listed by the listener against what you put in the TNS entry.

Now your Oracle VM is set up to support basic TNS connections through the listener. If a more advanced connection security setup is required, such as Kerberos, then that will have to be installed and configured on the Linux VM, but that’s a post for another day.

I’m headed out this week to speak at NoCOUG (Northern California Oracle User Group) on Thursday and will then be heading to Louisiana to speak at SQL Saturday Baton Rouge this weekend. Looking forward to seeing everyone at both these events and the customers in between!

 

 




How 19c Auto Indexes are named?

As a SQL_ID-like base 32 hash on table owner, name, column list

The indexes created by the 19c Auto Indexing feature have a generated name like: “SYS_AI_gg1ctjpjv92d5”. I don’t like to rely on the names: there’s an AUTO column in DBA_INDEXES to flag the indexes created automatically.

But one thing is very nice: the name is not random. The same index (i.e. on the same table and columns) will always have the same name, even when dropped and re-created, and even when created in a different database. This makes it easy to track them (for example, quickly searching my e-mails to find the same issue encountered in another place), just as we do with SQL_ID.

Yes, the generation of the name is similar to SQL_ID as it is the result of a 64-bit number from a hash function, displayed in base 32 with alphanumeric characters.

The hash function is SYS_OP_COMBINED_HASH applied on the table owner, table name and column list. Yes, the same function that is used by extended statistics column groups. Why not? All that is developed by the CBO team. They re-use their own functions.

So, from the previous post, I have the following indexes created by Auto Index feature:

SQL> select owner,index_name,object_id,auto from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where auto='YES';
OWNER    INDEX_NAME              OBJECT_ID    AUTO
________ _______________________ ____________ _______
ADMIN    SYS_AI_gg1ctjpjv92d5          73,192 YES
ADMIN    SYS_AI_8u25mzzr6xw1v          73,231 YES
ADMIN    SYS_AI_26rdw45ph3hag          73,232 YES

Let’s take the first one.

SQL> ddl "SYS_AI_gg1ctjpjv92d5"
CREATE INDEX "ADMIN"."SYS_AI_gg1ctjpjv92d5" 
ON "ADMIN"."WORDS" ("SOUND") AUTO;

Here is the hash from table owner and name (without quotes) and column list (quoted):

SQL> select SYS_OP_COMBINED_HASH('ADMIN','WORDS','"SOUND"')
from dual;
   SYS_OP_COMBINED_HASH('ADMIN','WORDS','"SOUND"')
__________________________________________________
17835830731812932005

And I’m using Nenad Noveljic’s conversion to base 32 from:

Converting HASH_VALUE to SQL_ID - All-round Database Topics

gg1ctjpjv92d5 is the base 32 hash value for this table/columns definition and the Auto Index created was: SYS_AI_gg1ctjpjv92d5

If you are connected as SYS, there’s an internal function for this base32 conversion (the one from SQL Plan Directives used to store the SQL_ID of the Dynamic Sampling query since 12cR2, which caches the result of dynamic sampling in the SPD rather than using the Result Cache as in 12cR1):

SQL> select ltrim(SYS.DBMS_SPD_INTERNAL.UB8_TO_SQLID( 17835830731812932005 ),'0') from dual;
LTRIM(SYS.DBMS_SPD_INTERNAL.UB8_TO_SQLID(17835830731812932005),'0')
____________________________________________________________________
gg1ctjpjv92d5

For compound indexes, here is an example:

SQL> ddl "SYS_AI_26rdw45ph3hag"
CREATE INDEX "ADMIN"."SYS_AI_26rdw45ph3hag" 
ON "ADMIN"."WORDS" ("CAP", "LOW", "UPP") AUTO;

The hash of the columns is calculated on the space-free, comma-separated, quoted column list:

SQL> select SYS_OP_COMBINED_HASH
('ADMIN','WORDS','"CAP","LOW","UPP"')
from dual;
SYS_OP_COMBINED_HASH('ADMIN','WORDS','"CAP","LOW","UPP"')
__________________________________________________
2548399815876788559
SQL> select ltrim(SYS.DBMS_SPD_INTERNAL.UB8_TO_SQLID( 2548399815876788559 ),'0') from dual;
LTRIM(SYS.DBMS_SPD_INTERNAL.UB8_TO_SQLID(2548399815876788559),'0')
____________________________________________________________________
26rdw45ph3hag

Here it is. The hash is 26rdw45ph3hag on the columns indexed by SYS_AI_26rdw45ph3hag.

Back to Nenad’s function, here is how to generate the AI name for any existing index:

with function TO_SQLID(n number) return varchar2 as
--https://nenadnoveljic.com/blog/converting-hash_value-to-sql_id/
base32 varchar2(16);
begin
select
listagg(substr('0123456789abcdfghjkmnpqrstuvwxyz',
mod(trunc(n/power(32,level-1)),32)+1,1)
) within group (order by level desc) into base32
from dual
connect by level <= ceil(log(32,n+1));
return base32;
end;
select table_owner,table_name,cols,'SYS_AI_'||
to_sqlid(sys_op_combined_hash(table_owner,table_name,cols))
AI_INDEX_NAME
from (
select table_owner,table_name,index_name
,listagg('"'||column_name||'"',',')
within group(order by column_position) cols
from dba_ind_columns
--where index_name like 'SYS_AI%'
group by table_owner,table_name,index_name
);

Of course, once the index is created its name and definition are accessible. But knowing that the name is a predictable hash will help to manage Automatic Indexes.

With this post and the previous one, you have all the information needed to rename a manually created index to an Auto Index name and set the AUTO flag. Of course, don’t do that. Auto Indexing keeps metadata about the SQL Tuning Sets and Auto Index DDL actions, and faking the AUTO flag will make all that inconsistent.

These examples on the WORDS table come from the demo I’m preparing for my Oracle Open World session on Auto Indexing:

Oracle Database 19c Automatic Indexing Demystified
Thursday, Sept. 19th, 02:15 PM in Moscone West — Room 3020A

Also many sessions on the same topic:

Session Catalog

How to drop an index created by Oracle 19c Auto Indexing?

ORA-65532: cannot alter or drop automatically created indexes

Oracle 19c Automatic Indexing is not like the autonomous features that happen without your control. You can decide to enable it (if you are on a platform that allows it) or not, and in report-only or implementation mode.

But when you have enabled it to create new indexes, you are not supposed to revert its effect. What if you want to drop those indexes?

DROP INDEX

If I want to drop an index that has been created automatically (i.e with the AUTO=’YES’ in DBA_INDEXES) I get the following error:

SQL> select owner,index_name,auto,tablespace_name from dba_indexes where auto='YES';
OWNER    INDEX_NAME              AUTO    TABLESPACE_NAME
________ _______________________ _______ __________________
ADMIN    SYS_AI_8u25mzzr6xw1v    YES     AITBS
ADMIN    SYS_AI_gg1ctjpjv92d5    YES     AITBS
ADMIN    SYS_AI_26rdw45ph3hag    YES     AITBS
SQL> drop index ADMIN."SYS_AI_8u25mzzr6xw1v";
drop index ADMIN."SYS_AI_8u25mzzr6xw1v"
*
ERROR at line 1:
ORA-65532: cannot alter or drop automatically created indexes

ALTER INDEX

I get the same error if I try to make it invisible (so that at least it is not used by the queries) or unusable (so that it is not maintained by the DML):

SQL> alter index ADMIN."SYS_AI_8u25mzzr6xw1v" invisible;
alter index ADMIN."SYS_AI_8u25mzzr6xw1v" invisible
*
ERROR at line 1:
ORA-65532: cannot alter or drop automatically created indexes
SQL> alter index ADMIN."SYS_AI_8u25mzzr6xw1v" unusable;
alter index ADMIN."SYS_AI_8u25mzzr6xw1v" unusable
*
ERROR at line 1:
ORA-65532: cannot alter or drop automatically created indexes

IND$.PROPERTY unsupported hack

In ?/rdbms/admin/cdcore_ind.sql the definition for DBA_INDEXES defines AUTO as:

decode(bitand(i.property, 8), 8, 'YES', 'NO'),
...
from ... sys.ind$ i ...

In ?/rdbms/admin/dcore.bsq the comment for this IND$ flag is probably wrong (probably an old flag being re-used for the Auto-Index feature):

property number not null,/* immutable flags for life of the index */
/* unique : 0x01 */
/* partitioned : 0x02 */
/* reverse : 0x04 */
/* compressed : 0x08 */
/* functional : 0x10 */

The comment is wrong, but the important thing is that the AUTO attribute is defined as an immutable property rather than as a mutable flag.

This gives me a way to drop an index that has been created by the Auto Index feature, but it is totally unsupported, undocumented and probably very dangerous. Here is the OBJECT_ID:

SQL> select owner,index_name,object_id,auto,tablespace_name from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where auto='YES';
OWNER INDEX_NAME           OBJECT_ID AUTO TABLESPACE_NAME
_____ ____________________ _________ ____ _______________
ADMIN SYS_AI_8u25mzzr6xw1v     73191 YES  AITBS
ADMIN SYS_AI_gg1ctjpjv92d5     73192 YES  AITBS
ADMIN SYS_AI_26rdw45ph3hag     73193 YES  AITBS

The property 0x8 is set:

SQL> select property from sys.ind$ where obj#=73191;
PROPERTY
----------
8

I un-flag it:

SQL> show user
show user
USER is "SYS"
SQL> update sys.ind$ set property=property-8 
where bitand(property,8)=8 and obj#=73191;
1 row updated.

Not anymore flagged as AUTO:

SQL> select owner,index_name,object_id,auto,tablespace_name from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where index_name like 'SYS_AI%';
OWNER INDEX_NAME           OBJECT_ID AUTO TABLESPACE_NAME
_____ ____________________ _________ ____ _______________
ADMIN SYS_AI_8u25mzzr6xw1v     73191 NO   AITBS
ADMIN SYS_AI_gg1ctjpjv92d5     73192 YES  AITBS
ADMIN SYS_AI_26rdw45ph3hag     73193 YES  AITBS

And I can now drop it:

SQL> drop index ADMIN."SYS_AI_8u25mzzr6xw1v";
Index dropped.

Again, this is totally unsupported: don’t do that!

SQL> select owner,index_name,object_id,auto,tablespace_name from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where index_name like 'SYS_AI%';
OWNER INDEX_NAME           OBJECT_ID AUTO TABLESPACE_NAME
_____ ____________________ _________ ____ _______________
ADMIN SYS_AI_gg1ctjpjv92d5     73192 YES  AITBS
ADMIN SYS_AI_26rdw45ph3hag     73193 YES  AITBS

DROP TABLESPACE

In a more supported way, I can drop all AUTO indexes by dropping the tablespace where they reside. If I plan to do that, I’ve probably defined a specific tablespace for them (rather than the default tablespace for the user):

SQL> select parameter_name,parameter_value from dba_auto_index_config order by 1;
                    PARAMETER_NAME    PARAMETER_VALUE
__________________________________ __________________
AUTO_INDEX_COMPRESSION             OFF
AUTO_INDEX_DEFAULT_TABLESPACE      AITBS
AUTO_INDEX_MODE                    IMPLEMENT
AUTO_INDEX_REPORT_RETENTION        31
AUTO_INDEX_RETENTION_FOR_AUTO      373
AUTO_INDEX_RETENTION_FOR_MANUAL
AUTO_INDEX_SCHEMA
AUTO_INDEX_SPACE_BUDGET            50
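
(A side note of mine, not from the original post: the dedicated tablespace shown above is whatever AUTO_INDEX_DEFAULT_TABLESPACE points to, and it is set with a call along these lines.)

SQL> -- sketch: point Auto Indexing at a dedicated tablespace (AITBS, as configured above)
SQL> exec dbms_auto_index.configure('AUTO_INDEX_DEFAULT_TABLESPACE','AITBS')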

Dropping the tablespace simply removes all the indexes created there:

SQL> drop tablespace AITBS including contents;
Tablespace dropped.

MOVE and DROP

I may not want to drop all of them. What if I move one index into a new tablespace? I don’t want to actually rebuild it; unusable is OK for me:

SQL> alter index ADMIN."SYS_AI_26rdw45ph3hag" rebuild tablespace EPHEMERAL unusable;
alter index ADMIN."SYS_AI_26rdw45ph3hag" rebuild tablespace EPHEMERAL unusable
*
ERROR at line 1:
ORA-14048: a partition maintenance operation may not be combined with other operations

Well, I don’t know how to do this without rebuilding it. So let’s do this:

SQL> create tablespace EPHEMERAL nologging;
Tablespace created.
SQL> alter user admin quota unlimited on EPHEMERAL;
User altered.
SQL> alter index ADMIN."SYS_AI_26rdw45ph3hag" rebuild tablespace EPHEMERAL online;
Index altered.

This works, so not all ALTER INDEX commands fail with an ORA-65532.

SQL> select owner,index_name,object_id,auto,tablespace_name from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where index_name like 'SYS_AI%';
OWNER INDEX_NAME           OBJECT_ID AUTO TABLESPACE_NAME
_____ ____________________ _________ ____ _______________
ADMIN SYS_AI_gg1ctjpjv92d5     73192 YES  AITBS
ADMIN SYS_AI_26rdw45ph3hag     73193 YES  EPHEMERAL

And I can now drop this tablespace that contains only this index:

SQL> drop tablespace EPHEMERAL including contents;
Tablespace dropped.

Goal achieved, in a supported way:

SQL> select owner,index_name,object_id,auto,tablespace_name from dba_indexes natural left outer join (select owner index_owner,object_name index_name,object_id from dba_objects where object_type='INDEX') where index_name like 'SYS_AI%';
OWNER INDEX_NAME           OBJECT_ID AUTO TABLESPACE_NAME
_____ ____________________ _________ ____ _______________
ADMIN SYS_AI_gg1ctjpjv92d5     73192 YES  AITBS

“_optimizer_use_auto_indexes”=OFF

Finally, if I don’t want to use the AUTO indexes, I don’t have to drop them. There’s a parameter to disable the use of them.

Here is a query using my AUTO index:

SQL> select count(*) from admin.words where sound='H400';
COUNT(*)
___________
152
SQL> select * from dbms_xplan.display_cursor(format=>'allstats last');
PLAN_TABLE_OUTPUT
____________________________________________________________________
SQL_ID bdbr7vnx88x7z, child number 0
-------------------------------------
select count(*) from admin.words where sound='H400'
Plan hash value: 335171867
-------------------------------------------------------------------
| Id  | Operation         | Name                 |A-Rows| Buffers |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                      |     1|       3 |
|   1 |  SORT AGGREGATE   |                      |     1|       3 |
|*  2 |   INDEX RANGE SCAN| SYS_AI_gg1ctjpjv92d5 |   152|       3 |
-------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SOUND"='H400')

Now, disabling all Auto Index at my session level:

SQL> alter session set "_optimizer_use_auto_indexes"=OFF;
Session altered.
SQL> select count(*) from admin.words where sound='H400';
   COUNT(*)
___________
152
SQL> select * from dbms_xplan.display_cursor(format=>'allstats last +outline');
PLAN_TABLE_OUTPUT
____________________________________________________________________
SQL_ID bdbr7vnx88x7z, child number 1
-------------------------------------
select count(*) from admin.words where sound='H400'
Plan hash value: 1662541906
----------------------------------------------------------------
| Id  | Operation          | Name  | Starts | A-Rows | Buffers |
----------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |      1 |      1 |    1598 |
|   1 |  SORT AGGREGATE    |       |      1 |      1 |    1598 |
|*  2 |   TABLE ACCESS FULL| WORDS |      1 |    152 |    1598 |
----------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("SOUND"='H400')

Do you really want to drop them?

Note that if you drop them, then you probably also want to disable Auto Indexing altogether, or they will probably re-appear:

exec dbms_auto_index.configure('AUTO_INDEX_MODE','REPORT ONLY');
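For what it’s worth, if the goal is to stop the feature creating any new indexes at all, rather than keeping it in report-only mode, the same procedure accepts the value OFF – a minimal sketch:

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE','OFF');

As far as I understand it, in REPORT ONLY mode the candidate indexes are still created (as invisible indexes) for reporting purposes, whereas OFF stops automatic index creation completely.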

And then it is your decision whether or not to create the indexes.

But remember that, in theory, the presence of an index should not have bad side effects on its own: with correct statistics, the CBO can decide whether or not to use it. And the Auto Indexing feature also has a way to blacklist some auto-created indexes from specific queries if a regression has been encountered.
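If you want to see what the feature has actually decided – indexes created, verified, or excluded after a regression – the activity report is the place to look. Here’s a minimal sketch using the documented DBMS_AUTO_INDEX.REPORT_ACTIVITY function with its default 24-hour window (output not shown here):

SQL> set long 100000
SQL> set pagesize 0
SQL> select dbms_auto_index.report_activity() from dual;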

Multi-table

Here’s a problem (and I think it should be called a bug) that I first came across about 6 years ago, then forgot for a few years until it reappeared some time last year and then again a few days ago. The problem has been around for years (getting on for decades), and the first mention of it that I’ve found is MoS Bug 2891576, created in 2003, referring back to Oracle 9.2.0.1. The problem still exists in Oracle 19.2 (tested on LiveSQL).

Here’s the problem: assume you have a pair of tables (call them parent and child) with a referential integrity constraint connecting them. If the constraint is enabled and not deferred then the following code may fail, and if you’re really unlucky it may only fail on rare random occasions:


insert all
        into parent({list of parent columns}) values({list of source columns})
        into child ({list of child columns})  values({list of source columns})
select
        {list of columns}
from    {source}
;

The surprising Oracle error is “ORA-02291: integrity constraint ({owner.constraint_name}) violated – parent key not found”, and the reason is simple (and documented in MoS note 265826.1 “Multi-table Insert Can Cause ORA-02291: Integrity Constraint Violated for Master-Detail tables”): the order in which the insert operations take place is “indeterminate”, so child rows may be inserted before their parent rows (and for the multi-table insert the constraint checks are not postponed until the statement completes as they are, for instance, for updates to a table with a self-referencing RI constraint).

Two possible workarounds are suggested in Doc ID 265826.1

  • drop the foreign key constraint and recreate it after the load,
  • make the foreign key constraint deferrable and defer it before the insert so that it is checked only on commit (or following an explicit call to make it immediate)

The second option would probably be preferable to the first but it’s still not a very nice thing to do and could leave your database temporarily exposed to errors that are hard to clean up. There are some details of the implementation of deferrable constraints in the comments of this note on index rebuilds if you’re interested in the technicalities.
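To make the second option concrete, here is a sketch using the simple parent/child tables from the bug note reproduced below (the constraint name fk1 matches that example – adapt the names to your own schema):

alter table child drop constraint fk1;

alter table child add constraint fk1
        foreign key (id) references parent (id)
        deferrable initially immediate;

set constraints fk1 deferred;

insert all
        into parent ( id ) values ( id )
        into child  ( id ) values ( id )
select  100 id from dual
;

commit;         -- a deferred constraint is checked here, not per array insert

Because the constraint is declared “initially immediate” it reverts to statement-level checking in any transaction that doesn’t explicitly defer it, so the exposure is limited to the loading transaction.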

A further option which seems to work is to create a (null) “before row insert” trigger on the parent table – this appears to force the parent into a pattern of single row inserts and the order of insertion into the tables then seems to behave. Of course you do pay the price of an increase in the volume of undo and redo. On the down-side, Bug 2891576 “MULTITABLE INSERT FAILS WITH ORA-02291 WHEN FK & TRIGGER ARE PRESENT” can also be found on MoS, leading 265826.1 to suggest disabling triggers if their correctness is in some way dependent on the order in which your tables are populated. That dependency threat should be irrelevant if the trigger is a “do nothing” trigger. Sadly there’s a final note that I should mention: Bug 16133798 : INSERT ALL FAILS WITH ORA-2291 reports the issue as “Closed: not a bug”.

There is a very simple example in the original bug note demonstrating the problem, but it didn’t work on the version of Oracle where I first tested it, so I’ve modified it slightly to get it working on a fairly standard install. (I suspect the original was executed on a database with a 4KB block size.)


drop table child purge;
drop table parent purge;

create table parent (id number primary key);

create table child  (id number, v1 varchar2(4000),v2 varchar2(3920));
alter table child add constraint fk1 foreign key (id) references parent (id);
 
create or replace trigger par_bri
before insert on parent
for each row
begin
        null;
end;
/

insert all
        into parent ( id ) values ( id )
        into child  ( id ) values ( id )
select  100 id from dual
;

In the model above, and using an 8KB block in ASSM, the code as shown resulted in an ORA-02291 error. When I changed the varchar2(3920) to varchar2(3919) the insert succeeded, and when I kept the varchar2(3920) but created the trigger the insert also succeeded.

Fiddling around in various ways and taking some slightly more realistic table definitions, here’s an initial setup to demonstrate the “randomness” of the failure (tested on various versions up to 18.3.0.0):


rem
rem     Script:         insert_all_bug.sql
rem     Author:         Jonathan Lewis
rem     Dated:          May 2018
rem
rem     Last tested 
rem             18.3.0.0
rem             12.2.0.1
rem             12.1.0.2
rem             10.2.0.5
rem              9.2.0.8
rem

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                          id,
        lpad(rownum,10,'0')             small_vc,
        lpad(rownum,100,'0')            medium_vc,
        lpad(rownum,200,'0')            big_vc
from
        generator       v1
;

create table parent(
        id              number,
        small_vc        varchar2(10),
        medium_vc       varchar2(100),
        big_vc          varchar2(200),
        constraint par_pk primary key(id)
)
segment creation immediate
;

create table child(
        id              number,
        small_vc        varchar2(10),
        medium_vc       varchar2(100),
        big_vc          varchar2(200),
        constraint chi_pk primary key(id),
        constraint chi_fk_par foreign key (id) references parent(id)
)
segment creation immediate
;

create table child2(
        id              number,
        small_vc        varchar2(10),
        medium_vc       varchar2(100),
        big_vc          varchar2(200),
        constraint ch2_pk primary key(id),
        constraint ch2_fk_par foreign key (id) references parent(id)
)
segment creation immediate
;

I’ve created a “source” table t1, and three “target” tables – parent, child and child2. Table parent has a declared primary key and both child and child2 have a referential integrity constraint to parent. I’m going to do a multi-table insert selecting from t1 and spreading different columns across the three tables.

Historical note: When I first saw the “insert all” option of multi-table inserts I was delighted with the idea that it would let me query a de-normalised source data set just once and insert the data into a normalised set of tables in a single statement – so (a) this is a realistic test from my perspective and (b) it has come as a terrible disappointment to discover that I should have been concerned about referential integrity constraints (luckily very few systems had them at the time I last used this feature in this way).

The multi-table insert I’ve done is as follows:


insert all
        into parent(id, small_vc)  values(id, small_vc)
        into child (id, medium_vc) values(id, medium_vc)
        into child2(id, medium_vc) values(id, medium_vc)
--      into child2(id, big_vc)    values(id, big_vc)
select
        id, small_vc, medium_vc, big_vc
from
        t1
where
        rownum <= &m_rows_to_insert
;

You’ll notice that I’ve allowed user input to dictate the number of rows selected for insertion and I’ve also allowed for an edit to change the column that gets copied from t1 to child2. Although it’s not visible in the create table statements I’ve also tested the effect of varying the size of the big_vc column in t1.
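For the big_vc variant discussed below, the edit is just a matter of swapping the commented line – a sketch of the edited statement:

insert all
        into parent(id, small_vc)  values(id, small_vc)
        into child (id, medium_vc) values(id, medium_vc)
--      into child2(id, medium_vc) values(id, medium_vc)
        into child2(id, big_vc)    values(id, big_vc)
select
        id, small_vc, medium_vc, big_vc
from
        t1
where
        rownum <= &m_rows_to_insert
;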

Starting with the CTAS and multi-table insert as shown, the insert runs to completion if I select 75 rows from t1, but if I select 76 rows the insert fails with “ORA-02291: integrity constraint (TEST_USER.CHI_FK_PAR) violated – parent key not found”. If I change the order of the inserts into child and child2 the violated constraint is TEST_USER.CH2_FK_PAR – so Oracle appears to be applying the inserts in the order they appear in the statement, in some circumstances.

Go back to the original order of inserts for child and child2, but use the big_vc option for child2 instead of the medium_vc. In this case the insert succeeds for 39 rows selected from t1, but fails reporting constraint TEST_USER.CH2_FK_PAR when selecting 40 rows. Change the CTAS to define big_vc as lpad(rownum,195) and the insert succeeds with 40 rows selected and fails on 41 (still on the CH2_FK_PAR constraint); change big_vc to lpad(rownum,190) and the insert succeeds on 41 rows selected, fails on 42.

My hypothesis on what’s happening is this: each table in the multitable insert list gets a buffer of 8KB (maybe matching one Oracle block if we were to try different block sizes). As the statement executes the buffers will fill and, critically, when the buffer is deemed to be full (or full enough) it is applied to the table – so if a child buffer fills before the parent buffer is full you can get child rows inserted before their parent, and it looks like Oracle isn’t postponing foreign key checking to the end of statement execution as it does with other DML – it’s checking as each array is inserted.

Of course there’s a special boundary condition, and that’s why the very first test with 75 rows succeeds – neither of the child arrays gets filled before we reach the end of the t1 selection, so Oracle safely inserts the arrays for parent, child and child2 in that order. The same boundary condition applies in the first of every other pair of tests that I’ve commented on.

When we select 76 rows from t1 in the first test the child and child2 arrays hit their limit and Oracle attempts to insert the child rows first – but the parent buffer is far from full so its rows are not inserted and the attempted insert results in the ORA-02291 error. Doing a bit of rough arithmetic, the insert was for 76 rows totalling something like: 2 bytes for the id plus a length byte, plus 100 bytes for the medium_vc plus a length byte, giving 76 * 104 = 7,904 bytes.
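If you want a quick sanity check of that arithmetic against the actual data, something like the following works – the two “1 +” terms stand in for the assumed column length bytes, which the query itself obviously can’t see:

select
        count(*)                                  rows_selected,
        sum(1 + vsize(id) + 1 + vsize(medium_vc)) child_bytes
from
        t1
where
        rownum <= 76
;

With the t1 definition above this should report 76 rows and 7,904 bytes, matching the rough figure in the previous paragraph.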

When we switch to using the big_vc for child2 the first array to fill is the child2 array, and we have 3 sets of results as we shorten big_vc:

  • 40 * ((1 + 2) + (1 + 200)) = 8160
  • 41 * ((1 + 2) + (1 + 195)) = 8159
  • 42 * ((1 + 2) + (1 + 190)) = 8148

While I’m fairly confident that my “8KB array” hypothesis is in the right ballpark I know I’ve still got some gaps to explain – I don’t like the fact that I’ve got a break point around 7,900 in the first example and something much closer to 8,192 in the other three examples. I could try to get extra precision by running up a lot more examples with different numbers and lengths of columns to get a better idea of where the error is appearing – but I’m sufficiently confident that the idea is about right that I can’t persuade myself to make the effort to refine it. An example of an alternative algorithm (which is actually a better fit, though a little unexpected) is to assume that the normal 5-byte row overhead (column count, lock byte, flags and 2-byte row directory entry) has been included in the array sizing code, and that the insert takes place at the point where the incoming row breaks, or just touches, the limit. In this case our 4 results would suggest the following figures:

  • 75 * 109 = 8175
  • 39 * 209 = 8151
  • 40 * 204 = 8160
  • 41 * 199 = 8159

With these numbers we can see 8KB (8,192 bytes) very clearly, and appreciate that the one extra row would take us over the critical limit, hence triggering the insert and making the array space free to hold the row.

Bottom Line

If you’re using the multi-table “insert all” syntax and have referential integrity declared between the various target tables then you almost certainly need to ensure that the foreign key constraints are declared as deferrable and then deferred as the insert takes place, otherwise you may get random (and, until now, surprisingly inexplicable) ORA-02291 foreign key errors.

A possible alternative workaround is to declare a “do nothing” before row insert trigger on the top-level (parent) table. This seems to switch the top-most parent into single-row inserts, which in turn forces the other array inserts to take place alongside their parent row with small array sizes, protecting against the foreign key error. This is not an officially sanctioned workaround, though, and may only have worked by accident in the examples I tried.

It is possible, if the 8KB working array hypothesis is correct, that you will never see the ORA-02291 if the volume of data (number of rows * row length) for the child rows of any given parent row is always less than the size of the parent row – but that might be a fairly risky thing to hope for in a production system. It might be much better to pay the overhead of deferred foreign key checking than to have a rare, unpredictable error appear.