Oakies Blog Aggregator

Identifying Redundant Indexes in PeopleSoft

This is the first of a two-part series that discusses how to identify and remove redundant indexes from a PeopleSoft system.
I recently wrote a series of articles on my Oracle blog about identifying redundant indexes on an Oracle database. The same approach can be taken with PeopleSoft, but using the metadata in the PeopleTools tables rather than the database catalogue.
(This means you could take a similar approach for PeopleSoft on databases other than Oracle, but I can't say anything about the behaviour of other optimizers.)
The following query returns a report of superset and redundant subset indexes in a PeopleSoft system. It can be downloaded as a script from my website (psredundant.sql).
The generic Oracle version of this script will identify unique subset indexes as redundant, because the primary key and unique constraints can be altered to use the superset index.  However, this query does not mark unique subset indexes as redundant, because PeopleSoft does not explicitly create unique constraints, only unique indexes.  If I disabled a unique index, I would not be able to maintain the unique constraint via Application Designer.
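For the record, the generic (non-PeopleSoft) case looks roughly like this - a sketch with hypothetical object names, not part of the PeopleSoft script: a unique constraint on T(A) can be enforced via the non-unique superset index T_AB(A,B), after which the unique subset index that previously enforced it can be dropped.

REM Sketch, hypothetical names: repoint a unique constraint at a
REM non-unique superset index, then drop the unique subset index.
ALTER TABLE t DROP CONSTRAINT t_a_uk KEEP INDEX;
DROP INDEX t_a_uk;
ALTER TABLE t ADD CONSTRAINT t_a_uk UNIQUE (a) USING INDEX t_ab;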
Subquery IC returns a list of columns for each index. Subqueries UNI, AS2 and USI retrieve columns for unique/duplicate key, alternate search and user-defined indexes respectively. The absolute value of key position is not important for this exercise; it is merely used to sort the column list in the listagg() function in subquery I.

REM psredundant.sql
set lines 160 pages 50 trimspool on
column recname format a15
column superset_index format a60
column redundant_index format a60
break on recname skip 1 on indexid skip 1
spool psredundant

WITH uni AS (/*unique indexes*/
SELECT /*+MATERIALIZE*/ f.recname, i.indexid
, MIN(i.uniqueflag) OVER (PARTITION BY f.recname) uniqueflag
, CASE WHEN MAX(CASE WHEN f.recname != f.recname_parent THEN 1 ELSE 0 END) OVER (PARTITION BY f.recname)=1
THEN f.fieldnum ELSE k.keyposn END keyposn
, k.fieldname
FROM psrecfielddb f
, psindexdefn i
, pskeydefn k
WHERE i.recname IN(f.recname,f.recname_parent)
AND i.recname = k.recname
AND k.fieldname = f.fieldname
AND i.indexid = '_' /*unique index*/
AND k.indexid = i.indexid
AND bitand(f.useedit,3) > 0 /*unique or dup key*/
), as0 AS (/*leading column on alternate search indexes*/
SELECT f0.recname, k0.indexid, i0.uniqueflag, 0 keyposn, f0.fieldname
FROM psrecfielddb f0
, psindexdefn i0
, pskeydefn k0
WHERE bitand(f0.useedit,16) = 16 /*alternate search key*/
AND k0.recname = f0.recname_parent
AND k0.fieldname = f0.fieldname
AND i0.recname = k0.recname
AND i0.indexid = k0.indexid
AND i0.indexid BETWEEN '0' AND '9' /*alternate search index*/
), as1 AS ( /*now add unique columns*/
SELECT as0.recname, as0.indexid, as0.uniqueflag, as0.keyposn, as0.fieldname
FROM as0
UNION ALL /*append unique key index columns*/
SELECT as0.recname, as0.indexid, as0.uniqueflag, uni.keyposn, uni.fieldname
FROM as0, uni
WHERE as0.recname = uni.recname
), as2 AS (
SELECT as1.recname, as1.indexid, as1.uniqueflag, NVL(k.keyposn,as1.keyposn), as1.fieldname
FROM as1
LEFT OUTER JOIN pskeydefn k /*to handle custom key order*/
ON k.recname = as1.recname
AND k.indexid = as1.indexid
AND k.fieldname = as1.fieldname
), usi AS (/*user indexes*/
SELECT i.recname, i.indexid, i.uniqueflag, k.keyposn, k.fieldname
FROM psindexdefn i
, pskeydefn k
WHERE k.recname = i.recname
AND k.indexid = i.indexid
AND k.indexid BETWEEN 'A' AND 'Z'
AND i.platform_ora = 1
), m AS (/*merge three kinds of index here*/
SELECT uni.recname, uni.indexid, uni.uniqueflag, uni.keyposn, uni.fieldname
FROM uni
UNION ALL
SELECT as1.recname, as1.indexid, as1.uniqueflag, as1.keyposn, as1.fieldname
FROM as1
UNION ALL
SELECT usi.recname, usi.indexid, usi.uniqueflag, usi.keyposn, usi.fieldname
FROM usi
), ic AS ( /*list of columns, restrict to tables*/
SELECT r.recname, m.indexid, m.uniqueflag, m.keyposn, m.fieldname
FROM m
, psrecdefn r
WHERE r.rectype IN(0,7)
AND r.recname = m.recname
), i AS ( --construct column list
SELECT /*+ MATERIALIZE*/
ic.recname, ic.indexid, ic.uniqueflag
, count(*) num_columns
, listagg(ic.fieldname,',') within group (order by ic.keyposn) AS fieldlist
FROM ic
GROUP BY ic.recname, ic.indexid, ic.uniqueflag
)
SELECT r.recname
, i.indexid||' ('||i.fieldlist||')' superset_index
, r.indexid||' ('||r.fieldlist||')' redundant_index
FROM i
, i r
WHERE i.recname = r.recname
AND i.indexid != r.indexid
AND r.uniqueflag = 0 /*non-unique redundant*/
AND i.fieldlist LIKE r.fieldlist||',%'
AND i.num_columns > r.num_columns
order by r.recname, r.indexid
/

spool off

This is part of the output for an HR 9.2 demo database. In all, it identified 58 indexes on 41 records. However, some of those records are temporary records with several physical instances, so the number of database objects that will be removed is higher.

RECNAME         SUPERSET_INDEX                                               REDUNDANT_INDEX
--------------- ------------------------------------------------------------ ------------------------------------------------------------

HRS_APP_PROFILE C (HRS_AL_LOG_ID,HRS_PERSON_ID)                              B (HRS_AL_LOG_ID)

HRS_SJT_JO      B (HRS_JOB_OPENING_ID,SCRTY_TYPE_CD)                         A (HRS_JOB_OPENING_ID)

HR_PROMOTE_XREF C (EMPLID,EMPL_RCD)                                          A (EMPLID)
                B (EMPLID,EMPL_RCD,ACTION_DT_SS,EFFSEQ)                      A (EMPLID)
                B (EMPLID,EMPL_RCD,ACTION_DT_SS,EFFSEQ)                      C (EMPLID,EMPL_RCD)

HR_SSTEXT_EFFDT _ (OBJECTOWNERID,HR_SSTEXT_SUB_ID,TEXT_ID,EFFDT)             A (OBJECTOWNERID,HR_SSTEXT_SUB_ID,TEXT_ID)

HR_TRANSFR_XREF C (EMPLID,EMPL_RCD)                                          A (EMPLID)
                B (EMPLID,EMPL_RCD,ACTION_DT_SS,EFFSEQ)                      A (EMPLID)
                B (EMPLID,EMPL_RCD,ACTION_DT_SS,EFFSEQ)                      C (EMPLID,EMPL_RCD)

PA_ALIAS_USE    0 (ALIAS_TYPE,ALIAS_NAME)                                    A (ALIAS_TYPE)

WRK_XREF_CALC   _ (PROCESS_INSTANCE,SEQ_NBR,XREF_NUM)                        A (PROCESS_INSTANCE,SEQ_NBR)
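As a concrete (hypothetical) illustration: redundant index A on temporary record WRK_XREF_CALC in the report above has one physical copy per table instance, so counting the matching index segments in the data dictionary shows how many database objects a single report line represents.

REM Sketch, assuming the usual PeopleSoft naming convention
REM PS<indexid><recname>, with instance numbers appended for
REM temporary table instances (PSAWRK_XREF_CALC, PSAWRK_XREF_CALC1, ...)
SELECT index_name, table_name
FROM   dba_indexes
WHERE  index_name LIKE 'PSAWRK_XREF_CALC%'
ORDER BY index_name;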

The articles on my Oracle blog discuss how to identify redundant indexes on an Oracle database, and then how to go about removing them.  Of course, in PeopleSoft, you should only remove indexes by making the change within Application Designer, and that is what the next posting discusses.

Test the Upgrade to #Oracle 12c Yourself: Mainz, February 2-3


The German Oracle user group (Deutsche Oracle Anwender Gruppe, DOAG) is offering this interactive workshop. It is a very good opportunity to carry out the upgrade yourself with the support of renowned experts. I will contribute a talk on the topic of Common vs. Local, which is one of the trickier subjects that newcomers to Multitenant are confronted with.

Tagged: DOAG

The Most Important Tool for SQL Tuning

Summary: SQLT is a tool that collects comprehensive information on all aspects of a SQL performance problem. SQL tuning experts know that EXPLAIN PLAN is only the proverbial tip of the iceberg, but this fact is not well recognized by the Oracle database community, so much evangelization is necessary.

I remember the time I was trying to solve a production problem a long time ago. I did not have any tools but I was good at writing queries against the Oracle data dictionary. How does one find the PID of an Oracle dedicated server process? Try something like this:

select spid from v$process where addr=(select saddr from v$session where sid = '&sid')

My boss was not amused.

After the incident, he got me a license for Toad.

Writing queries against the data dictionary is macho but it is not efficient.

Tools are in.

Fast forward many years. Now one of my specialties is tuning SQL statements. There is a general lack of appreciation for the vast spectrum of reasons that may lie behind poor performance of a SQL statement. Therefore, the opening gambit is usually “Here is the EXPLAIN PLAN. What’s wrong?”

First of all, note that EXPLAIN PLAN does not take bind variables into account and therefore the EXPLAIN PLAN may not be the plan that was used at run time. Further, plans can change at any time for dozens of reasons: bind variables, statistics, cardinality feedback, adaptive query optimization, etc.
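If the statement has already run, it is usually better to pull the actual plan from the cursor cache; a minimal sketch (substitute your own SQL_ID and child number):

select * from table(dbms_xplan.display_cursor('&sql_id', null, 'TYPICAL'));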

What if the query plan is just fine but the environment is having difficulties?

And don't expect me to tune a SQL statement without all the information about the tables and indexes involved in the query, all the information that feeds into plan generation, the execution history of the SQL statement, and detailed information about the environment in which the statement is executing.

Does the query refer to views or PL/SQL functions? I need information on them.

Why not give me a test case, perhaps even with data?

Carlos Sierra was a genius who worked in the SQL performance team of Oracle Support for many, many years. He realized that the process of collecting the necessary information was unsystematic, long, and drawn out. He therefore created a tool called SQLT (the T stands for Trace) to collect all the required information and present it in the form of an HTML report.

View definitions? Check.

Optimizer settings? Check.

Environment information? Check.

Execution history? Check.

There’s even a 10053 trace and a test case that I can run in my own environment.

Here’s a screen shot of the HTML report that I mentioned.

Outlines and patches and profiles and baselines and directives, oh my!

Security policies? Really?

Since the SQLT tool was produced by Oracle Support, there's a Support note on it. But there's even a whole book on it, written by Stelios Charalambides, one of Carlos's partners in crime.

The Case of the Foolish Fix

Performance of many queries had worsened when the database was upgraded from version 11.2.0.3 to 12.1.0.2. In this case, the problem was definitely the query plan. In version 11.2.0.3, the cost-based optimizer picked the right driving table but not in 12.1.0.2. I used a SQLT feature called XPLORE to solve the problem. First, I set up a test case in my own environment. Then I unleashed XPLORE on it. It systematically changed the values of optimizer settings and “fix controls” to see how the optimizer would respond in each case. After an hour and a half of processing—or so it seemed—it hit pay dirt. A certain fix, first inserted into version 11.2.0.4, had corrected the “single table cardinalities” in the presence of transitive predicates. Unfortunately, the CBO was producing worse plans with the corrected cardinalities. While this is certainly cause for consternation, the problem had been identified. Restoring performance was now simply a matter of disabling the fix.
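Disabling an individual fix is done through the "_fix_control" parameter; a sketch, with 9999999 standing in for the actual fix number (which I haven't reproduced here):

alter session set "_fix_control" = '9999999:0';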

The Case of the Puzzling Policy

A certain query executed lightning fast in the development environment and dog-slow in the production environment. The problem had been puzzling everybody for months. When I reviewed the SQLT report, I saw the curious filter “NULL IS NOT NULL” in the development environment. This told me that a security policy was in effect and that it was disabling part of the query plan in the development environment. ‘Nuff said.
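If you suspect the same thing in your own environment, a quick check (a sketch; the table names are hypothetical) is to list the security policies on the tables involved:

select object_owner, object_name, policy_name
from   dba_policies
where  object_name in ('ORDERS', 'ORDER_LINES'); -- hypothetical table names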

The Case of the Curious Constraints

The DBA insisted that there was no difference between two environments, and yet the query performed better - much better - in one of them. I used the COMPARE feature of SQLT but could not find any differences between the two environments that could explain the oddity. It would seem that SQLT had failed me. But I was able to write simple queries on the data that SQLT had collected from the two environments, and I found that three table constraints in the faster environment were disabled in the slower environment. Why would disabling constraints make queries run slower? Join elimination, you see. The DBA said that he was aware of the disabled constraints but had not considered them important enough to mention. Well, live and learn.
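The checks themselves were nothing exotic; something along these lines (hypothetical table names) exposes the difference. RELY and VALIDATED matter too, because join elimination needs constraints the optimizer can trust:

select owner, table_name, constraint_name, constraint_type, status, validated, rely
from   dba_constraints
where  table_name in ('ORDERS', 'ORDER_LINES') -- hypothetical table names
order  by owner, table_name, constraint_name;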

The Case of the Immense Indexes

The customer was loading hundreds of billions of rows of new data into a table using INSERT statements, a process that was expected to last weeks. However, the insertion rate had been slowing down dramatically as the days passed, and what had been expected to take weeks would likely take months. It was clear from the wait-event information in the SQLT report that Oracle was spending a big chunk of time updating two indexes, but what was the reason for the slowdown? There was enough execution-history information in the AWR report for me to graph the slowdown over time and prove that it was directly proportional to the size of the indexes. It was too late to redesign the loading strategy, but at least I was able to explain the problem to the customer.
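For completeness, the mitigation I would have weighed up front is to take the index maintenance out of the loop (a sketch; the index name is hypothetical). With skip_unusable_indexes left at its default of TRUE, conventional DML skips maintenance of a non-unique unusable index, although this is not an option for indexes that enforce uniqueness:

alter index big_fact_idx unusable;
-- run the long INSERT-based load, then:
alter index big_fact_idx rebuild parallel 8 nologging;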

Always a But

Unfortunately SQLT is not perfect. The biggest problem is that it is written in PL/SQL, which must be loaded into the target database. And it creates lots of tables in a special schema. This is anathema to production environments. My other pet peeve about SQLT is that it does not support STATSPACK. The creators of SQLT believed that most customers are licensed for AWR, but I do not believe that's the case. Finally, SQLT is only available from My Oracle Support, so you're out of luck if you do not have access to it.

And a Word of Caution

It is easy to get very excited about SQLT but it's not the be-all and end-all. There will always be cases where SQLT is not enough. For example, you might find the answer you are looking for in an AWR report. Remember to keep an open mind.

Taking the Plunge

Read the Support note first, then download SQLT, install it in a test environment, and play with it. Don't forget the COMPARE and XPLORE features. If you've got a few spare dollars left over from Christmas shopping, I recommend the book by Stelios Charalambides.
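From memory, basic usage looks like the following; check the Support note for the exact, current steps:

-- install once, connected as SYS:
START sqlt/install/sqcreate.sql
-- collect everything for one statement, connected as the application user:
START sqlt/run/sqltxtract.sql &sql_id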

Make learning about SQLT your New Year’s resolution. Happy learning.

Iggy Fernandez is the editor of the NoCOUG Journal, the print journal of the oldest-running Oracle user group in the world, which celebrated thirty years last year. He was an Oracle DBA for many years but now specializes in SQL tuning. He used to speak at Oracle conferences all over the country but is now completely bored with that gig, except in the case of the RMOUG Training Days, which treats its speakers so well, even picking them up at the airport and taking them back.

Auto Sync for Password Files in #Oracle 12c Data Guard


A useful enhancement in 12cR2 is that password files are automatically synchronized in a Data Guard environment:

[oracle@uhesse1 ~]$ dgmgrl sys/oracle_4U@cdb1
DGMGRL for Linux: Release 12.2.0.1.0 - Production on Tue Jan 10 07:38:15 2017

Copyright (c) 1982, 2016, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "cdb1"
Connected as SYSDBA.
DGMGRL> show configuration;

Configuration - myconf

  Protection Mode: MaxPerformance
  Members:
  cdb1   - Primary database
    cdb1sb - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 31 seconds ago)

DGMGRL> exit
[oracle@uhesse1 ~]$ sqlplus sys/oracle_4U@cdb1 as sysdba

SQL*Plus: Release 12.2.0.1.0 Production on Tue Jan 10 07:38:49 2017

Copyright (c) 1982, 2016, Oracle.  All rights reserved.


Connected to:
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production

SQL> alter user sys identified by oracle;

User altered.

SQL> connect sys/oracle@cdb1 as sysdba
Connected.
SQL> connect sys/oracle@cdb1sb as sysdba
Connected.

I have just changed the password of SYS on the primary, and it was automatically changed on the standby as well. In earlier versions that had to be done manually, typically by replacing the password file on the standby with a new copy from the primary. One little thing less to keep in mind now!
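If you want to verify the synchronization yourself, the password file entries are exposed through V$PWFILE_USERS, so a quick check on both primary and standby is:

select username, sysdba, sysoper, sysdg from v$pwfile_users;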

Join Elimination 12.2

From time to time someone comes up with the question of whether or not the order of tables in the from clause of a SQL statement should make a difference to execution plans and performance. Broadly speaking the answer is no, although there are a couple of boundary cases where a difference can appear unexpectedly.

When considering join permutations the optimizer has a few algorithms for picking an initial join order and then deciding how to permute from that order. One of the criteria with the very lowest priority (i.e. when all other factors are equal) is the order in which the tables appear in the from clause, so if you have enough tables in the from clause it's possible for the subset of join orders considered to change if you change the from clause in a way that causes the initial join order to change.

It’s been over 11 years since I wrote the article I’ve linked to in the previous paragraph and in that time no-one has yet approached me with other examples of a plan changing due to a change in the from clause order (though, with all the transformations now available to the optimizer, I wouldn’t be surprised if a few cases have appeared occasionally, so feel free to let me know if you think you’ve got an interesting example that I can experiment on).

A little while ago, though, while testing a feature enhancement in 12.2, I finally came across a case where a real difference appeared. Here’s the query I was using – I’ll give you the SQL to reproduce the tables at the end of the article:


select 
	count(c.small_vc_c)
from 
	grandparent	g, 
	parent		p,
	child		c
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

As you will see later on, the three tables grandparent, parent and child have the obvious primary keys and referential integrity constraints. This means that grandparent has a single-column primary key, parent has a two-column primary key, and child has a three-column primary key. The query joins the three tables along their primary keys and then selects data only from the child table, so it's a good candidate for join elimination.

In earlier versions of Oracle join elimination could take place only if the primary key you joined to was a single column key, so 12.1 and earlier would be able to eliminate just the grandparent from this three-table join; but in 12.2 multi-column primary keys also allow join elimination to take place, so we might hope that the plan we get from this query would eliminate both the grandparent and parent tables. Here’s the plan (pulled from memory after execution):

SQL_ID  8hdybjpn2884b, child number 0
-------------------------------------
select  count(c.small_vc_c) from  grandparent g,  parent  p,  child  c
where  c.small_num_c between 200 and 215 and p.id   = c.id_p and p.id_g
= c.id_g and g.id   = p.id_g

Plan hash value: 4120004759

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |       |       |    26 (100)|          |
|   1 |  SORT AGGREGATE     |       |     1 |    23 |            |          |
|   2 |   NESTED LOOPS      |       |    85 |  1955 |    26   (4)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| CHILD |    85 |  1615 |    26   (4)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN| G_PK  |     1 |     4 |     0   (0)|          |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter(("C"."SMALL_NUM_C"<=215 AND "C"."SMALL_NUM_C">=200))
   4 - access("G"."ID"="C"."ID_G")

It didn't work quite as expected. The optimizer has managed to eliminate table parent - so it looks as if "single column primary key" join elimination has worked, but "multi-column" join elimination hasn't appeared. On the other hand, I've not followed my usual rules for writing SQL, so let's try again. If I had followed my usual pattern, my from clause would have been in the order child -> parent -> grandparent - listing the tables in the order I expect to visit them. Here's the plan - again pulled from memory - after making this visual change to the SQL:


SQL_ID  1uuq5vf4bq0gt, child number 0
-------------------------------------
select  count(c.small_vc_c) from  child  c,  parent  p,  grandparent g
where  c.small_num_c between 200 and 215 and p.id   = c.id_p and p.id_g
= c.id_g and g.id   = p.id_g

Plan hash value: 1546491375

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |       |       |    26 (100)|          |
|   1 |  SORT AGGREGATE    |       |     1 |    15 |            |          |
|*  2 |   TABLE ACCESS FULL| CHILD |    85 |  1275 |    26   (4)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("C"."SMALL_NUM_C"<=215 AND "C"."SMALL_NUM_C">=200))

So join elimination based on multi-column primary keys does work – but you might have to get a bit lucky in the order you list the tables in the from clause.

Footnote.

If you’re wondering whether or not switching from Oracle syntax to ANSI syntax would make a difference, it does: with ANSI syntax both grandparent and parent are eliminated if the SQL lists the tables in the order grandparent -> parent -> child (i.e. the order which doesn’t work properly for Oracle syntax) and only the parent is eliminated for the order child -> parent -> grandparent. In other words, both syntax options have a point of failure but they fail the opposite way around.

Code:


rem
rem	Script:		join_elimination_12c2.sql
rem	Author:		Jonathan Lewis
rem	

-- Environment details eliminated

define m_pad=100

/*
	IDs will be 1 to 1000
*/

create table grandparent 
as
select 
	rownum			id,
	trunc((rownum-1)/5)	small_num_g,
	rpad(rownum,10)		small_vc_g,
	rpad(rownum,&m_pad)	padding_g
from 
	all_objects 
where 
	rownum <= 1000
;

/*
	Each GP has two (scattered) children here
	Parent IDs are 1 to 2,000
*/

create table parent 
as
select 
	1+mod(rownum,1000)	id_g,
	rownum			id,
	trunc((rownum-1)/5)	small_num_p,
	rpad(rownum,10)		small_vc_p,
	rpad(rownum,&m_pad)	padding_p
from 
	all_objects 
where 
	rownum <= 2000
;

/*
	Simple trick to get 5 (clustered) children per parent
	Child IDs are 1 to 12,000
*/

create table child 
as
select 
	id_g,
	id			id_p,
	rownum			id,
	trunc((rownum-1)/5)	small_num_c,
	rpad(rownum,10)		small_vc_c,
	rpad(rownum,&m_pad)	padding_c
from 
	parent	p,
	(
		select /*+ no_merge */ 
			rownum 
		from	parent p 
		where	rownum <= 5
	)	d
;

create unique index g_pk on grandparent(id);
create unique index p_pk on parent     (id_g, id)       compress 1;
create unique index c_pk on child      (id_g, id_p, id) compress 2;

alter table grandparent add constraint g_pk primary key (id);
alter table parent      add constraint p_pk primary key (id_g, id);
alter table child       add constraint c_pk primary key (id_g, id_p, id);

alter table parent add constraint p_fk_g foreign key (id_g)       references grandparent;
alter table child  add constraint c_fk_p foreign key (id_g, id_p) references parent;

rem
rem	Don't need to collect stats because it's 12c
rem

prompt	===============================================================
prompt	Join all three tables with the FROM clause ordered gp -> p -> c
prompt	The final plan is GP->C, The optimizer eliminated P before
prompt	considering GP
prompt	===============================================================

select 
	count(c.small_vc_c)
from 
	grandparent	g, 
	parent		p,
	child		c
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	===============================================================
prompt	Join all three tables with the FROM clause ordered c -> p -> gp
prompt	The final plan is a tablescan of C only. The optimizer managed 
prompt	to eliminate GP first and P second
prompt	===============================================================

select 
	count(c.small_vc_c)
from 
	child		c,
	parent		p,
	grandparent	g 
where
	c.small_num_c between 200 and 215
and	p.id   = c.id_p
and	p.id_g = c.id_g
and	g.id   = p.id_g
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	==================================================
prompt	Convert to ANSI standard in the order gp -> p -> c
prompt	and both gp and p eliminated.
prompt	==================================================

select 
	count(c.small_vc_c)
from 
	grandparent	g
join
	parent		p
on	p.id_g = g.id
join
	child		c
on	c.id_p = p.id
and	c.id_g = p.id_g
where
	c.small_num_c between 200 and 215
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

prompt	===================================================
prompt	Convert to ANSI standard in the order c -> p -> gp
prompt	and only p is eliminated. 
prompt	===================================================

select 
	count(c.small_vc_c)
from 
	child		c
join
	parent		p
on      p.id   = c.id_p 
and	p.id_g = c.id_g 
join
	grandparent	g
on	g.id = p.id_g 
where
	c.small_num_c between 200 and 215
;

select * from table(dbms_xplan.display_cursor(null,null,'outline'));

It's possible, of course, that with different system stats, I/O calibration, extent sizes, segment space management, block sizes, or sundry other parameter details, you won't be able to reproduce the results without messing about a little bit, but I don't think I've done anything special in the setup that would make a real difference.

Footnote:

If you’re wondering why the “traditional” and “ANSI” syntax should exhibit this flaw for joins in the opposite direction – remember that ANSI SQL is first transformed into an equivalent Oracle form and – in the simple cases – the first two tables form the first query block then each table after that introduces a new query block, so the optimizer strategy does (approximately) the following translation:


select ... from grandparent join parent join child

==>

select ... from (select ... from grandparent join parent) join child

The optimizer then optimizes the inline query, which eliminates grandparent leaving a join between parent and child, which then allows parent to be eliminated.

Conversely we get:

select ... from child join parent join grandparent 

==>

select ... from (select ... from child join parent) join grandparent

In this form the optimizer eliminates parent from the inline view and is left with a join between child and grandparent – so no further elimination.

Removing Unnecessary Indexes: 3. Cost of Maintaining Indexes

This is the third post in a series about unnecessary indexes and some of the challenges that they present.
In this post I want to look at how to quantify the overhead of index maintenance, and estimate the benefit of removing redundant indexes.
Let's conduct an experiment. I will set up the same table and indexes as in the previous posts in this series, but this time I will put them into the RECYCLE buffer pool, to which I have allocated the minimum possible 4M of memory, and I will allocate 25% free space to the indexes. The idea is not to benefit from the buffer cache, but to suffer the pain of the physical I/O.

CREATE TABLE t (a NUMBER, b NUMBER, c NUMBER, d NUMBER
,CONSTRAINT t_pk PRIMARY KEY (a) USING INDEX STORAGE (BUFFER_POOL RECYCLE)
) STORAGE(BUFFER_POOL RECYCLE);

CREATE INDEX t_ab ON t (a,b) STORAGE(BUFFER_POOL RECYCLE) PCTFREE 25;
CREATE INDEX t_bc ON t (b,c) STORAGE(BUFFER_POOL RECYCLE) PCTFREE 25;
CREATE INDEX t_bcd ON t (b,c,d) STORAGE(BUFFER_POOL RECYCLE) PCTFREE 25;

INSERT /*+APPEND*/ INTO t
WITH x AS (
SELECT rownum-1 n FROM DUAL connect by level <= 1E5)
SELECT n
, MOD(n,100)
, ROUND(MOD(n,100),-1)
, dbms_random.value(1,100)
FROM x
/
CREATE INDEX T_BCD_ROUND ON T (B, C, ROUND(D,0)) STORAGE(BUFFER_POOL RECYCLE) PCTFREE 25;
CREATE INDEX T_BCD_DESC ON T (B, C, D DESC) STORAGE(BUFFER_POOL RECYCLE) PCTFREE 25;

EXEC dbms_stats.gather_table_stats(null,'T',cascade=>TRUE,method_opt=>'FOR ALL COLUMNS SIZE 1 FOR COLUMNS SIZE 1 (B,C)');

The table and indexes are really too large to fit in the recycle buffer pool, so there will be physical reads as blocks are loaded into the buffer cache, and physical writes as dirty blocks are pushed out.

SELECT s.owner, s.segment_name
, s.blocks, s.bytes/1024/1024 Mb
, t.blocks
FROM dba_segments s
, dba_tables t
WHERE t.table_name = 'T'
AND s.segment_type = 'TABLE'
AND s.segment_name = t.table_name
/

OWNER      SEGMENT_NAME     BLOCKS         MB     BLOCKS
---------- ------------ ---------- ---------- ----------
SYSADM     T                   640          5        543

SELECT s.owner, s.segment_name
, s.blocks, s.bytes/1024/1024 Mb
, i.leaf_blocks
FROM dba_segments s
, dba_indexes i
WHERE s.owner = i.owner
AND s.segment_name = i.index_name
AND i.table_name = 'T'
/

OWNER      SEGMENT_NAME     BLOCKS         MB LEAF_BLOCKS
---------- ------------ ---------- ---------- -----------
SYSADM     T_PK                256          2         187
SYSADM     T_AB                256          2         237
SYSADM     T_BC                256          2         213
SYSADM     T_BCD               512          4         488
SYSADM     T_BCD_ROUND         384          3         336
SYSADM     T_BCD_DESC          768          6         675

Now I will run a PL/SQL loop that randomly updates rows in the table with random values, 500,000 times. The random nature of the updates will minimize the benefit of caching.  The database will continuously have to load blocks from disk back into the buffer cache.

DECLARE
l_counter INTEGER := 0;
l_t1 DATE;
l_r1 INTEGER;
l_r2 INTEGER;
l_module VARCHAR2(64);
l_action VARCHAR2(64);
BEGIN
dbms_application_info.read_module(l_module, l_action);
dbms_application_info.set_module('REDUNDANT','UPDATE TEST');
l_t1 := SYSDATE + &&runtime/86400;
WHILE l_t1 >= SYSDATE AND l_counter < 5e5 LOOP
l_r1 := round(dbms_random.value(1,100),0);
l_r2 := round(dbms_random.value(1,100),0);
UPDATE t
SET c = l_r2
, d = ROUND(l_r2,-1)
WHERE a = l_r1;
l_counter := l_counter + 1;
COMMIT;
END LOOP;
dbms_output.put_line(l_counter||' rows updated');
dbms_application_info.set_module(l_module, l_action);
END;
/

The activity in the buffer cache is visible via v$buffer_pool_statistics.

select * from v$buffer_pool_statistics where name = 'RECYCLE';

Before:

        ID NAME                 BLOCK_SIZE  SET_MSIZE  CNUM_REPL CNUM_WRITE   CNUM_SET    BUF_GOT  SUM_WRITE   SUM_SCAN
---------- -------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
FREE_BUFFER_WAIT WRITE_COMPLETE_WAIT BUFFER_BUSY_WAIT FREE_BUFFER_INSPECTED DIRTY_BUFFERS_INSPECTED DB_BLOCK_CHANGE
---------------- ------------------- ---------------- --------------------- ----------------------- ---------------
DB_BLOCK_GETS CONSISTENT_GETS PHYSICAL_READS PHYSICAL_WRITES
------------- --------------- -------------- ---------------
         2 RECYCLE                    8192        490        490          0        490     482786     758695          0
           65466               22517                0                889060                  612270        49971577
     71349961        18260691         385878          758695

After:

         2 RECYCLE                    8192        490        459         31        490     518646     853441          0
           73457               25942                0                981089                  688777        57003938
     81763622        19260981         413463          853441

We can see there have been physical reads and writes on objects in the recycle pool.
This is the execution plan of the statement in the PL/SQL loop. The update occurs in line 1 of the plan.

SQL_ID  3f305xbd6ts1d, child number 1
-------------------------------------
UPDATE T SET C = :B2 , D = ROUND(:B2 ,-1) WHERE A = :B1

Plan hash value: 795017363
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | UPDATE STATEMENT   |      |       |       |     2 (100)|          |
|   1 |  UPDATE            | T    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| T_PK |     1 |    33 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------

Profile DML Overhead

Now, I will profile the recent ASH data for my test (filtered by module and action) by object and event, but we will only look at line 1 of execution plans where the SQL_OPNAME indicates a DML statement. Thus we are looking exclusively at the overhead of modifying data, and not the overhead of finding it in the first place.
The current_obj# and current_file# columns are used to identify the database segment and tablespace name. However, these columns are reliable only for wait events that relate to physical I/O. They are not, for example, valid for CPU events.

column sql_id format a13
column sql_plan_hash_value heading 'SQL Plan|Hash Value'
column sql_plan_line_id heading 'SQL Plan|Line ID'
column top_level_sql_id format a13 heading 'Top Level|SQL ID'
column event format a25
column object_type format a6 heading 'Object|Type'
column object_name format a11
column p1text format a9
column p2text format a6
column p3text format a7
column action format a11
column ash_secs heading 'ASH|secs' format 9999
column current_file# heading 'File|#' format 9999
column current_obj# heading 'Curr|Obj#' format 999999
compute sum of ash_secs on report
break on report
WITH x AS (
SELECT event, action, current_obj#, current_file#
, p1text, p2text, p3text
, SUM(1) ash_secs
FROM v$active_session_history
WHERE module = 'REDUNDANT'
AND sample_time >= SYSDATE - (10+&&runtime)/86400
AND sql_plan_line_id = 1 /*line 1 is where the update occurs*/
AND sql_opname IN('INSERT','UPDATE','DELETE','UPSERT') /*DML Statement*/
GROUP BY event, action, current_obj#, current_file#, p1text, p2text, p3text
)
SELECT x.*
, o.object_type, o.object_name
, f.tablespace_name
FROM x
LEFT OUTER JOIN dba_objects o
ON o.object_id = x.current_obj#
AND event is not null
LEFT OUTER JOIN dba_data_files f
ON f.file_id = x.current_file#
AND event is not null
AND x.p1text = 'file#'
AND x.p2text = 'block#'
ORDER BY ash_secs DESC
/

The test ran for 314 seconds, though only 210 of those seconds were spent updating the table. However, we can see how much time was spent writing to each of the various indexes. Note that T_BC is still maintained even though it is invisible.

                                         Curr  File                            ASH Object
EVENT                     ACTION         Obj#     # P1TEXT    P2TEXT P3TEXT   secs Type   OBJECT_NAME TABLESPACE_NAME
------------------------- ----------- ------- ----- --------- ------ ------- ----- ------ ----------- -----------------------
free buffer waits         UPDATE TEST      -1     0 file#     block# set-id#    44
                          UPDATE TEST      -1     0 file#     block# blocks     28
write complete waits      UPDATE TEST   99411     4 file#     block#            19 INDEX  T_BC        PSDEFAULT
free buffer waits         UPDATE TEST   99412     0 file#     block# set-id#    14 INDEX  T_BCD
write complete waits      UPDATE TEST   99414     4 file#     block#            14 INDEX  T_BCD_DESC  PSDEFAULT
free buffer waits         UPDATE TEST   99411     0 file#     block# set-id#    12 INDEX  T_BC
write complete waits      UPDATE TEST   99412     4 file#     block#            12 INDEX  T_BCD       PSDEFAULT
write complete waits      UPDATE TEST   99413     4 file#     block#            11 INDEX  T_BCD_ROUND PSDEFAULT
                          UPDATE TEST   99411     0 file#     block# blocks     11
                          UPDATE TEST   99413     0 file#     block# blocks      8
free buffer waits         UPDATE TEST   99413     0 file#     block# set-id#     7 INDEX  T_BCD_ROUND
free buffer waits         UPDATE TEST   99414     0 file#     block# set-id#     7 INDEX  T_BCD_DESC
                          UPDATE TEST   99414     0 file#     block# blocks      5
                          UPDATE TEST   99412     0 file#     block# blocks      5
free buffer waits         UPDATE TEST   99412     4 file#     block# set-id#     3 INDEX  T_BCD       PSDEFAULT
log file switch (private  UPDATE TEST   99411     4                              1 INDEX  T_BC
strand flush incomplete)
free buffer waits         UPDATE TEST   99411     4 file#     block# set-id#     1 INDEX  T_BC        PSDEFAULT
                          UPDATE TEST   99413     4 file#     block# blocks      1
                          UPDATE TEST      -1     0 file#     block#             1
                          UPDATE TEST   99411     4 file#     block# blocks      1
                          UPDATE TEST   99411     0 file#     block#             1
log file switch completio UPDATE TEST   99414     0                              1 INDEX  T_BCD_DESC
free buffer waits         UPDATE TEST      -1     4 file#     block# set-id#     1                    PSDEFAULT
                          UPDATE TEST      -1     4 file#     block# blocks      1
                          UPDATE TEST   99414     0 file#     block#             1
                                                                            -----
sum                                                                           210

Now I know how much time was spent maintaining each index, and I can decide whether it is worth dropping a particular index.  In all, 33 seconds were spent maintaining index T_BC.
When I repeated the test, having dropped index T_BC, the runtime of the test went down by 21 seconds to 293 seconds. The time spent maintaining the other indexes also went down. Time spent on index T_BCD went from 29 seconds to just 9 seconds. I speculate that this is due to better cache efficiency, because the buffer cache no longer has to hold index T_BC.
More time was spent on other wait events that are not directly associated with the index, so other forms of contention were introduced.  However, the performance of the test as a whole improved, so we are in a better place.
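When it comes to actually removing an index, one way to stage it (a sketch) is to make the index invisible first and watch for plan regressions - remembering, as this test shows, that an invisible index is still maintained - and only then drop it to recover the maintenance cost.

ALTER INDEX t_bc INVISIBLE;
REM watch for plan regressions, then:
DROP INDEX t_bc;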

                                         Curr  File                            ASH Object
EVENT                     ACTION         Obj#     # P1TEXT    P2TEXT P3TEXT   secs Type   OBJECT_NAME TABLESPACE_NAME
------------------------- ----------- ------- ----- --------- ------ ------- ----- ------ ----------- ---------------
free buffer waits         UPDATE TEST      -1     0 file#     block# set-id#    41
                          UPDATE TEST      -1     0 file#     block# blocks     39
                          UPDATE TEST   99419     0 file#     block# blocks     12
                          UPDATE TEST   99418     0 file#     block# blocks     11
free buffer waits         UPDATE TEST   99411     0 file#     block# set-id#    10
write complete waits      UPDATE TEST   99413     4 file#     block#            10                    PSDEFAULT
                          UPDATE TEST   99411     0 file#     block# blocks      9
free buffer waits         UPDATE TEST   99412     0 file#     block# set-id#     9
write complete waits      UPDATE TEST   99414     4 file#     block#             8                    PSDEFAULT
write complete waits      UPDATE TEST   99411     4 file#     block#             8                    PSDEFAULT
write complete waits      UPDATE TEST   99418     4 file#     block#             6 INDEX  T_BCD       PSDEFAULT
write complete waits      UPDATE TEST   99412     4 file#     block#             5                    PSDEFAULT
                          UPDATE TEST   99414     0 file#     block# blocks      4
free buffer waits         UPDATE TEST   99414     0 file#     block# set-id#     4
free buffer waits         UPDATE TEST   99413     0 file#     block# set-id#     3
write complete waits      UPDATE TEST   99419     4 file#     block#             3 INDEX  T_BCD_ROUND PSDEFAULT
                          UPDATE TEST   99412     0 file#     block# blocks      3
                          UPDATE TEST   99413     0 file#     block# blocks      3
                          UPDATE TEST   99420     0 file#     block# blocks      2
free buffer waits         UPDATE TEST   99418     0 file#     block# set-id#     2 INDEX  T_BCD
write complete waits      UPDATE TEST   99420     4 file#     block#             1 INDEX  T_BCD_DESC  PSDEFAULT
                          UPDATE TEST   99411     4 file#     block# blocks      1
write complete waits      UPDATE TEST   99415     4 file#     block#             1 TABLE  T           PSDEFAULT
                          UPDATE TEST   99418     0 file#     block#             1
                          UPDATE TEST   99414     0 file#     block#             1
log file switch (private  UPDATE TEST   99418     0                              1 INDEX  T_BCD
strand flush incomplete)
                          UPDATE TEST   99411     0 file#     block#             1
                          UPDATE TEST   99418     0                              1
free buffer waits         UPDATE TEST   99412     4 file#     block# set-id#     1                    PSDEFAULT
free buffer waits         UPDATE TEST      -1     4 file#     block# set-id#     1                    PSDEFAULT
                          UPDATE TEST      -1     0 file#     block#             1
free buffer waits         UPDATE TEST   99420     0 file#     block# set-id#     1 INDEX  T_BCD_DESC
                                                                            -----
sum                                                                           204

Working on an AWS Host as a DBA

We DBAs have a tendency to overthink everything.  I don't know if the tendency to overthink is found only in DBAs or if we see it in other technical positions, too.

I believe it is part of why we become loyal to one technology or platform.  We become accustomed to how something works, and it's difficult for us to let go of that knowledge and branch out into something new.  We also hate asking questions - we should just be able to figure it out, which is why we love blog posts and simple documentation.  Just give us the facts and leave us alone to do our job.

Take the cloud: many DBAs were quite hesitant to embrace it.  There was a fear of security issues, challenges with networking and, more than anything, a learning curve.  As usual, hindsight is 20/20.  Once you start working in the cloud, you often realize that it's much easier than you first thought it would be, and that your frustration is your worst enemy.

So today we're going to go over some basic skills the DBA requires to manage a cloud environment, using Amazon Web Services (AWS) as our example, and the small changes required to do what we once did on-premise.

In Amazon, we’re going to be working on EC2, also referred to as the Elastic Compute Cloud.

Understanding Locations, Regions and Zones

EC2 is built out into regions and zones.  Knowing what region you're working in is important, as it allows you to "silo" the work you're doing, and in some ways isn't much different from a data center. Inside each of these regions are availability zones, which isolate services and features even more, allowing definitive security at a precise level, with resources shared only when you deem they should be.

Just as privileges granted inside a database can both be a blessing and a curse, locations and regions can cause challenges if you don’t pay attention to the location settings when you’re building out an environment.

Amazon provides a number of links with detailed information on this topic, but here are the tips I think are important for a DBA to know:

  1.  Before setting anything up that is part of a complete solution requiring multiple setup page configurations, ALWAYS check the region in the upper right corner.  I was surprised when it would change from page to page or after a login.


2.  If you think you may have set something up in the wrong region, the dashboard can tell you what is deployed to what region under the resources section:


Understanding Security Keys

Public key cryptography makes the EC2 world go round.  Without this valuable 2048-bit SSH-2 RSA key encryption, you can't communicate with or log into your EC2 host securely.  Key pairs, a combination of a private and a public key, should be part of your setup for your cloud environment.

Using EC2's mechanism to create these is easy to do and eases management.  It's not the only way, but it does simplify things and, as you can see above in the resource information from the dashboard, it also offers you a one-stop shop for everything you need.

When you create one in the Amazon cloud, the private key downloads automatically to the workstation you’re using and it’s important that you keep track of it, as there’s no way to recreate the private key that will be required to connect to the EC2 host.

Your key pair is easy to create: first access your EC2 dashboard, then scroll down on the left side and click on "Key Pairs".  From this console, you'll have the opportunity to create a new key pair, import a pre-existing key, or manage the ones already in EC2:


Before creating one, always verify the region you're working in, as we discussed in the previous section.  If you're experiencing issues with your key, check for typographical errors and verify that the private key file matches the name listed for identification.

If more than one group is managing the EC2 environment, think carefully before deleting a key pair.  I've experienced the pain caused by a key removal that created a production outage.  Creating a new key pair is far simpler than implementing a new key pair across application and system tiers after the removal of one that was still needed.

Understanding Roles and Security

Security Groups are silo'd for a clear reason, and nowhere is this more apparent than in the cloud.  To ensure that the cloud is secure, setting clear and defined boundaries of accessibility for roles and groups is important to keep infiltrators out of environments they have no business accessing.

As we discussed in Key Pairs, our Security Groups are also listed by region under resources so we know they exist at a high level.  If we click on the Security Groups link under Resources in the EC2 Dashboard, we’ll go from seeing 5 security group members:


To viewing the list of security groups:


If you need to prove that these are for the N. California (us-west-1) region, click on the region in the top right corner and change to a different region.  For our example, I switched to Ohio (us-east-2); the previous security groups aren't listed and just the default security group for the Ohio region is displayed:


Security groups should be treated in the cloud the same way we treat privileges inside a database-  granting the least privileges required is best practice.

Understanding How to SSH to a Host

You're a DBA, which means you're most likely most comfortable at the command line.  Logging into a box via SSH is as natural as walking, and everything we've gone through so far was to prepare you for this next step.

No matter your favorite command line tool, Putty or Terminal, if you've set up everything in the previous sections correctly, then you're ready to log into the host, aka the instance.

  1.  Ensure your downloaded private key is saved in an easily accessible spot for you to use to log in, or that you know the username/password (keys just make this easier...)
  2. Gather the information about “instances” by clicking on the EC2 dashboard, then click on Instances.
  3. Note the Public DNS and the Public IP that are displayed, and the region, too:


You can use this information to then ssh into the host (the angle-bracketed values below are placeholders for your own key file, OS user, and public DNS):

ssh -i "<key_name>.pem" <user>@<public_dns>.<region>.compute.amazonaws.com

Once logged in as the OS user, you can su over to the application or database user and proceed as you would on any other host.

If you attempt to log into a region with a key pair from another region, it will state that the key pair can't be found - another aspect that shows the importance of regions.

Understanding How to SCP a File

This is the last area I'll cover today.  (I know, a few of you are saying, "Good, I've already got too much in my head to keep straight, Kellyn...")

With just about any cloud offering, you can bring your own license.  Although there are a ton of pre-built images (AMIs in AWS, VHDs in Azure, etc.), you may need to use a bare metal OS image and load your own software or, like most DBAs, bring over patches to maintain the database you have running out there.  Just because you're in the cloud doesn't mean you don't have a job to do.

Change over to the directory that contains the file that you need to copy and then run the following (again, the angle-bracketed values are placeholders):

scp -i <key_name>.pem <file_name> <user>@<public_dns>.<region>.compute.amazonaws.com:<target_directory>/.

If you try to use a key pair from one region to SCP a file to a host (instance) in another region, you won't receive an error, but it will act as if you skipped the "-i" and the key pair, and you'll be prompted for the user's password:

<user>@<host>'s password:

pxxxxxxxxxxxx_11xxxx_Linux-x86-64.zip             100%   20MB  72.9KB/s   04:36

This is a good start to working as a DBA in the cloud without over-thinking it.  I'll be posting more in the upcoming weeks that will not only assist those already in the cloud, but also those wanting to find a way to invest more in their own cloud education!



Copyright © DBA Kevlar [Working on an AWS Host as a DBA], All Right Reserved. 2017.

The post Working on an AWS Host as a DBA appeared first on DBA Kevlar.

Hack-a-Mongo

I was reading an article today about how 10,000+ Mongo installations that are/were openly accessible on the internet have now been captured by ransomware, with nearly 100,000 other instances potentially vulnerable to the same issue.

Now, since I'm an Oracle blogger, you may be inclined to think this post is going to jump on the "bash MongoDB" bandwagon, but it's not.  I am going to bash something... but it's not MongoDB :-)

I'm going to steal a quote from... of all things... Jurassic Park :-)

“Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should”

In this age of DevOps, Agility, reactive-this, and responsive-that, there is an understandable desire to have a goal of “deliver fast and often”.  But I’m not attacking the various methodologies and processes out there that espouse the new age of fast and furious delivery.  Because, to my knowledge, none of them ever had a goal of “fast delivery no matter what“…they had a goal of fast and flexible delivery whilst adhering to the unchanging fundamental principle of delivering high quality, secure and robust software.  It may be functionally incomplete, or have compromises in (for example) the user experience, but we never meant to compromise on the core principles.

So when I see stories about MongoDB (or any software technology) being exposed for reasons of poor security, my lament is not for the technology but for the developer, or administrator, or manager, or anyone else in the chain of processes who skipped an appropriate assessment of security, in effect playing a game of Russian roulette with their customers' personal details.  I don't think there is even a defence of "ignorance" here, because the moment anything is deployed in an organisation, surely the first question that must be asked, either before the deployment or in that awkward moment afterwards when you discover something was deployed without your knowledge, is... is it secured?  And if no-one can answer that question, then surely that's the immediate death knell for that application right there?

If, hand on heart, a team can claim that security measures are in place, and those measures are then circumvented, that's defensible - perfection in security is such a moving target nowadays.  But it seems a stretch to think that 100,000 Mongo teams out there all took a diligent crack at securing their systems before getting all excited about having their application live :-(

So if you’re building applications with the latest Micro-agile-internet-of-things-reactive-responsive-Java-scripted-open-source-media-rich-blahblah approach…that’s super cool.  But that’s not a Leave Pass from being responsible when it comes to security.

12c Adaptive Joins Plus Statistics Feedback For Joins Cardinality Estimate Bug

I've encountered a bug at several clients that upgraded to Oracle 12c (12.1.0.2) that requires the combination of several new adaptive features introduced with Oracle 12c.

It needs an execution plan that makes use of adaptive join methods, plus, at runtime, the activation of the new "statistics feedback for joins" feature that was also introduced with Oracle 12c. Note that 11.2 already had the "cardinality feedback" feature, but that only applies to single table cardinality misestimates, not to join cardinality misestimates.

If the join method used at runtime is a Nested Loop join - not necessarily the join method preferred initially, so a runtime switch from Hash to Nested Loop join also reproduces the problem - the "statistics feedback for joins" feature generates a bad OPT_ESTIMATE hint for the join cardinality that always seems to be one, like the following: OPT_ESTIMATE(... JOIN (...) ROWS=1.000000), no matter what join cardinality was actually observed. This can lead to very inefficient execution plans being generated on subsequent executions of the same statement, caused by the misleading join cardinality used to generate the new plan.

The good news is that in 12.2.0.1, and in 12.1 when making use of the new backport that enables the same optimizer default settings as in 12.2 - for more information see the MOS document "Recommendations for Adaptive Features in Oracle Database 12c Release 1 (2187449.1)" and, for example, Christian Antognini's post - the "statistics feedback for joins" feature is disabled by default (in fact, in principle only the "adaptive joins" feature is left enabled; all other adaptive features are more or less disabled), so the problem doesn't occur there. That's one more good reason why installing this backport in 12.1 is a good idea. However, when enabling this feature specifically, or simply enabling all "adaptive statistics" features (OPTIMIZER_ADAPTIVE_STATISTICS = TRUE in 12.2 / 12.1 with backport), the problem also reproduces in 12.2.

If you don't yet make use of the recommended optimizer settings backport in 12.1, then setting "_optimizer_use_feedback" to FALSE prevents the problem; however, this will disable both the "cardinality feedback for single table" 11.2 feature and the "statistics feedback for joins" 12c feature. In 12.2 there are two parameters related to "cardinality feedback": "_OPTIMIZER_USE_FEEDBACK", which controls the "cardinality feedback for single table" 11.2 feature, and "_OPTIMIZER_USE_FEEDBACK_FOR_JOIN", which controls the new "statistics feedback for joins" feature. Hence, in 12.2, when enabling the "adaptive statistics" features, the problem can be avoided by specifically setting "_OPTIMIZER_USE_FEEDBACK_FOR_JOIN" to FALSE, which still leaves the "cardinality feedback for single table" 11.2 feature enabled.

For more information regarding the various optimizer related settings and the differences between 12.1 and 12.2 / the 12.1 backport, see this very informative post by Christian Antognini.

Here is a simplified test case that allows reproducing the problem:


-----------------------------------------------------------------------
-- Adaptive joins combined with statistics feedback for joins
-- lead to join cardinality estimate of 1
-- caused by incorrect statistics feedback OPT_ESTIMATE hint generated
--
-- The problem seems to be related to the following combination:
--
-- - Adaptive join method selected
-- - Statistics feedback for joins kicks in
-- - The runtime join method used is Nested Loop join
--
-- Reproduced: 12.1.0.2
-- 12.2.0.1 (with adaptive statistics features re-enabled)
--
-----------------------------------------------------------------------

set echo on

alter system flush shared_pool;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select
        rownum as id
      , cast('9999' as varchar2(4)) as hist_ind
      , rpad('x', 200) as filler
from
        dual
connect by
        level <= 30000
order by
        dbms_random.value
;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

create unique index t1_idx on t1 (id);

create table t2
as
with
generator1 as
(
  select /*+
            cardinality(1e3)
            materialize
         */
         rownum as id
       , rpad('x', 100) as filler
  from
         dual
  connect by
         level <= 1e3
),
generator2 as
(
  select /*+
            cardinality(1e3)
            materialize
         */
         rownum as id
       , rpad('x', 100) as filler
  from
         dual
  connect by
         level <= 1e3
),
generator as
(
  -- cross join of the two 1e3 row sources generates 1e6 rows with unique ids
  select /*+ leading(b a) */
         (a.id - 1) * 1e3 + b.id as id
  from
         generator1 a
       , generator2 b
)
select
  -- heavy skew: 10000 rows with id = 1, 10000 rows with id = -1,
  -- the remaining 980000 rows keep their unique id
  case when id <= 10000 then 1 when id <= 20000 then -1 else id end as id
, cast('N' as varchar2(1)) as some_ind
, rpad('x', 200) as filler
from
  generator
;

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

create /*unique*/ index t2_idx on t2 (id);

set echo on define on

column sql_id new_value sql_id

alter session set statistics_level = all;

-- Enable statistics feedback for joins in 12.2.0.1
alter session set optimizer_adaptive_statistics = true;

-- Disabling statistics feedback for joins prevents the bug
-- alter session set "_OPTIMIZER_USE_FEEDBACK_FOR_JOIN" = false;

-- Or disabling adaptive joins prevents the bug
-- alter session set "_OPTIMIZER_ADAPTIVE_PLANS" = false;

-- alter session set "_OPTIMIZER_NLJ_HJ_ADAPTIVE_JOIN" = false;

-- Not related to NL join optimized plan shapes
--alter session set "_nlj_batching_enabled" = 0;

--alter session set "_table_lookup_prefetch_size" = 0;

-- Expected number of rows, initial join NL, still "statistics feedback" kicks in (is it a bug that statistics feedback kicks in here?)
-- No join method switch
-- Bad OPT_ESTIMATE hint for join (ROWS=1) => bug
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
T1 A0 INNER JOIN T2 A1
ON 1 = 1
AND A1.ID = A0.ID
WHERE 1 = 1
AND A0.HIST_IND = '9999'
AND A0.ID between 20000 and 22000
;

-- Re-execute to have statistics feedback kick in
/

select prev_sql_id as sql_id from v$session where sid = userenv('sid');

select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

-- More than expected number of rows, initial join NL, "statistics feedback" kicks in
-- Switch from NL to HASH (at runtime and at re-optimization time)
-- Generated OPT_ESTIMATE hints are OK
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
T1 A0 INNER JOIN T2 A1
ON 1 = 1
AND A1.ID = A0.ID
WHERE 1 = 1
AND A0.HIST_IND = '9999'
AND A0.ID between 1 and 2
;

-- Re-execute to have statistics feedback kick in
/

select prev_sql_id as sql_id from v$session where sid = userenv('sid');

select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

-- Less than expected number of rows, initial join HASH, "statistics feedback" kicks in
-- Switch from HASH to NL (at runtime and at re-optimization time)
-- Bad OPT_ESTIMATE hint for join generated (ROWS=1) => bug
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
T1 A0 INNER JOIN T2 A1
ON 1 = 1
AND A1.ID = A0.ID
WHERE 1 = 1
AND A0.HIST_IND = '9999'
AND A0.ID between 2 and 20500
;

-- Re-execute to have statistics feedback kick in
/

select prev_sql_id as sql_id from v$session where sid = userenv('sid');

select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

-- Expected number of rows, initial join HASH, "statistics feedback" does not kick in
-- No join method switch
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
T1 A0 INNER JOIN T2 A1
ON 1 = 1
AND A1.ID = A0.ID
WHERE 1 = 1
AND A0.HIST_IND = '9999'
AND A0.ID between 20000 and 30000
;

-- Re-execute to have statistics feedback kick in
/

select prev_sql_id as sql_id from v$session where sid = userenv('sid');

select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

The critical part of the output looks like this:


SQL> -- Expected number of rows, initial join NL, still "statistics feedback" kicks in (is it a bug that statistics feedback kicks in here?)
SQL> -- No join method switch
SQL> -- Bad OPT_ESTIMATE hint for join (ROWS=1) => bug
SQL> SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
2 T1 A0 INNER JOIN T2 A1
3 ON 1 = 1
4 AND A1.ID = A0.ID
5 WHERE 1 = 1
6 AND A0.HIST_IND = '9999'
7 AND A0.ID between 20000 and 22000
8 ;

COUNT(A1.SOME_IND)
------------------
2000

SQL>
SQL> -- Re-execute to have statistics feedback kick in
SQL> /

COUNT(A1.SOME_IND)
------------------
2000

SQL>
SQL> select prev_sql_id as sql_id from v$session where sid = userenv('sid');

SQL_ID
-------------
8mqn521y28t58

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8mqn521y28t58, child number 0
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 20000 and 22000

Plan hash value: 3258782287

--------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads |
--------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 2036 (100)| | 1 |00:00:00.02 | 4044 | 38 |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.02 | 4044 | 38 |
|- * 2 | HASH JOIN | | 1 | 2001 | 34017 | 2036 (1)| 00:00:01 | 2000 |00:00:00.02 | 4044 | 38 |
| 3 | NESTED LOOPS | | 1 | 2001 | 34017 | 2036 (1)| 00:00:01 | 2000 |00:00:00.02 | 4044 | 38 |
| 4 | NESTED LOOPS | | 1 | 2002 | 34017 | 2036 (1)| 00:00:01 | 2000 |00:00:00.02 | 2044 | 38 |
|- 5 | STATISTICS COLLECTOR | | 1 | | | | | 2000 |00:00:00.01 | 2007 | 6 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED| T2 | 1 | 2002 | 14014 | 1974 (0)| 00:00:01 | 2000 |00:00:00.01 | 2007 | 6 |
| * 7 | INDEX RANGE SCAN | T2_IDX | 1 | 2002 | | 7 (0)| 00:00:01 | 2000 |00:00:00.01 | 7 | 6 |
| * 8 | INDEX UNIQUE SCAN | T1_IDX | 2000 | 1 | | 0 (0)| | 2000 |00:00:00.01 | 37 | 32 |
| * 9 | TABLE ACCESS BY INDEX ROWID | T1 | 2000 | 1 | 10 | 1 (0)| 00:00:01 | 2000 |00:00:00.01 | 2000 | 0 |
|- * 10 | TABLE ACCESS FULL | T1 | 0 | 1 | 10 | 1 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | 0 |
--------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
7 - access("A1"."ID">=20000 AND "A1"."ID"<=22000)
8 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID">=20000 AND "A0"."ID"<=22000))
9 - filter("A0"."HIST_IND"='9999')
10 - filter(("A0"."ID">=20000 AND "A0"."ID"<=22000 AND "A0"."HIST_IND"='9999'))

Note
-----
- this is an adaptive plan (rows marked '-' are inactive)

SQL_ID 8mqn521y28t58, child number 1
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 20000 and 22000

Plan hash value: 3258782287

-----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 2036 (100)| | 1 |00:00:00.02 | 4044 |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.02 | 4044 |
|- * 2 | HASH JOIN | | 1 | 1 | 17 | 2036 (1)| 00:00:01 | 2000 |00:00:00.02 | 4044 |
| 3 | NESTED LOOPS | | 1 | 1 | 17 | 2036 (1)| 00:00:01 | 2000 |00:00:00.02 | 4044 |
| 4 | NESTED LOOPS | | 1 | 2002 | 17 | 2036 (1)| 00:00:01 | 2000 |00:00:00.01 | 2044 |
|- 5 | STATISTICS COLLECTOR | | 1 | | | | | 2000 |00:00:00.01 | 2007 |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED| T2 | 1 | 2002 | 14014 | 1974 (0)| 00:00:01 | 2000 |00:00:00.01 | 2007 |
| * 7 | INDEX RANGE SCAN | T2_IDX | 1 | 2002 | | 7 (0)| 00:00:01 | 2000 |00:00:00.01 | 7 |
| * 8 | INDEX UNIQUE SCAN | T1_IDX | 2000 | 1 | | 0 (0)| | 2000 |00:00:00.01 | 37 |
| * 9 | TABLE ACCESS BY INDEX ROWID | T1 | 2000 | 1 | 10 | 1 (0)| 00:00:01 | 2000 |00:00:00.01 | 2000 |
|- * 10 | TABLE ACCESS FULL | T1 | 0 | 1 | 10 | 1 (0)| 00:00:01 | 0 |00:00:00.01 | 0 |
-----------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
7 - access("A1"."ID">=20000 AND "A1"."ID"<=22000)
8 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID">=20000 AND "A0"."ID"<=22000))
9 - filter("A0"."HIST_IND"='9999')
10 - filter(("A0"."ID">=20000 AND "A0"."ID"<=22000 AND "A0"."HIST_IND"='9999'))

Note
-----
- statistics feedback used for this statement
- this is an adaptive plan (rows marked '-' are inactive)


77 rows selected.

SQL>
SQL> select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

HINT_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OPT_ESTIMATE (@"SEL$58A6D7F6" JOIN ("A0"@"SEL$1" "A1"@"SEL$1") ROWS=1.000000 )

SQL>
SQL> -- More than expected number of rows, initial join NL, "statistics feedback" kicks in
SQL> -- Switch from NL to HASH (at runtime and at re-optimization time)
SQL> -- Generated OPT_ESTIMATE hints are OK
SQL> SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
2 T1 A0 INNER JOIN T2 A1
3 ON 1 = 1
4 AND A1.ID = A0.ID
5 WHERE 1 = 1
6 AND A0.HIST_IND = '9999'
7 AND A0.ID between 1 and 2
8 ;

COUNT(A1.SOME_IND)
------------------
10000

SQL>
SQL> -- Re-execute to have statistics feedback kick in
SQL> /

COUNT(A1.SOME_IND)
------------------
10000

SQL>
SQL> select prev_sql_id as sql_id from v$session where sid = userenv('sid');

SQL_ID
-------------
92rttcj6ntzqs

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 92rttcj6ntzqs, child number 0
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 1 and 2

Plan hash value: 777836357

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 7 (100)| | 1 |00:00:00.04 | 1262 | 70 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.04 | 1262 | 70 | | | |
| * 2 | HASH JOIN | | 1 | 2 | 34 | 7 (0)| 00:00:01 | 10000 |00:00:00.04 | 1262 | 70 | 2061K| 2061K| 1355K (0)|
|- 3 | NESTED LOOPS | | 1 | 2 | 34 | 7 (0)| 00:00:01 | 10000 |00:00:00.03 | 1258 | 40 | | | |
|- 4 | NESTED LOOPS | | 1 | 3 | 34 | 7 (0)| 00:00:01 | 10000 |00:00:00.02 | 1258 | 40 | | | |
|- 5 | STATISTICS COLLECTOR | | 1 | | | | | 10000 |00:00:00.02 | 1258 | 40 | | | |
| 6 | TABLE ACCESS BY INDEX ROWID BATCHED| T2 | 1 | 3 | 21 | 6 (0)| 00:00:01 | 10000 |00:00:00.02 | 1258 | 40 | | | |
| * 7 | INDEX RANGE SCAN | T2_IDX | 1 | 3 | | 3 (0)| 00:00:01 | 10000 |00:00:00.01 | 23 | 40 | | | |
|- * 8 | INDEX UNIQUE SCAN | T1_IDX | 0 | 1 | | 0 (0)| | 0 |00:00:00.01 | 0 | 0 | | | |
|- * 9 | TABLE ACCESS BY INDEX ROWID | T1 | 0 | 1 | 10 | 1 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | 0 | | | |
| * 10 | TABLE ACCESS BY INDEX ROWID BATCHED | T1 | 1 | 1 | 10 | 1 (0)| 00:00:01 | 2 |00:00:00.01 | 4 | 30 | | | |
| * 11 | INDEX RANGE SCAN | T1_IDX | 1 | 1 | | 0 (0)| | 2 |00:00:00.01 | 2 | 30 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
7 - access("A1"."ID">=1 AND "A1"."ID"<=2)
8 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID"<=2 AND "A0"."ID">=1))
9 - filter("A0"."HIST_IND"='9999')
10 - filter("A0"."HIST_IND"='9999')
11 - access("A0"."ID">=1 AND "A0"."ID"<=2)

Note
-----
- this is an adaptive plan (rows marked '-' are inactive)

SQL_ID 92rttcj6ntzqs, child number 1
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 1 and 2

Plan hash value: 3588347061

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 7982 (100)| | 1 |00:00:00.13 | 29516 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.13 | 29516 | | | |
|* 2 | HASH JOIN | | 1 | 10000 | 166K| 7982 (1)| 00:00:01 | 10000 |00:00:00.13 | 29516 | 2061K| 2061K| 1356K (0)|
|* 3 | TABLE ACCESS FULL | T2 | 1 | 10000 | 70000 | 7978 (1)| 00:00:01 | 10000 |00:00:00.12 | 29512 | | | |
|* 4 | TABLE ACCESS BY INDEX ROWID BATCHED| T1 | 1 | 2 | 20 | 4 (0)| 00:00:01 | 2 |00:00:00.01 | 4 | | | |
|* 5 | INDEX RANGE SCAN | T1_IDX | 1 | 2 | | 2 (0)| 00:00:01 | 2 |00:00:00.01 | 2 | | | |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
3 - filter(("A1"."ID"<=2 AND "A1"."ID">=1))
4 - filter("A0"."HIST_IND"='9999')
5 - access("A0"."ID">=1 AND "A0"."ID"<=2)

Note
-----
- statistics feedback used for this statement


71 rows selected.

SQL>
SQL> select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

HINT_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OPT_ESTIMATE (@"SEL$58A6D7F6" JOIN ("A0"@"SEL$1" "A1"@"SEL$1") ROWS=10000.000000 )

SQL>
SQL> -- Less than expected number of rows, initial join HASH, "statistics feedback" kicks in
SQL> -- Switch from HASH to NL (at runtime and at re-optimization time)
SQL> -- Bad OPT_ESTIMATE hint for join generated (ROWS=1) => bug
SQL> SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
2 T1 A0 INNER JOIN T2 A1
3 ON 1 = 1
4 AND A1.ID = A0.ID
5 WHERE 1 = 1
6 AND A0.HIST_IND = '9999'
7 AND A0.ID between 2 and 20500
8 ;

COUNT(A1.SOME_IND)
------------------
500

SQL>
SQL> -- Re-execute to have statistics feedback kick in
SQL> /

COUNT(A1.SOME_IND)
------------------
500

SQL>
SQL> select prev_sql_id as sql_id from v$session where sid = userenv('sid');

SQL_ID
-------------
c55rjg5mdxpph

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID c55rjg5mdxpph, child number 0
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 2 and 20500

Plan hash value: 1011946885

-------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 8227 (100)| | 1 |00:00:00.13 | 30028 |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.13 | 30028 |
|- * 2 | HASH JOIN | | 1 | 20500 | 340K| 8227 (1)| 00:00:01 | 500 |00:00:00.13 | 30028 |
| 3 | NESTED LOOPS | | 1 | 20500 | 340K| 8227 (1)| 00:00:01 | 500 |00:00:00.13 | 30028 |
| 4 | NESTED LOOPS | | 1 | | | | | 500 |00:00:00.13 | 29528 |
|- 5 | STATISTICS COLLECTOR | | 1 | | | | | 500 |00:00:00.13 | 29512 |
| * 6 | TABLE ACCESS FULL | T2 | 1 | 20500 | 140K| 7978 (1)| 00:00:01 | 500 |00:00:00.13 | 29512 |
| * 7 | INDEX UNIQUE SCAN | T1_IDX | 500 | | | | | 500 |00:00:00.01 | 16 |
| * 8 | TABLE ACCESS BY INDEX ROWID| T1 | 500 | 1 | 10 | 248 (0)| 00:00:01 | 500 |00:00:00.01 | 500 |
|- * 9 | TABLE ACCESS FULL | T1 | 0 | 20501 | 200K| 248 (0)| 00:00:01 | 0 |00:00:00.01 | 0 |
-------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
6 - filter(("A1"."ID"<=20500 AND "A1"."ID">=2))
7 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID"<=20500 AND "A0"."ID">=2))
8 - filter("A0"."HIST_IND"='9999')
9 - filter(("A0"."ID"<=20500 AND "A0"."ID">=2 AND "A0"."HIST_IND"='9999'))

Note
-----
- this is an adaptive plan (rows marked '-' are inactive)

SQL_ID c55rjg5mdxpph, child number 1
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 2 and 20500

Plan hash value: 1011946885

----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 7994 (100)| | 1 |00:00:00.13 | 30028 |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.13 | 30028 |
| 2 | NESTED LOOPS | | 1 | 1 | 17 | 7994 (1)| 00:00:01 | 500 |00:00:00.13 | 30028 |
| 3 | NESTED LOOPS | | 1 | 500 | 17 | 7994 (1)| 00:00:01 | 500 |00:00:00.13 | 29528 |
|* 4 | TABLE ACCESS FULL | T2 | 1 | 500 | 3500 | 7978 (1)| 00:00:01 | 500 |00:00:00.13 | 29512 |
|* 5 | INDEX UNIQUE SCAN | T1_IDX | 500 | 1 | | 0 (0)| | 500 |00:00:00.01 | 16 |
|* 6 | TABLE ACCESS BY INDEX ROWID| T1 | 500 | 1 | 10 | 1 (0)| 00:00:01 | 500 |00:00:00.01 | 500 |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - filter(("A1"."ID"<=20500 AND "A1"."ID">=2))
5 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID"<=20500 AND "A0"."ID">=2))
6 - filter("A0"."HIST_IND"='9999')

Note
-----
- statistics feedback used for this statement


69 rows selected.

SQL>
SQL> select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

HINT_TEXT
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OPT_ESTIMATE (@"SEL$58A6D7F6" JOIN ("A0"@"SEL$1" "A1"@"SEL$1") ROWS=1.000000 )

SQL>
SQL> -- Expected number of rows, initial join HASH, "statistics feedback" does not kick in
SQL> -- No join method switch
SQL> SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM
2 T1 A0 INNER JOIN T2 A1
3 ON 1 = 1
4 AND A1.ID = A0.ID
5 WHERE 1 = 1
6 AND A0.HIST_IND = '9999'
7 AND A0.ID between 20000 and 30000
8 ;

COUNT(A1.SOME_IND)
------------------
10000

SQL>
SQL> -- Re-execute to have statistics feedback kick in
SQL> /

COUNT(A1.SOME_IND)
------------------
10000

SQL>
SQL> select prev_sql_id as sql_id from v$session where sid = userenv('sid');

SQL_ID
-------------
4tj7bn17xcbad

SQL>
SQL> select * from table(dbms_xplan.display_cursor('&sql_id', null, format => 'TYPICAL ALLSTATS LAST ADAPTIVE'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4tj7bn17xcbad, child number 0
-------------------------------------
SELECT /*+ leading(a1) */ count(A1.SOME_IND) FROM T1 A0 INNER JOIN
T2 A1 ON 1 = 1 AND A1.ID = A0.ID WHERE 1 = 1 AND A0.HIST_IND =
'9999' AND A0.ID between 20000 and 30000

Plan hash value: 4274056747

----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 8227 (100)| | 1 |00:00:00.17 | 30434 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 17 | | | 1 |00:00:00.17 | 30434 | | | |
| * 2 | HASH JOIN | | 1 | 10001 | 166K| 8227 (1)| 00:00:01 | 10000 |00:00:00.17 | 30434 | 1969K| 1969K| 1895K (0)|
|- 3 | NESTED LOOPS | | 1 | 10001 | 166K| 8227 (1)| 00:00:01 | 10000 |00:00:00.14 | 29512 | | | |
|- 4 | NESTED LOOPS | | 1 | | | | | 10000 |00:00:00.14 | 29512 | | | |
|- 5 | STATISTICS COLLECTOR | | 1 | | | | | 10000 |00:00:00.13 | 29512 | | | |
| * 6 | TABLE ACCESS FULL | T2 | 1 | 10002 | 70014 | 7978 (1)| 00:00:01 | 10000 |00:00:00.13 | 29512 | | | |
|- * 7 | INDEX UNIQUE SCAN | T1_IDX | 0 | | | | | 0 |00:00:00.01 | 0 | | | |
|- * 8 | TABLE ACCESS BY INDEX ROWID| T1 | 0 | 1 | 10 | 248 (0)| 00:00:01 | 0 |00:00:00.01 | 0 | | | |
| * 9 | TABLE ACCESS FULL | T1 | 1 | 10001 | 97K| 248 (0)| 00:00:01 | 10001 |00:00:00.01 | 922 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("A1"."ID"="A0"."ID")
6 - filter(("A1"."ID"<=30000 AND "A1"."ID">=20000))
7 - access("A1"."ID"="A0"."ID")
filter(("A0"."ID">=20000 AND "A0"."ID"<=30000))
8 - filter("A0"."HIST_IND"='9999')
9 - filter(("A0"."ID">=20000 AND "A0"."HIST_IND"='9999' AND "A0"."ID"<=30000))

Note
-----
- this is an adaptive plan (rows marked '-' are inactive)


37 rows selected.

SQL>
SQL> select hint_text from V$SQL_REOPTIMIZATION_HINTS where sql_id = '&sql_id' and hint_text like '%JOIN%';

no rows selected

SQL>

Note how, in each case where the Nested Loop join gets used at runtime and "statistics feedback for joins" kicks in, the bad OPT_ESTIMATE hint gets generated. I've also discussed this case with Nigel Bayliss, the Optimizer Product Manager at Oracle, and a corresponding bug has been opened, so hopefully the problem will be addressed in a future release.
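Until a fix is available, one way to recover a statement that has already been penalised by the bad ROWS=1 feedback is to purge the affected cursor from the shared pool, which discards the recorded reoptimization hints along with it (note it does not remove any SQL plan directives that may have been created in addition). A minimal sketch, assuming the affected SQL_ID is known - here the 8mqn521y28t58 from the test case output above:

-- build the 'address,hash_value' string that dbms_shared_pool.purge expects
column purge_name new_value purge_name
select address || ',' || hash_value as purge_name
from v$sqlarea
where sql_id = '8mqn521y28t58';

-- purge the cursor: the next parse starts without the bad OPT_ESTIMATE hint
exec dbms_shared_pool.purge('&purge_name', 'C')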

AskTom takes another step forward

For over 16 years, AskTom has been one of the most valuable resources available to developers and database administrators working with the Oracle Database. With over 20,000 questions tackled and answered, along with over 120,000 follow-ups to additional queries, it remains an outstanding knowledge base of Oracle assistance.

And today, AskTom just got a whole lot better!

We’re excited to announce a new member of the AskTom team: database evangelist Maria Colgan. Many of you will know Maria for her work with the Optimizer and In-Memory, but she in fact brings decades of expertise across the entire database technology stack to assist you with your questions. Maria blogs regularly at sqlmaria.com, so make sure you check in there as well. With Maria on the team, AskTom keeps getting better and better in 2017!

 
