Oakies Blog Aggregator

Outages

Some people might have noticed that the blog has been down a few times over the past week, particularly on Friday. The server that the site and blog are hosted on was subject to some sort of hack attack, and as a result it was moved to a new server and all privileges were locked down for a while, particularly on the blog directory.

One factor was probably that my Serendipity installation was fairly out of date - the punchline being that one of my Bank Holiday weekend tasks was to upgrade it. Too late, unfortunately :-(

But, thanks to the efforts of Mike Scott, everything is back online and both blogs have been updated to the latest version of Serendipity.

Many thanks to Mike and if I can tear myself from my year-end accounts long enough, I might still have some time to reply to some comments and get the stats stuff moving again.

Frequency histograms - edge cases

With the 10.2.0.4 patch set Oracle introduced a significant change in how non-existent values are treated when there is a frequency histogram on a column / expression. See Jonathan Lewis' blog post, which is probably the most concise description of the issue. In a nutshell the change is the following (quoted from Jonathan's post): "If the value you supply does not appear in the histogram, but is inside the low/high range of the histogram then the cardinality will be half the cardinality of the least frequently occurring value that is in the histogram".

I'm still a bit puzzled why Oracle introduced such a significant change to the optimizer in a patch set, but one of the most obvious reasons might be that the change allows Oracle to generate frequency histograms using a rather low sample size, because a histogram that misses some of the existing values is no longer the threat it used to be (previously, values missing from the histogram got an estimate of 1).

In fact, when using the default DBMS_STATS.AUTO_SAMPLE_SIZE, Oracle does exactly this: by default it uses a quite low sample size for the additional passes required for each histogram - probably an attempt to minimize the extra work histogram generation requires.

It is, however, quite interesting to see how this behaviour, together with the new treatment of non-existent values, can turn into a threat, as a recent thread on OTN demonstrated.

Consider the following scenario: You have a column with a highly skewed data distribution; there is a single, very popular value, and a few other, very unpopular values.

Now you have a query type that filters on this column and frequently searches for non-existent values. In order to speed up the query an index has been created on the column, and in order to make the optimizer aware that the data distribution is highly skewed a histogram is generated, so that the optimizer favors the index only for the unpopular or non-existent values.

The following test case (run on 11.1.0.7) emulates this scenario:

create table t1 (
  id       number(*, 0)
, t_status number(*, 0)
, vc       varchar2(100)
);

-- 1 million rows
-- One very popular value
-- Two very unpopular values
-- in column T_STATUS
insert /*+ append */ into t1 (id, t_status, vc)
with generator as (
  select /*+ materialize */
         level as id
  from   dual
  connect by
         level <= 10000
)
select /*+ use_nl(v1, v2) */
       rownum as id
     , case
         when rownum <= 10 then 1
         when rownum <= 20 then 2
         else 0
       end as t_status
     , rpad('x', 100) as vc
from   generator v1
     , generator v2
where  rownum <= 1000000;

commit;

create index t1_idx on t1 (t_status);

exec dbms_stats.gather_table_stats(null, 'T1', method_opt => 'for all columns size 1 for columns t_status size 254')

The first interesting point is the generated column statistics:

SQL>
SQL> -- Note that the basic column statistics
SQL> -- are generated with a much higher sample size
SQL> -- The histogram however was created with a sample size
SQL> -- of 5,000 rows only
SQL> -- Therefore we get three distinct values but only a single bucket
SQL> -- But not always => Potential instability issue
SQL> select
2 column_name
3 , num_distinct
4 , sample_size
5 from
6 user_tab_col_statistics
7 where
8 table_name = 'T1';

COLUMN_NAME NUM_DISTINCT SAMPLE_SIZE
--------------- ------------ -----------
ID 1000000 1000000
T_STATUS 3 5499
VC 1 1000000

SQL>
SQL> -- This results in a single bucket
SQL> -- But not always => Potential instability issue
SQL> select
2 column_name
3 , endpoint_number
4 , endpoint_value
5 from
6 user_tab_histograms
7 where
8 table_name = 'T1'
9 and column_name = 'T_STATUS';

COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------- --------------
T_STATUS 5499 0

Notice the inconsistency: The basic column statistics (number of distinct values, low value, high value) have obviously been generated with a much higher sample size (in fact a compute in this case) than the histogram on T_STATUS. The histogram therefore misses the unpopular values and consists of only a single value - the very popular one.

Now watch closely what happens to the cardinality estimates of the non-existing values:

SQL>
SQL> -- The popular value
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status = 0
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 999K| 102M| 4319 (2)| 00:00:52 |
|* 1 | TABLE ACCESS FULL| T1 | 999K| 102M| 4319 (2)| 00:00:52 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T_STATUS"=0)

13 rows selected.

SQL>
SQL> -- A non-existing value
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status = 1000
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500K| 51M| 4316 (2)| 00:00:52 |
|* 1 | TABLE ACCESS FULL| T1 | 500K| 51M| 4316 (2)| 00:00:52 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T_STATUS"=1000)

13 rows selected.

SQL>
SQL> -- Two non-existing values
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status in (1000, 2000)
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000K| 102M| 4325 (3)| 00:00:52 |
|* 1 | TABLE ACCESS FULL| T1 | 1000K| 102M| 4325 (3)| 00:00:52 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T_STATUS"=1000 OR "T_STATUS"=2000)

13 rows selected.

SQL>

Ouch: Whereas the estimate for the popular value is correct, the estimates for the non-existent values are way off - in fact an IN clause with two non-existent values gets an estimate of all rows contained in the table.

The explanation: Since the least popular value in the histogram is the single bucket covering virtually all rows, half of it is still 50% of the total cardinality - so two non-existent values in an IN clause end up with a combined selectivity of 1.
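Putting rough numbers to this, based on the statistics above (a sketch of the arithmetic, not the exact optimizer internals):

-- least popular (= only) value in the histogram: 0, covering 5,499 of 5,499 sampled rows
cardinality(t_status = 0)    ~ 1.0 * 1,000,000 ~ 999K rows
cardinality(t_status = 1000) ~ 0.5 * 1,000,000 ~ 500K rows
t_status IN (1000, 2000)     ~ 500K + 500K     ~ 1000K rows

This matches the 999K, 500K and 1000K estimates in the plans above.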

Furthermore the usual decay applied to values outside the low/high column range doesn't help here either - no matter what non-existent values get used, the overestimate stays the same.

These overestimates obviously have a significant impact - here the suitable index doesn't get used; more complex plans might even turn into a complete disaster.

The default behaviour of generating histograms with a much lower sample size than the basic column statistics also introduces a kind of instability - try running the example several times. Sometimes one of the unpopular values gets caught by the default histogram generation, sometimes not. The effect is dramatic: If one of the unpopular values gets caught, the estimates are reasonable again, since the least popular value in the histogram then has a very low cardinality - if not, you get exactly the opposite result, as just demonstrated.
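A quick way to observe this instability (a sketch; the output differs from run to run, which is exactly the point):

exec dbms_stats.gather_table_stats(null, 'T1', method_opt => 'for all columns size 1 for columns t_status size 254')

select endpoint_number, endpoint_value
from user_tab_histograms
where table_name = 'T1'
and column_name = 'T_STATUS';

-- Repeat several times: sometimes only the popular value 0 shows up,
-- sometimes one of the unpopular values 1 or 2 gets caught as well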

If the old behaviour gets re-activated, the results are as expected with the same set of statistics:

SQL>
SQL> -- Switch off new behaviour
SQL> alter session set "_fix_control" = '5483301:off';

Session altered.

SQL>
SQL> -- The popular value
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status = 0
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 999K| 102M| 4319 (2)| 00:00:52 |
|* 1 | TABLE ACCESS FULL| T1 | 999K| 102M| 4319 (2)| 00:00:52 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T_STATUS"=0)

13 rows selected.

SQL>
SQL> -- A non-existing value
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status = 1000
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 546753835

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 107 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 1 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T_STATUS"=1000)

14 rows selected.

SQL>
SQL> -- Two non-existing values
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status in (1000, 2000)
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3743026710

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 5 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 107 | 5 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 1 | | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T_STATUS"=1000 OR "T_STATUS"=2000)

15 rows selected.

Switching back to the former treatment, where non-existent values lead to an estimate of 1, fixes this issue. However, since this has a potential impact on every execution plan, thorough regression testing would be required before using it as a global setting.
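If a global change is considered too risky, the fix control can also be disabled for an individual statement via the OPT_PARAM hint - a sketch, to be verified on your particular version:

select /*+ opt_param('_fix_control' '5483301:off') */
       *
from   t1
where  t_status = 1000;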

Increasing the sample size is another option, provided it ensures that the unpopular values get caught by the histogram generation, so that the least popular value in the histogram is one with a low cardinality. Note however that this increases the amount of work necessary to gather the histogram statistics.

SQL>
SQL> -- Gather statistics with 100% sample size
SQL> exec dbms_stats.gather_table_stats(null, 'T1', estimate_percent => null, method_opt => 'for all columns size 1 for columns t_status size 254')

PL/SQL procedure successfully completed.

SQL>
SQL> -- Now the statistics are more representative
SQL> select
2 column_name
3 , num_distinct
4 , sample_size
5 from
6 user_tab_col_statistics
7 where
8 table_name = 'T1';

COLUMN_NAME NUM_DISTINCT SAMPLE_SIZE
--------------- ------------ -----------
ID 1000000 1000000
T_STATUS 3 1000000
VC 1 1000000

SQL>
SQL> -- The histogram now covers also the unpopular values
SQL> select
2 column_name
3 , endpoint_number
4 , endpoint_value
5 from
6 user_tab_histograms
7 where
8 table_name = 'T1'
9 and column_name = 'T_STATUS';

COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------- --------------
T_STATUS 999980 0
T_STATUS 999990 1
T_STATUS 1000000 2

SQL>
SQL> -- Switch back on new behaviour
SQL> alter session set "_fix_control" = '5483301:on';

Session altered.

SQL>
SQL> -- A non-existing value
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status = 1000
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 546753835

--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 4 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 107 | 4 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | T1_IDX | 1 | | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T_STATUS"=1000)

14 rows selected.

SQL>
SQL> -- Two non-existing values
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status in (1000, 2000)
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3743026710

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 107 | 5 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 107 | 5 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | T1_IDX | 1 | | 4 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("T_STATUS"=1000 OR "T_STATUS"=2000)

15 rows selected.

As suggested by Jonathan Lewis in the OTN thread, another elegant solution to the problem of searching for non-existent values is to add a virtual column that filters out the popular values. Where applicable, this approach has several advantages: the index to maintain is very small, which minimizes the maintenance effort (and might also address index-efficiency issues in case of frequent updates to the column), and it solves the histogram issue, since the histogram will only cover the unpopular values:

SQL>
SQL> -- Add a virtual column (the same could be achieved using a function-based index instead for pre-11g versions)
SQL> alter table t1 add (t_status_unpop as (case when t_status != 0 then t_status end));

Table altered.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'T1', method_opt => 'for columns t_status_unpop size 254')

PL/SQL procedure successfully completed.

SQL>
SQL> -- Since the virtual column only covers
SQL> -- the unpopular values, the histogram
SQL> -- will be very precise even with a low sample size
SQL> select
2 column_name
3 , num_distinct
4 , sample_size
5 from
6 user_tab_col_statistics
7 where
8 table_name = 'T1';

COLUMN_NAME NUM_DISTINCT SAMPLE_SIZE
--------------- ------------ -----------
ID 1000000 1000000
T_STATUS 3 1000000
VC 1 1000000
T_STATUS_UNPOP 2 20

SQL>
SQL> -- The histogram only covers the unpopular values
SQL> select
2 column_name
3 , endpoint_number
4 , endpoint_value
5 from
6 user_tab_histograms
7 where
8 table_name = 'T1'
9 and column_name = 'T_STATUS_UNPOP';

COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------- --------------
T_STATUS_UNPOP 10 1
T_STATUS_UNPOP 20 2

SQL>
SQL> -- Two non-existing values
SQL> -- So we don't run into the same problem here
SQL> -- The estimate is reasonable for non-existing values
SQL> explain plan for
2 select
3 *
4 from
5 t1
6 where
7 t_status_unpop in (1000, 2000)
8 ;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3617692013

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 108 | 4470 (6)| 00:00:54 |
|* 1 | TABLE ACCESS FULL| T1 | 1 | 108 | 4470 (6)| 00:00:54 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("T_STATUS_UNPOP"=1000 OR "T_STATUS_UNPOP"=2000)

13 rows selected.

I haven't created an index on the virtual column, but the estimate is correct and a suitable index would get used if it existed.
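For completeness, a sketch of the missing index part (the index name is my own):

create index t1_idx_unpop on t1 (t_status_unpop);

Queries then have to filter on the virtual column (e.g. "where t_status_unpop in (1000, 2000)") for the index and the precise histogram to be picked up.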

Dutch Oracle University DBA week

Just a quick note about the DBA week that Oracle University in The Netherlands has organized which will run in the week of April 19th. During this week several one-day seminars are scheduled and presented by Jonathan Lewis, Joel Goodman and myself. Jonathan will talk about Troubleshooting and Tuning, Joel will talk about RAC and [...]

Online Materialized View Complete Refresh With Partitions

The other day I was posed with an interesting problem regarding the refresh of the materialized views. First, a little background. MVs are nothing new in Oracle; they have been around since Oracle 7 (called snapshots then). Essentially they are similar to tables (as they actually store data) but populated by running a query on some source tables. This query is the defining query of the MV. So, the user can select from the MV instead of executing the defining query – reducing the execution time significantly.

However, the MV and the query give the same result only if the source tables have not changed. If the data has changed, the MV will not know about it and will give a stale result. Therefore, you need to refresh the MV from time to time. Usually the MVs are refreshed by the following method:

begin
   dbms_mview.refresh('<mv_name>');
end;
/

So, what's the problem? The problem is that during complete refreshes, the MV remains unavailable to the users. The duration of the refresh depends on the size of the MV, the execution time of the underlying query and so on; and the unavailability is a nuisance for most applications. In some cases the applications are designed to time out after some wait; and they do so quite consistently in this case - making it more than just a nuisance.

So, I was asked to devise a solution to make the MVs available during the complete refreshes. I accomplished that with a not-so-common use of partitions. Here is how I did it.

Prebuilt Table

First, as an architectural standard, I advocate the use of prebuilt tables in MVs. Here is how an MV is usually created:

create materialized view mv_bookings_01
refresh complete on demand
enable query rewrite
as
select hotel_id, sum(book_txn) tot_txn
from bookings
group by hotel_id
/

This creates the segment mv_bookings_01 during this process. Instead of this approach, I recommend first creating and populating a regular table and then converting that to an MV. In this revised approach, you would first create the table:

create table mv_bookings_01
as
select HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by HOTEL_ID
/


Then, you would create the MV using the prebuilt table option:

create materialized view mv_bookings_01
on prebuilt table
refresh complete on demand
enable query rewrite
as
select HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by HOTEL_ID
/

This does not do anything to the functionality of the MV itself. The MV feels, looks and smells like it did before; but with two significant advantages:

(1) You can easily alter the columns
(2) You can manipulate the table

I blogged about it at http://arup.blogspot.com/2009/01/ultra-fast-mv-alteration-using-prebuilt.html. Anyway, back to the issue at hand. Using the prebuilt trick, I can create a partitioned MV as well. In this specific case, I will add a column to partition on. Remember, this column does not actually participate in the application; it simply creates an anchor point for the partition.

create table mv_bookings_01
(
ref_type number,
hotel_id number(3),
tot_txn number
)
partition by list (ref_type)
(
partition p0 values (0),
partition p1 values (1)
)
/

After the table is created, I can insert data into it:

insert /*+ append */ into mv_bookings_01
select 0, hotel_id, sum(book_txn) tot_txn
from bookings
group by 0, hotel_id
/

Of course I could have selected the data from the source tables directly while creating the prebuilt table; but I just wanted to show a different approach to data loading. The column REF_TYPE is not part of the application; I have added it to divide the table into two list partitions. The column can hold only two values - 0 and 1 - and hence the table has only two partitions.

Once the table is created, I can use the prebuilt table option to create the MV as shown above. However, the presence of the new column makes it a little different. My MV creation script now looks like this:

create materialized view mv_bookings_01
on prebuilt table
enable query rewrite
as
select 0 as ref_type, HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by 0, HOTEL_ID
/

Note that I have used “0 as ref_type” in the select clause, i.e. ref_type will always be 0. This is not going to be a problem, as you will see later. When the MV is first created, the value of ref_type is 0; hence partition p0 is the one that is populated, not partition p1. We can confirm that:

SQL> select count(1) from mv_bookings_01 partition (p0);

COUNT(1)
----------
2

SQL> select count(1) from mv_bookings_01 partition (p1);

COUNT(1)
----------
0

Now that the partitions are in place, let’s see how we refresh this MV. We will no longer use the dbms_mview approach. Rather, we will use this:

(1) Create a temporary table
(2) Create all the indexes
(3) Exchange the partition that is not used with this temporary table

Step 1: Temporary table

First we create a temporary table that is identical in structure to the prebuilt table of the MV, just not partitioned. For the ref_type column we use 0 or 1 based on the value already in the table - in fact, exactly the opposite of what is in the table. Since we have 0 in the table, we use 1 in the temporary table. This temporary table will contain the refreshed data.

create table tmp1
as
select 1 as ref_type, hotel_id, sum(book_txn) tot_txn
from bookings
group by 1, hotel_id
/

We also need to create another temporary table with no data.

create table tmp0
(
ref_type number,
hotel_id number(3),
tot_txn number
)
/

Step 2: Create Indexes

In this case we don’t have any indexes on the MV; but if we had, we would have created them here on the TMP0 and TMP1 tables.
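As a sketch (the index names are mine, not from the original system): if the MV's prebuilt table had an index on, say, HOTEL_ID, matching indexes would be created on the temporary tables so that the "including indexes" clause of the exchange can swap them:

create index tmp1_hotel_id on tmp1 (hotel_id);
create index tmp0_hotel_id on tmp0 (hotel_id);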

Step 3: Exchange Partitions

When the temporary table is ready, we can exchange the inactive partition with it.

alter table mv_bookings_01
exchange partition p1
with table tmp1
including indexes
without validation
update global indexes
/
alter table mv_bookings_01
exchange partition p0
with table tmp0
including indexes
without validation
update global indexes
/

This SQL performs a switch: the segment that was partition P1 of table mv_bookings_01 is now called TMP1, and the former TMP1 segment is now partition P1 of mv_bookings_01. This occurs at the data dictionary level; no actual data is ever transferred from one segment to the other. As a result this is very quick. A lock is required on the table for such an insignificant amount of time that it may not even be noticeable. Similarly, the segment that used to be partition P0 is now the table TMP0, and the former TMP0 is now partition P0.

There are several key things to note here. Note the clause “without validation”, which tells Oracle not to bother checking that the rows in the exchanged table conform to the partition specification. This is critical for performance. Since we built TMP1 with “1” as a hardcoded value, we know that the REF_TYPE column will definitely contain 1, satisfying the partition requirement for P1. Further checking is not required.

The “including indexes” clause switches the local index partitions as well. If there were a global index on the MV's prebuilt table, it would have been invalidated and would have needed rebuilding afterwards; we avoided that invalidation by using the “update global indexes” clause in the alter statement.

Once the switchover is complete, we can check the MV to see that the data is visible.

SQL> select count(1) from mv_bookings_01 partition (p0)
2 /

COUNT(1)
----------
0

SQL> select count(1) from mv_bookings_01 partition (p1)
2 /

COUNT(1)
----------
2

This completes the refresh of the MV and the data is visible to end users. Let's review the timeline:

(1) Building of temporary table
(2) Building of indexes on the temporary table
(3) Exchange partitions

Steps 1 and 2 take most of the time; but they are done offline, without affecting the MV itself, so their duration does not affect the availability of the MV. Step 3 is where availability is impacted; but that window is minuscule.

The table TMP0 now has the rows from partition P0 of the MV. To confirm that:

SQL> select count(1) from tmp0;

COUNT(1)
----------
2

You can drop the table with “drop table tmp0 purge”. You should also drop TMP1, since that table now contains the previous contents of partition P1, which are useless now.

The next time this MV needs refreshing, you repeat the process; but with a very important distinction - this time partition P0 needs the exchange. I used a script that checks the current value of REF_TYPE in the MV and passes the opposite value to a SQL script as a positional parameter that accepts 0 or 1, as sketched below. Using this approach you can refresh an MV very quickly.
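A minimal sketch of that check (the substitution variable name is my own):

column next_ref new_value next_ref noprint
select decode(max(ref_type), 0, 1, 0) next_ref
from mv_bookings_01;

-- build tmp&next_ref with &next_ref as the hardcoded REF_TYPE value
-- and exchange it with partition p&next_ref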

Oracle 11g Variation

In Oracle 11g there is a new feature - virtual columns. You can define a column as virtual in the table; it is not actually stored, but calculated at runtime. What's more, a virtual column can also be used as a partition key. In that case you could define REF_TYPE as a virtual column.
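A hedged sketch of what that could look like - the BATCH_ID column and the MOD-based derivation are my assumptions, not part of the original design:

create table mv_bookings_01
(
  batch_id number,  -- incremented by each refresh load
  hotel_id number(3),
  tot_txn  number,
  ref_type number generated always as (mod(batch_id, 2)) virtual
)
partition by list (ref_type)
(
  partition p0 values (0),
  partition p1 values (1)
)
/

The load then only has to supply an alternating BATCH_ID; the partitioning key takes care of itself.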

Takeaways

(1) Just because there is a Materialized View, it does not have to be refreshed by the usual approach, i.e. dbms_mview.refresh. There are other options as well.
(2) Partitioning is not just for performance or purge reasons; it can be used in clever ways such as MV refreshes.
(3) In this specific case, we didn’t have a column that could have been used as a partitioning key; so we had to resort to adding a column. In many cases, the table may already have a column for that purpose, making the process even simpler.

Non-trivial performance problems

Gwen Shapira has written an article about a good example of a non-trivial performance problem.

I’m not talking about anything advanced here (such as bugs or problems arising at the OS/Oracle touchpoint), but sometimes the root cause of a problem (or at least the reason why you notice the problem now) is not something deeply technical or related to some specific SQL optimizer feature or configuration issue. Instead of focusing on the first symptom you see, it pays off to take a step back and look at how the problem task/application/SQL is actually used by the users or client applications.

In other words, talk to the users, ask how exactly they experience the problem and then drill down from there.


Daylight Saving Time Causes Performance Issues!

Since I’m starting employment with Pythian on Monday, this is the last technical post that is exclusive to this blog. Future technical tips, discoveries and stories will appear on the Pythian blog, and will be either copied here or linked to from here. BTW, I already made my first post on the Pythian blog today, as the editor of the Log Buffer - and I’m not even an employee yet :)

So, let's end this with a bang: a real-life story of how Daylight Saving Time caused one query to take twice as long as before! And it's not even an April Fools' joke :)

Three days ago the customer called me and said: Listen, since daylight saving time started, this process that should take two hours is now taking five. My manager gets his morning report late and he is unhappy.

I told him that I'd be glad to help with his performance issue, but that it can't possibly be related to daylight saving time. Besides, since the process runs every night but some nights the performance is fine, the problem is probably in the nature of the data, which changes from day to day.

In any case, we started working on the problem with our usual methods: instrumented the PL/SQL of the process so we would know exactly which part was slowing it down, ran traces, analyzed the results, looked at AWR reports, looked at query plans, etc.

Eventually, we found a query, buried within a view, whose explain plan looked completely unreasonable: first, it included a bunch of nested loops and even a cartesian merge join between tables that I knew were quite large; second, those large tables had an estimated row count of 1.

I checked DBA_TABLES and saw that num_rows was indeed 0, and last_analyzed was yesterday at 10pm. This didn't make any sense - I had personally collected statistics for this table yesterday at 3pm, when it had around 1M rows.

Why does it show 0 rows at 10pm?

A quick check with the customer confirmed that the process truncates this table and copies data into it before our slow query runs. I assumed that "truncate table" resets the statistics, and asked the customer to modify the process to analyze the table after all the data is copied in, but before the slow query runs.
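In outline, the modified process would look like this (a sketch; WORK_TABLE and SOURCE_TABLE are placeholder names):

truncate table work_table;

insert /*+ append */ into work_table
select * from source_table;

commit;

exec dbms_stats.gather_table_stats(user, 'WORK_TABLE')

-- only now run the slow query, with statistics that reflect the loaded data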

But two questions still bothered me: does truncate really reset statistics? And why did this problem start occurring after daylight saving time?

The answer to the first question is “no”, truncate does not reset statistics:

SQL>  analyze table  sitescope.test1 compute statistics;
Table analyzed.

SQL> select table_name,num_rows from dba_tables where owner='SITESCOPE' and table_name='TEST1';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
TEST1                                   4

SQL> truncate table sitescope.test1;
Table truncated.
SQL> select table_name,num_rows from dba_tables where owner='SITESCOPE' and table_name='TEST1';
TABLE_NAME                       NUM_ROWS
------------------------------ ----------
TEST1                                   4

The answer to the second question was found using DBA_TAB_STATS_HISTORY.
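Something along these lines (a sketch; owner and table name are placeholders):

select table_name, stats_update_time
from dba_tab_stats_history
where owner = 'APP_OWNER'
and table_name = 'PROBLEM_TABLE'
order by stats_update_time;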

It turns out that there is an automated job that collects statistics on this schema every night. Until DST, it ran every night at 9pm, before the process started running - so the table had data and the collected statistics were accurate. After DST, the job started running at 10pm (obviously the system running it doesn't know about DST). At 10pm the table had already been truncated but not yet loaded, so the job collected statistics for a temporarily empty table.

Tada! Daylight saving time caused a severe performance problem!

P.S. No indexes were rebuilt while attempting to resolve the issue :)

Analyze This – 2

If you’ve run the scripts from Analyze This, I hope you found that the query gave you two different execution plans. This is my output from the test (with a little cosmetic tidying):

=============
Using Analyze
=============

Execution Plan
----------------------------------------------------------
Plan hash value: 1838229974

-------------------------------------------------------------------
| Id | Operation | Name | [...]

calculating the SQL_ID hash from the SQL_TEXT

This is really cool: he explains how to calculate the SQL_ID from the SQL_TEXT. The SQL_ID is just a hash of the text.

Analyze this

Here’s a little script I wrote a few years ago to make a point about using the dbms_stats package. I’ve just re-run it on 10.2.0.3 to see if it still behaves the way it used to – and it does. If you want to be just a little bit baffled, set up a database with [...]

Oracle Plan Stability (Stored Outlines) in PeopleSoft Global Payroll

A longer version of this posting, with experimental results, is available on my website.

In PeopleSoft for the Oracle DBA, I wrote a page (p. 291) explaining why stored outlines were not suitable for use in PeopleSoft.  Five years later, my view has not significantly changed.  Essentially, stored outlines work best with shared SQL, and there isn't much shared SQL in PeopleSoft, because a lot of it is dynamically generated.

  • SQL generated by the component processor is dynamic: at save time, only fields that have changed are updated.
  • PeopleCode can be written in such a way that where clauses are dynamically assembled.
  • nVision reports have variable numbers of criteria on literal tree node IDs in the queries.
  • By default in Application Engine, bind variables are converted to literals before the SQL is submitted to the database.  Even if this is overridden by enabling ReUseStatement in Application Engine or by using Cursor Sharing in the database, the code still wouldn’t be sharable.  Different instances of Application Engine executing the same program can use different instances of non-shared temporary records, so the tables in otherwise identical SQL statements are not the same.  You would get one version of the statement per temporary table instance.

However, there are very limited exceptions to this rule (otherwise I wouldn't have a story to tell).  The SQL in COBOL and SQR programs is more likely to be shareable.  Although some programs are coded to generate SQL dynamically, bind variables are passed through to the SQL statements, and they use regular tables for working storage rather than PeopleSoft temporary records.
A Global Payroll customer came to me with a problem where the payroll calculation (GPPDPRUN) would usually run well, but sometimes the execution plan of a statement would change and the calculation would take several additional hours.  It is significant that the Global Payroll engine is written in COBOL.  My usual response to this sort of problem in Global Payroll is to add a hint to the stored statement.  Usually, I find that only a few statements are affected.  However, after this happened a couple of times in production, it was clear that we couldn't continue to react to these problems; we needed to proactively stop this happening again.  This is exactly what stored outlines are designed to do.

Using Stored Outlines in the PeopleSoft GP Engine

Earlier I said that we could apply stored outlines to the Global Payroll calculation engine (GPPDPRUN) because it generally doesn’t use dynamic code with embedded literal values.  
While outlines are being created, the following privilege needs to be granted.  It can be revoked later.

GRANT CREATE ANY OUTLINE TO SYSADM;

We can create a trigger to collect the stored outlines for a payroll calculation, thus:

  • The trigger fires when a payroll calculation process starts or finishes. 
  • At the start of a payroll process, it starts collecting stored outlines in a category named after the process: GPPDPRUN.
  • When the process finishes, outline collection is disabled by setting it back to false.
CREATE OR REPLACE TRIGGER sysadm.gfc_create_stored_outlines
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.prcsname = 'GPPDPRUN' AND (new.runstatus = 7 OR old.runstatus = 7))
DECLARE
  l_sql VARCHAR2(100);
BEGIN
  l_sql := 'ALTER SESSION SET create_stored_outlines = ';
  IF :new.runstatus = 7 THEN
    EXECUTE IMMEDIATE l_sql||:new.prcsname;
  ELSIF :old.runstatus = 7 THEN
    EXECUTE IMMEDIATE l_sql||'FALSE';
  END IF;
EXCEPTION
  WHEN OTHERS THEN NULL; -- because I don't want to crash the process scheduler
END;
/

The exact number of outlines collected during this process will vary depending upon configuration and upon which employees are processed, as different payroll rules are invoked.
If no more outlines are to be collected, the CREATE ANY OUTLINE privilege can be revoked.  This does not prevent the outlines from being used.

REVOKE CREATE ANY OUTLINE FROM SYSADM;

Then the category of outlines can be used in subsequent executions by replacing the trigger above with the one below; the execution plans cannot change as long as the SQL doesn't change.

CREATE OR REPLACE TRIGGER sysadm.gfc_use_stored_outlines
BEFORE UPDATE OF runstatus ON sysadm.psprcsrqst
FOR EACH ROW
WHEN (new.prcsname = 'GPPDPRUN' AND (new.runstatus = 7 OR old.runstatus = 7))
DECLARE
  l_sql VARCHAR2(100);
BEGIN
  l_sql := 'ALTER SESSION SET use_stored_outlines = ';
  IF :new.runstatus = 7 THEN
    EXECUTE IMMEDIATE l_sql||:new.prcsname;
  ELSIF :old.runstatus = 7 THEN
    EXECUTE IMMEDIATE l_sql||'FALSE';
  END IF;
EXCEPTION
  WHEN OTHERS THEN NULL; -- because I don't want to crash the process scheduler
END;
/

After running an identify-and-calc and a cancel, we can see how many of the outlines are actually used.

SELECT category, count(*) outlines
, sum(decode(used,'USED',1,0)) used
FROM user_outlines
GROUP BY category
ORDER BY 1
/

I have a large number of unused outlines because of additional recursive SQL generated while OPTIMIZER_DYNAMIC_SAMPLING was set to 4. This does not occur when the parameter is set to its default of 2.

CATEGORY                         OUTLINES       USED
------------------------------ ---------- ----------
GPPDPRUN                              572        281

I can then remove the unused outlines.

EXECUTE dbms_outln.drop_unused;

The used flags on the outlines can be reset, so that later we can see the outlines being used again.

BEGIN
FOR i IN (SELECT * FROM user_outlines WHERE category = 'GPPDPRUN') LOOP
dbms_outln.clear_used(i.name);
END LOOP;
END;
/

If I want to go back to running without outlines, I just disable the trigger:

ALTER TRIGGER sysadm.gfc_use_stored_outlines DISABLE;

To re-enable outlines, just re-enable the trigger:

ALTER TRIGGER sysadm.gfc_use_stored_outlines ENABLE;

Conclusions

Stored Outlines have very limited application in a PeopleSoft system.  However, they can easily be collected and used with the PeopleSoft Global Payroll engine.  It is just a matter of granting a privilege and using the database triggers on the process request table.
Testing that they actually have the desired effect is quite difficult, because you are trying to prove a negative.  I don’t think it is adequate simply to say that the outline has been used.

  • First you would need an environment where payroll calculation performs well, where you could collect outlines.
  • Then you would need a payroll calculation that performs poorly because the execution plan for at least one SQL statement is different
    • Either, on a second environment with exactly the same code.
    • Or in the same environment on a different set of data.
  • Then, it would be possible to demonstrate that applying the outline causes the execution plan to revert and restores the performance.  This can be confirmed by comparison of the ASH data for the various scenarios.

Even if you don’t want to use stored outlines immediately, it might be advantageous to collect them and have them available for when you do encounter a problem.