Tasks that are performed via jobs in the database will be double accounted in the system time model that has been introduced with Oracle 10g.
So if you execute significant workload via DBMS_JOB or DBMS_SCHEDULER any system time model related statistic like DB Time, DB CPU etc. that gets recorded for that workload gets double accounted.
This bug is not particularly relevant since your top workloads will still be the same top workloads, because all other statistics (like Elapsed Time, CPU, Buffer Gets etc.) are not affected by the bug.
I mention it only here since the bug (see below for details) as of the time of writing can't yet be found on My Oracle Support in the bug database but I recently came across several AWR reports where the majority of workload was generated via job processes and therefore the time model statistics were effectively doubled.
It might help as a viable explanation if you sometimes wonder why an AWR or Statspack report only captures 50% or less of the recorded total DB Time or DB CPU and where this unaccounted time has gone. If a significant part of the workload during the reporting period has been performed by sessions controlled via DBMS_JOB or DBMS_SCHEDULER then probably most of the unaccounted time is actually not unaccounted but the time model statistics are wrong.
So if you have such an otherwise unexplainable unaccounted DB Time / DB CPU etc. you might want to check if significant workload during the reporting period was executed via the job system. Note that I don't say that this is the only possible explanation of such unaccounted time - there might be other reasons like uninstrumented waits, other bugs etc.
Of course all the percentages that are shown in the AWR / ADDM / Statspack reports that refer to "Percentage of DB Time" or "Percentage of DB CPU" will be too small in such cases.
One of my clients runs a Data Warehouse with several important data loading and transformation procedures.
One of those key procedures had to be executed in a kind of a loop since many of the transformations steps depend on each other's result.
So although each transformation step was executed in an rather efficient way using Oracle best-practices like parallel DML / exchange partition still the iterative approach meant that the underlying server wasn't used as efficiently as possible since many of those transformations steps also included smaller data sets that didn't use all of the resources available.
In order to address that they've recently introduced a more sophisticated version of that key procedure that uses some clever algorithm based on a dependency tree to determine which steps could be executed in parallel since they do not depend on each other.
Now as part of that optimization they've worked out that on their particular hardware they could run as much as 25 transformation steps in parallel and furthermore up to four of those 25 as parallel execution steps.
So you end up in principle with two queues - one is dequeued by four consumers processing "big" slices via parallel execution and a second queue is dequeued by approx. 20 consumers each processing "small" slices using serial execution (It is quite a big "iron" hardware).
This allowed them to increase the resource utilization of the available server hardware and resulted in a significant drop in overall runtime - actually they saved more than 50% of the original runtime which is quite a good result given that they have achieved that on exactly the same hardware!
Query transformation is a set of techniques used by the optimizer to rewrite a query and optimizer it better. Few optimization paths open up to the optimizer after query transformation. Some query transformations must be costed to be chosen and some do not need to be costed. For example, if a table can be eliminated completely from the join, then that transformation is applied and need to cost that transformation is minimal.
Test case
We will use the following test case to illustrate the concepts behind Query transformation. Some of the optimizations that we see here works from version 11gR1 onwards and so, these test cases might not work in the versions 10g and below.
create table backup.t1 (n1 number not null primary key, n2 number not null, n3 varchar2(256) ); insert into backup.t1 select n1, n1, lpad ( n1, 250,'x') from (select level n1 from dual connect by level <=100); create table backup.t2 (n1 number not null primary key , n2 number not null, n3 varchar2(256) ); alter table backup.t1 add constrainT t1_fk foreign key (n2) references backup.t2(n1) insert into backup.t2 select n1, n1, lpad ( n1, 250,'x') from (select level n1 from dual connect by level <=100); insert into backup.t1 select n1, n1, lpad ( n1, 250,'x') from (select level n1 from dual connect by level <=100);
Join elimination (JE)
In RAC, every data block is mastered by an instance. Mastering a block simply means that master instance keeps track of the state of the block until the next reconfiguration event (due to instance restart or otherwise).
Hash to the master
These data blocks are mastered in block ranges. For example, range of blocks starting from file 10, block 1 through block 128 may be mastered by instance 1, blocks from file 10, block 129 through 256 are mastered by instance 2 etc. Of course, there are differences between various versions 10g, 11g etc, but Idea here is that block ranges are uniformly mastered between various instances so that Global cache grants are evenly distributed among the instances. Interestingly, length of the block range is 128 from 10g onwards (Julian Dyke mentioned that is 1089 in 9i, but I have not personally tested it). Of course, Support recommends you to unset db_file_multiblock_read_count which will be auto adjusted to 128 which means that Full block range can be read with fewer GC messages, I suppose. I digress.
Further, Michael Möller pointed out that this hash-algorithm is further optimized: The hash-algorithm used when initially computing the master node from the DBA, results in a “virtual master”, which is then translated to a real (online&open) master by a lookup table (the length of which is the maximum number of possible nodes (128 ?). This means that when one node goes off/on-line, RAC does NOT have to recalculate the hash for all blocks, but only distribute the new Hash-to-node table. (One can later visualize dynamic remastering as an additional lookup table between the hash value and node. This table also needs redistributing on node changes.)
A devastating performance degradation of materialized view fast refreshes can happen in versions after 9i - and can be healed rather easily by simply setting the hidden parameter _mv_refresh_use_stats or, a bit surprisingly, by locking statistics on the logs. The problem can manifest at least in the currently-latest patchsets of 10g, 11gR1 and 11gR2 (10.2.0.4, [...]
It is very disappointing to me that I had to cancel my trip to RMOUG training days. I am sick and was not able to catch the flight due to that.
But, I can always share my presentations here. I had two presentations planned in this training day and can be accessed as below:
Advanced RAC troubleshooting
Riyaj_Advanced_rac_troubleshooting_RMOUG_2010_doc
Riyaj_Advanced_rac_troubleshooting_RMOUG_2010_ppt
Why optimizer hates my sql
Riyaj_Why_optimizer_hates_my_sql_2010
RMOUG training days audience: Please accept my sincere apologies.
As a minor but useful new feature, xplan is now able to integrate into its report the DDL of tables (and indexes) referenced by the plan, calling dbms_metadata.get_ddl transparently.
This is mostly useful to get more details about referenced tables' constraints and partitions definition - to complement their CBO-related statistics that xplan reports about.
This feature [...]
I will be presenting on two topics in HOTSOS’ 2010, an Oracle performance focused conference, in my home town Dallas, Texas. You can read about my presentation topics in HOTSOS’ 2010 Riyaj . This symposium is a valued conference for performance engineers, DBAs and developers who are interested to know learn about performance. There are many great speakers presenting in this conference and the main page for this conference is HOTSOS ’2010 . BTW, My friend Alex Gorbachev interviewed Gary Goodman and posted a video in his blog also.
Tanel Poder is conducting the HOTSOS training day this year. You can’t miss his training day and I heard that he is working on a MOTS (Mother Of all Tuning Script) and planning to release that in HOTSOS ’2010.
On behalf of the Dallasites, I invite you to visit Dallas and attend this great conference.
I wish you a Happy and prosperous New Year ’2010!
Global cache performance metrics are not correctly measured. It is not understood clearly either. There are even few blogs and web pages disseminating incorrect information. This blog entry is an attempt to offer few methods and scripts to understand global cache performance.
Always review all instances
It is very important to review the performance metrics from all instances in that RAC cluster, not just one instance that you are connected. If you have access to AWR reports, then it is critical to generate AWR reports (or statspack reports) from all instances. But, the problem is that, DBAs tend to generate AWR reports after logging in to each instance iteratively, enter couple of parameters and then reports are generated. Not exactly a convenient practice.
REM connect to each instance separately, type in the beginning snap_id and ending snap_id for each node etc.. sqlplus mydba@proddb1 @$ORACLE_HOME/rdbms/admin/awrrpt.sql exit; sqlplus mydba@proddb2 @$ORACLE_HOME/rdbms/admin/awrrpt.sql exit; sqlplus mydba@proddb3 @$ORACLE_HOME/rdbms/admin/awrrpt.sql exit;
There are few issues with this approach. It is a cumbersome practice if the instance count is higher. In addition to that, all of AWR reports are, in turn, accessing underlying AWR tables. Physically, rows from all instances are together in the same block and so, by executing these reports connecting to various instances, Global cache traffic is increased. If the database is suffering from Global cache (GC) performance issues then generating reports connecting to various instances is probably not a grand idea.
This post investigates the improvements that have been made in 11gR2 to the fast refresh engine of materialized views (MVs) that are set to be automatically refreshed at commit time. We speak about join-only materialized views only in this post, as always with the help of a test case.
As noted in the post of mine [...]
Recent comments
6 days 10 min ago
6 days 6 hours ago
3 weeks 5 days ago
11 weeks 5 days ago
14 weeks 5 days ago
19 weeks 1 day ago
19 weeks 1 day ago
19 weeks 3 days ago