Data visualization is a useful method to identify performance patterns. In most cases, I pull custom performance metrics from AWR repository and use tableau to visualize the data. Of course, you can do the visualization using excel spreadsheet too.
We had huge amount of PX qref waits in a database:
After collaborating with many performance engineers in a RAC database, I have come to realize that there are common pattern among the (mis)diagnosis. This blog about discussing those issues. I talked about this in Hotsos 2014 conference also.
Here are the golden rules of RAC performance diagnostics. These rules may not apply general RAC configuration issues though.
Looks like, this may be better read as a document. So, please use the pdf files of the presentation and a paper. Presentation slide #10 shows indepth coverage on gc buffer busy* wait events. I will try to blog about that slide later (hopefully).
I blogged about Dynamic Resource Mastering (DRM) in RAC here . DRM freezes the global resources during the reconfiguration event and no new resources can be allocated during the reconfiguration. This freeze has a dramatic effect of inducing huge amount of waits for gc buffer busy [acquire|release] events and other gcs drm freeze release, gcs remaster events. In database version 12c, DRM has been improved further.
A major improvement I see is that not all resources are frozen at any time. Essentially, resources are broken down in to partitions and only a resource partition is frozen. This improvement should decrease the impact of DRM related waits tremendously.
LMON Trace file
I will be presenting in HOTSOS symposium 2014 discussing correct methods to diagnose RAC performance issues. Very surprisingly, even very senior performance engineers make mistakes in their analysis while reviewing RAC issues. Come to my presentation and learn the golden rules of RAC performance diagnostics.
It is easier to create one or two AWR reports quickly using OEM. But, what if you have to create AWR reports for many snapshots? For example, your Oracle support analyst wants you to supply 10 1-hour AWR reports from 10AM to 8PM in a 8 node cluster? That’s about 80 AWR reports to create! Okay, okay, I may(!) be overselling it, but you get the point. It is useful to have a script to create AWR report for all instances for a given range of snapshot IDs. Following scripts are handy:
I blogged about DFS lock handle contention in an earlier blog entry. SV resources in Global Resource Directory (GRD) is used to maintain the cached sequence values. I will further probe the internal mechanics involved in the cached sequences. I will also discuss minor changes in the resource names to support pluggable databases (version 12c).
Let’s create an ordered sequence in rs schema and then query values from the sequence few times.
create sequence rs.test_seq order cache 100; select rs.test_seq.nextval from dual; -- repeated a few times. ... / 21
Sequence values are permanently stored in the seq$ dictionary table. Cached sequence values are maintained in SV resources in GRD and SV resource names follows the naming convention to include object_id of the sequence. I will generate a string using a small helper script and we will use that resource name to search in the GRD.
A quick note, Expert Oracle RAC book co-written by me is available now: Expert Oracle RAC 12c. I have written about 6 chapters covering the RAC internals that you may want to learn :) I even managed to discuss the network internals in deep, after all, network is one of the most important component of a RAC cluster.
This blog entry is to discuss a method to identify the objects inducing higher amount of redo. First,we will establish that redo size increased sharply and then identify the objects generating more redo. Unfortunately, redo size is not tracked at a segment level. However, you can make an educated guess using ‘db block changes’ statistics. But, you must use logminer utility to identify the objects generating more redo scientifically.
Detecting redo size increase
AWR tables (require Diagnostics license) can be accessed to identify the redo size increase. Following query spools the daily rate of redo size. You can easily open the output file redosize.lst in an Excel spreadsheet and graph the data to visualize the redo size change. Use pipe symbol as the delimiter while opening the file in excel spreadsheet.
A new major release (version 3.0) of my XPLAN_ASH tool is available for download.
In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also supports S-ASH, the free ASH implementation.
If you run XPLAN_ASH in a S-ASH repository owner schema, it will automatically detect that and adjust accordingly.
XPLAN_ASH was tested against the latest stable version of S-ASH (2.3). There are some minor changes required to that S-ASH release in order to function properly with XPLAN_ASH. Most of them will be included in the next S-ASH release as they really are only minor and don't influence the general S-ASH functionality at all.
A new version 2.0 of the XPLAN_ASH utility introduced here is available for download.You can download the latest version here.The change log tracks the following changes:- Access check- Conditional compilation for different database versions- Additional activity summary- Concurrent activity information (what is/was going on at the same time)- Experimental stuff: Additional I/O summary- More pretty printing- Experimental stuff: I/O added to Average Active Session Graph (renamed to Activity Timeline)- Top Execution Plan Lines and Top Activities added to Activity Timeline- Activity Timeline is now also shown for serial execution when TIMELINE option is specified- From 184.108.40.206 on: We get the ACTUAL DOP from the undocumented PX_FLAGS colu