Oakies Blog Aggregator

Performance Metrics and XML Reporting in PeopleSoft

I am working with a PeopleSoft system that makes extensive use of XML Publisher reporting.  Generally these reports are based on SQL queries that are defined in the PeopleSoft PS/Query utility. 

These queries are run by an Application Engine program PSXPQRYRPT that runs the SQL query and produces the report in a single process.  This is a generic application engine program that runs any XML report.  Line 45 (in PeopleTools 8.49) of step MAIN.ExecRPT executes the report with the ProcessReport PeopleCode command.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
&oRptDefn.ProcessReport(&TemplateId, &Languaged, &AsOfDate, &oRptDefn.GetOutDestFormatString(&OutDestFormat));

Batch Timings
Analysis of the Application Engine batch timings indicate that nearly all the time in this Application Engine program is spent in PeopleCode, and that this not SQL execution time.  This is misleading.  The ProcessReport command is PeopleCode, but behind the scenes it also issues the SQL in the report data source.  Not all the time is SQL, but the Application Engine Batch Timings does not count any of this as SQL because it is not in the PeopleCode SQL Class. 

Let’s look at an example Batch Timings report (I have edited it down, removing zero and insignificant timings).

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
PeopleSoft Application Engine Timings
(All timings in seconds)

C o m p i l e E x e c u t e F e t c h Total
SQL Statement Count Time Count Time Count Time Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
PeopleCode
Record.SelectByKey PSDBFIELD 8 0.1 8 0.0 0.1
SELECT PSPRCSRQST 2 0.1 2 0.0 0.1
SELECT PSXPTMPLDEFN 3 0.1 3 0.0 0.1
--------
0.3
AE Program: PSXPQRYRPT
MAIN.ExecRpt.H 1 0.0 1 0.0 1 0.0 0.0
--------
0.0
------------------------------------------------------------------------------------------
Call Non-SQL SQL Total
PeopleCode Count Time Time Time
------------------------------ ------- -------- -------- --------
AE Program: PSXPQRYRPT
MAIN.ExecRpt 1 643.2 0.3 643.5
-------- -------- --------
643.2 0.3 643.5
------------------------------------------------------------------------------------------
E x e c u t e
PEOPLECODE Builtin/Method Count Time
------------------------------------------------------- ------- --------
Boolean(Type 5) BuiltIns 90 0.1
DateTime(Type 11) BuiltIns 1 0.1
SQL(Type 524290) Methods 19 0.1
SQL(Type 524290) BuiltIns 9 0.1
Record(Type 524291) Methods 1104 0.1
Session(Type 524303) Methods 207 633.2
JavaObject(Type 524315) BuiltIns 6 2.2
PostReport(Type 524324) Methods 2 0.7
------------------------------------------------------------------------------------------
Total run time : 644.0
Total time in application SQL : 0.3 Percent time in application SQL : 0.0%
Total time in PeopleCode : 643.2 Percent time in PeopleCode : 99.9%
------------------------------------------------------------------------------------------
  • The total execution time of is 644 seconds.
  • 643.2s are reported as being in PeopleCode, and as coming from the MAIN.ExecRpt step.
  • Only 0.3s of that is SQL time, and that comes from PeopleCode functions in the SQL Class  (CreateRecord , CreateSQL etc.)
  • The ProcessReport method does not appear in the PeopleCode SQL analysis section of the report, because it is not classed as SQL.

Identifying the Report ID and Report Source
So, the next question is how can I find out which report this process is running.  Different instances of this report may be running different queries.

You can get the list files generated by a process from the Content Management PeopleTools table PS_CDM_FILE_LIST.  Application Engine processes usually produce various log files (with extensions .aet, .trc and .log), the name of the other file is the same as the name of the report followed by an extension that will vary depending on format.

You can look at the report definition on-line under Reporting Tools -> XML Publisher -> Report Definition, and that will show you the Data Source ID

The Data Source is defined in a component accessible from the previous entry in the same menu.

Analysing Execution Times
I have written this query to aggregate execution time for PSXPQRYRPT by output file name. 

  • PSPRCSRQST is the process request table, from which I get the run time of the process.
  • PS_CDM_FILE_LIST lists the files generated by the process which would be posted to the Report Repository. If I exclude the usual trace files, I am left with the report ID.
  • PSXPRPTDEFN is the report definition record, from which I can get the data source ID
  • PSXPDATSRC specifies the data source.  A data source type (DS_TYPE) of QRY indicates a PeopleSoft PS/Query.  If it is a private query, the OPRID will have a value.
#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
column report_defn_id heading 'Report ID'      format a12 
column ds_type heading 'Type' format a4
column ds_id heading 'Data Source ID' format a30
column oprid heading 'Owner' format a10
column processes heading 'Prcs' format 990
column reprots heading 'Reps' format 9,990
column secs heading 'Total|Seconds' format 999,990
column median heading 'Median|Seconds' format 999,990
column variance heading 'Variance' format 9990.0
SELECT d.report_defn_id, d.ds_type, d.ds_id, d.oprid
, SUM(y.processes) processes
, SUM(y.reports) reports
, SUM(y.secs) secs
, median(y.secs) median
, variance(y.secs) variance
FROM (
SELECT x.prcsinstance
, x.filename
, COUNT(DISTINCT x.prcsinstance) processes
, COUNT(*) reports
, SUM(x.secs) secs
FROM (
SELECT r.prcsinstance
, f.filename
, 86400*(r.enddttm-r.begindttm)*ratio_to_report(1) over (partition by r.prcsinstance) secs
FROM sysadm.psprcsrqst r
, sysadm.ps_cdm_file_list f
WHERE r.prcsname = 'PSXPQRYRPT'
AND r.prcsinstance = f.prcsinstance
AND NOT f.cdm_file_type IN('AET','TRC','LOG')
AND r.begindttm >= TRUNC(SYSDATE)
) x
GROUP BY x.prcsinstance, x.filename
) y
, sysadm.psxprptdefn d
WHERE d.report_defn_id = SUBSTR(y.filename,1,instr(y.filename,'.')-1)
GROUP BY d.report_defn_id, d.ds_type, d.ds_id, d.oprid
ORDER BY secs DESC
/

I can now see which report process has been executed run how many times, how many copies of the report have been produced, and where the processing time is being spent, and so which one I should look at first.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
Total   Median
Report ID Type Data Source ID Owner Prcs Reps Seconds Seconds Variance
------------ ---- ------------------------------ ---------- ---- ------ -------- -------- --------
XXX_WK_LATE QRY XXX_WKLY_LATENESS_RPT 20 20 2,973 148 3702.9
XXX_HRAM_CON QRY XXX_HRPD_CNT_AMD_RPT_QRY 92 92 2,677 27 108.4
XXX_CKOUT_RP QRY XXX_CHECKOUT_REPORT 47 47 2,043 41 347.7
XXX_BNK_RPT QRY XXX_BNK_DTLS_SCH_QRY 1 75 141 141 0.0

Conclusion

If you have a SQL performance problem with an XML report because the source PS/Query performs poorly, the batch timings will lead you to believe that you have a PeopleCode problem and not a SQL problem. 

Beware, this may not be the case.

Most of the execution time for XML Reports sourced from PS/Queries is almost certain to be SQL execution time.  The above query will tell you which queries are consuming how much time, and so inform your tuning effort.

Performance Metrics and XML Reporting in PeopleSoft

I am working with a PeopleSoft system that makes extensive use of XML Publisher reporting.  Generally these reports are based on SQL queries that are defined in the PeopleSoft PS/Query utility. 

These queries are run by an Application Engine program PSXPQRYRPT that runs the SQL query and produces the report in a single process.  This is a generic application engine program that runs any XML report.  Line 45 (in PeopleTools 8.49) of step MAIN.ExecRPT executes the report with the ProcessReport PeopleCode command.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
&oRptDefn.ProcessReport(&TemplateId, &Languaged, &AsOfDate, &oRptDefn.GetOutDestFormatString(&OutDestFormat));

Batch Timings
Analysis of the Application Engine batch timings indicate that nearly all the time in this Application Engine program is spent in PeopleCode, and that this not SQL execution time.  This is misleading.  The ProcessReport command is PeopleCode, but behind the scenes it also issues the SQL in the report data source.  Not all the time is SQL, but the Application Engine Batch Timings does not count any of this as SQL because it is not in the PeopleCode SQL Class. 

Let’s look at an example Batch Timings report (I have edited it down, removing zero and insignificant timings).

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
PeopleSoft Application Engine Timings
(All timings in seconds)

C o m p i l e E x e c u t e F e t c h Total
SQL Statement Count Time Count Time Count Time Time
------------------------------ ------- -------- ------- -------- ------- -------- --------
PeopleCode
Record.SelectByKey PSDBFIELD 8 0.1 8 0.0 0.1
SELECT PSPRCSRQST 2 0.1 2 0.0 0.1
SELECT PSXPTMPLDEFN 3 0.1 3 0.0 0.1
--------
0.3
AE Program: PSXPQRYRPT
MAIN.ExecRpt.H 1 0.0 1 0.0 1 0.0 0.0
--------
0.0
------------------------------------------------------------------------------------------
Call Non-SQL SQL Total
PeopleCode Count Time Time Time
------------------------------ ------- -------- -------- --------
AE Program: PSXPQRYRPT
MAIN.ExecRpt 1 643.2 0.3 643.5
-------- -------- --------
643.2 0.3 643.5
------------------------------------------------------------------------------------------
E x e c u t e
PEOPLECODE Builtin/Method Count Time
------------------------------------------------------- ------- --------
Boolean(Type 5) BuiltIns 90 0.1
DateTime(Type 11) BuiltIns 1 0.1
SQL(Type 524290) Methods 19 0.1
SQL(Type 524290) BuiltIns 9 0.1
Record(Type 524291) Methods 1104 0.1
Session(Type 524303) Methods 207 633.2
JavaObject(Type 524315) BuiltIns 6 2.2
PostReport(Type 524324) Methods 2 0.7
------------------------------------------------------------------------------------------
Total run time : 644.0
Total time in application SQL : 0.3 Percent time in application SQL : 0.0%
Total time in PeopleCode : 643.2 Percent time in PeopleCode : 99.9%
------------------------------------------------------------------------------------------
  • The total execution time of is 644 seconds.
  • 643.2s are reported as being in PeopleCode, and as coming from the MAIN.ExecRpt step.
  • Only 0.3s of that is SQL time, and that comes from PeopleCode functions in the SQL Class  (CreateRecord , CreateSQL etc.)
  • The ProcessReport method does not appear in the PeopleCode SQL analysis section of the report, because it is not classed as SQL.

Identifying the Report ID and Report Source
So, the next question is how can I find out which report this process is running.  Different instances of this report may be running different queries.

You can get the list files generated by a process from the Content Management PeopleTools table PS_CDM_FILE_LIST.  Application Engine processes usually produce various log files (with extensions .aet, .trc and .log), the name of the other file is the same as the name of the report followed by an extension that will vary depending on format.

You can look at the report definition on-line under Reporting Tools -> XML Publisher -> Report Definition, and that will show you the Data Source ID

The Data Source is defined in a component accessible from the previous entry in the same menu.

Analysing Execution Times
I have written this query to aggregate execution time for PSXPQRYRPT by output file name. 

  • PSPRCSRQST is the process request table, from which I get the run time of the process.
  • PS_CDM_FILE_LIST lists the files generated by the process which would be posted to the Report Repository. If I exclude the usual trace files, I am left with the report ID.
  • PSXPRPTDEFN is the report definition record, from which I can get the data source ID
  • PSXPDATSRC specifies the data source.  A data source type (DS_TYPE) of QRY indicates a PeopleSoft PS/Query.  If it is a private query, the OPRID will have a value.
#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
column report_defn_id heading 'Report ID'      format a12 
column ds_type heading 'Type' format a4
column ds_id heading 'Data Source ID' format a30
column oprid heading 'Owner' format a10
column processes heading 'Prcs' format 990
column reprots heading 'Reps' format 9,990
column secs heading 'Total|Seconds' format 999,990
column median heading 'Median|Seconds' format 999,990
column variance heading 'Variance' format 9990.0
SELECT d.report_defn_id, d.ds_type, d.ds_id, d.oprid
, SUM(y.processes) processes
, SUM(y.reports) reports
, SUM(y.secs) secs
, median(y.secs) median
, variance(y.secs) variance
FROM (
SELECT x.prcsinstance
, x.filename
, COUNT(DISTINCT x.prcsinstance) processes
, COUNT(*) reports
, SUM(x.secs) secs
FROM (
SELECT r.prcsinstance
, f.filename
, 86400*(r.enddttm-r.begindttm)*ratio_to_report(1) over (partition by r.prcsinstance) secs
FROM sysadm.psprcsrqst r
, sysadm.ps_cdm_file_list f
WHERE r.prcsname = 'PSXPQRYRPT'
AND r.prcsinstance = f.prcsinstance
AND NOT f.cdm_file_type IN('AET','TRC','LOG')
AND r.begindttm >= TRUNC(SYSDATE)
) x
GROUP BY x.prcsinstance, x.filename
) y
, sysadm.psxprptdefn d
WHERE d.report_defn_id = SUBSTR(y.filename,1,instr(y.filename,'.')-1)
GROUP BY d.report_defn_id, d.ds_type, d.ds_id, d.oprid
ORDER BY secs DESC
/

I can now see which report process has been executed run how many times, how many copies of the report have been produced, and where the processing time is being spent, and so which one I should look at first.

#eeeeee; border: 0px solid rgb(0, 0, 0); overflow: auto; padding-left: 4px; padding-right: 4px; width: 100%;">
Total   Median
Report ID Type Data Source ID Owner Prcs Reps Seconds Seconds Variance
------------ ---- ------------------------------ ---------- ---- ------ -------- -------- --------
XXX_WK_LATE QRY XXX_WKLY_LATENESS_RPT 20 20 2,973 148 3702.9
XXX_HRAM_CON QRY XXX_HRPD_CNT_AMD_RPT_QRY 92 92 2,677 27 108.4
XXX_CKOUT_RP QRY XXX_CHECKOUT_REPORT 47 47 2,043 41 347.7
XXX_BNK_RPT QRY XXX_BNK_DTLS_SCH_QRY 1 75 141 141 0.0

Conclusion

If you have a SQL performance problem with an XML report because the source PS/Query performs poorly, the batch timings will lead you to believe that you have a PeopleCode problem and not a SQL problem. 

Beware, this may not be the case.

Most of the execution time for XML Reports sourced from PS/Queries is almost certain to be SQL execution time.  The above query will tell you which queries are consuming how much time, and so inform your tuning effort.

New Additions To My Blogroll

Thought it was time to update my blogroll with a couple of new additions. Both Charles Hooper and Randolf Geist have provided excellent advice and information on the OTN forums for quite some time and both have excellent blogs that are well worth checking out. They’re also both co-authors in a new Oracle book I’m keen to get [...]

Improving performance with pipelined table functions

Using pipelined functions as a performance tuning tool. January 2010

Build Less (DB Design Version)

37Signals, the company behind few highly successful web-based applications, has published a book about their business building experience. Knowing that the company is both successful and has an unconventional business development philosophy, I decided to browse a bit.

One of the essays that caught my attention is “Build Less”. The idea is that instead of having more features than the competition (or more employees or whatever), you should strive to have less. To avoid any sense of irony – the essay is very short :)

One of the suggestions I would add to the essay is:
“Keep less data”

Keeping a lot of data is a pain. Indexing, partitioning, tuning, backup and recovery – everything is more painful when you have terabytes instead of gigabytes. And when it comes to cleaning data out, it always causes endless debates on how long to keep the data (3 month? 7 years?) and different life-cycle options (move to “old data” system? archiving? how to purge? What is the process?).

What’s more, a lot of the time customers would really prefer we won’t keep the data. Maybe its privacy concerns (when we keep a lot of search history) or difficulty in generating meaningful reports or just plain confusion caused by all those old projects floating around.

Google taught us that all the data should be stored forever. But perhaps your business can win by keeping less data.

RMOUG Presentations

Like many other DBAs, I’ll be attending RMOUG training days conference on Feb 17-18 in Denver. I’ll give two presentations in the conference. On the same day, just thinking about it makes me exhausted.

The first presentation is “Everything DBAs need to know about TCP/IP Networks”. Here’s the paper and the slides. I’ll also present this at NoCOUG’s winter conference in Pleasanton, CA. Maybe you prefer to catch me there.

The second presentation is “Analyzing Database Performance using Time Series Techniques”. Here’s the paper and the slides.

I still have time to improve the presentations and papers – so comments are very welcome :)

The Core Performance Fundamentals Of Oracle Data Warehousing – Table Compression

[back to Introduction] Editor’s note: This blog post does not cover Exadata Hybrid Columnar Compression. The first thing that comes to most people’s mind when database table compression is mentioned is the savings it yields in terms of disk space. While reducing the footprint of data on disk is relevant, I would argue it is the lesser of the benefits for data warehouses. Disk capacity is very cheap and generally plentiful, however, disk bandwidth (scan speed) is proportional to the number of spindles, no mater what the disk capacity and thus is more expensive. Table compression reduces the footprint on the disk drives that a given data set occupies so the amount of physical data that must be read off the disk platters is reduced when compared to the uncompressed version. For example, if 4000 GB of raw data can compress to 1000 GB, it can be read off the same disk drives 4X as fast because it is reading and transferring 1/4 of the data off the spindles (relative to the uncompressed size). Likewise, table compression allows for the database buffer cache to contain more data without having to increase the memory allocation because more rows can be stored [...]

BAARF - Battle against any raid 5


BAARF - Battle Against any Raid Five
In the chart below, 1 represents good and 5 represents bad.
Raid 5 runs into read performance issues in a mixed workload because every write takes two reads plus a write. These two reads can conflict with other readers. Also because Raid 5 is typically configured with small stripe sizes, say 32K or 64K, to mitigate write penalties, a multiblock read can easily span several disks, increasing the chances of concurrency read conflicts.


IO operations per second


Vendors say that read cache will take care of IO Operations per sec, but if you have a sustained throughput it will only sustain for so long
Here is an example from James Morle http://scaleability.com :


Seagate Barracuda 4LP

Seagate Cheetah 73LP

Capacity

2.16GB

73.4GB

Rotation Speed

7200rpm

10000rpm

Rotational Delay(avg)

4.1ms

3ms

Time to read 32Kb

6ms

3ms

Seek Time (avg)

9.4ms

4.9

Total time for Single I/O

19.5ms

10.9ms

I/O per second (conservative)

51

92

IOPs/sec per 100GB

2550

126


The Oracle Wait Interface Is Useless (sometimes) – part 3a

OK, here it is, the ‘first part of the last part’, though the topics discussed in these articles will be discussed more over time in my blog and in Tanel’s. I’ve split it into two subparts, because it was just getting insanely long as single posting.

Before I get going on this one, let’s just clear up a misunderstanding from the first part of this series. The first part uses a specific example which, for clarity reasons, will continue to be used for the remainder of the series. The example shown is some kind of query-related issue in this case, but the approach shown here is a more general one that will locate the root cause in many, many other cases than just the case of a bad query. This SQL example is still a good one, though, because there are no waits, lots of CPU, and no system calls. But the underlying problem might not be caused by poor SQL, and diving straight into SQL tuning could waste you 2 days fixing SQL when that is not the root cause. Just because we happen to be running SQL, it does not mean that it is a SQL problem. What we are trying to achieve here is a more robust approach to root cause diagnosis of Oracle performance problems, one where we don’t guess, we don’t assume, we don’t stick with things we’ve seen before: We quantitatively find the real problem. That might be the smart way, it might be the lazy way – those might be the same thing! But most crucially, it’s the fast way to fixing the real problem.

So, back to the main topic. In the first part of this blog, I walked through the example problem and showed that we have a problem that has the following attributes:

  • The user is reporting a server response time problem
  • Zero Oracle waits
  • 100% CPU
  • Zero system calls
  • DB Time is showing all the CPU and elapsed time is spent executing SQL (in this example…)

Note to Kevin: Though I stated the query ‘never’ returns in part one, I was, of cause, using my usual exaggerative vernacular. I meant, of course, that it was taking a very long time to return, but still returning correct results. Sorry about that – it will ‘never’ happen again.

At this point, we might jump into a prolonged SQL tuning exercise. But let’s not for now, because that is still guesswork in my opinion. Let’s make the following reasonable assumption: The whole system is running slowly, using huge quantities of CPU, and this is a mission-critical system where you cannot simply start killing queries and re-executing them in your SQL*Plus session. That’s a reasonable assumption – I have worked on many, many, production systems where this is the case. So let’s carry on diagnosing without actually interfering with the production system in any way.

This is where we might deviate from time-based diagnostics. Not because we don’t think that time is the single most valuable metric on which to determine what ‘fast’ and ‘slow’ actually mean in quantitative terms, but because there is a bit of a shortcut available to us here that might just nail the cause of the problem. That shortcut is session-based statistics, and they have nothing to do with time, being simple counters.

This is where it makes no sense at all to re-invent the wheel, and was why Tanel is on the hook for this part: Tanel has already done some significant work on session-based statitics analysis and has written a great tool, snapper, for just this purpose. If you have not already read part two of this series, head over to Tanel’s blog now to catch up before proceeding!

OK, so we’ve abandoned the wait interface, and let’s assume that the sessions statistics did not yield anything particularly obvious. Or, maybe you just want to dig a little deeper, or possibly take yet another short cut to the truth? Now it’s time to look beyond Oracle and into the Operating System. The Operating System has a completely different viewpoint of the operation of Oracle, notably through the use of execution stack profiling. Don’t run away just yet, it’s not as difficult as you might imagine.

Various methods exist to probe the execution stack across the various Operating Systems. All of these should be used with caution, as they have varying degrees of intrusiveness on the operation of Oracle. The more intrusive methods, such as gdb, and the (possibly) less tested methods (oradebug short_stack) have very real potential to crash the process that you are attaching to, or worse. Don’t say I didn’t warn you! However, other methods for profiling the stack are less intrusive, and are already very robust: I’m talking here about Dtrace and variants.

I have held off from getting too excited by Dtrace in the past, even though the technology always looked amazing. The reason for holding back was that it was a single-OS tool, and without anything comparable on the Linux platform I didn’t see it being useful to >75% of my customers. That’s changing now, though, with the production release of Systemtap (partial, at least) in RHEL 5.4 and there is even a similar tool for AIX named Probevue. So now I’m excited about this technology!

Before digging into Dtrace and friends, I think it’s important that we take a little diversionary refresh into what a stack actually is, and why it is important. For those with a good coding background, look away now, and rejoin the group for part 3b. Just to keep your interest level, though, would it be interesting to get a second-by-second view of where the time goes for a process, down to the following levels of detail?

  • Which line of the plan did I spend most time in?
  • How much time did I spend waiting for page faults?
  • Which wait events did I spend most time in?

That’s going to be part 3b… so please check back periodically!

Back to the “Stack Primer for DBAs”: Consider the following elementary piece of C code:

#include 

int i;

void my_function();

void
main() {

	scanf("%c",&i);
	my_function();

}

Don’t worry about the details of the language if you are unfamiliar with C. The only things to observe here are that we have one “function”, named ‘main()’, which is making a call to another function ‘my_function()’. The main() function is a special one – all standalone C programs have a main() function, which is the initial point of execution. A C program is just a set of compiled machine code instructions at execute time, just like any other computer program. These instructions occupy a chunk of memory, where each memory location is either a machine code instruction or a piece of data. When the code gets down to the call to my_function() in main(), the execution must jump (or branch) to the address that my_function()’s machine code instructions reside at. Before this can happen, an amount of context must be stored, namely:

  • The ‘return’ address, so that execution can resume in the main() function after my_function() is complete
  • Any ‘data’ passed to my_function() (the function arguments)

In C, this information is pushed onto the stack and then the execution branches to the entry point of my_function(). As a sidenote, the sharing of data and execution context in this way is a common way to exploit security holes by overwriting execution context with oversized arguments from the data portion. We’re not interested in all that stuff for this – we are only interested in observing the stack contents. If we took a stack dump while the program were in the my_function() function, it would look like this:

my_function()
main()

Pretty simple, huh? If there were arguments for main() and my_function(), they would also be shown in some way. Most stack trace facilities can only show primitive datatypes (such as integers) and all other more complex arguments (such as structs) are shown as pointers (addresses) to the actual data without decoding the actual content, as follows:

my_function(42, 0x12345678)
main(0x45678900)

Some tools will print a stack trace (known as a backtrace, because it is unwound backwards from the current position in the stack) with some known pointers decoded into human readable form, such as simple struct datatypes. Anyway, let’s have a look at something a bit more complicated – an Oracle backtrace taken from an errorstack trace file:

----- Call Stack Trace -----
calling              call     entry                argument values in hex
location             type     point                (? means dubious value)
-------------------- -------- -------------------- ----------------------------
skdstdst()+36        call     kgdsdst()            000000000 ? 000000000 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   7FFFB345A778 ? 000000000 ?
ksedst1()+98         call     skdstdst()           000000000 ? 000000000 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedst()+34          call     ksedst1()            000000001 ? 000000001 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   000000000 ? 000000000 ?
dbkedDefDump()+2736  call     ksedst()             000000001 ? 000000001 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksedmp()+36          call     dbkedDefDump()       000000001 ? 000000000 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksdxfdmp()+1837      call     ksedmp()             000000001 ? 000000000 ?
                                                   7FFFB3456278 ? 000000001 ?
                                                   000000000 ? 000000000 ?
ksdxcb()+1782        call     ksdxfdmp()           7FFFB345BAE0 ? 000000011 ?
                                                   000000003 ? 7FFFB345BA40 ?
                                                   7FFFB345B9A0 ? 000000000 ?
sspuser()+112        call     ksdxcb()             000000001 ? 000000011 ?
                                                   000000001 ? 000000001 ?
                                                   7FFFB345B9A0 ? 000000000 ?
__restore_rt()       call     sspuser()            000000001 ? 000000011 ?
                                                   000000001 ? 000000001 ?
                                                   7FFFB345B9A0 ? 000000000 ?
semtimedop()+10      signal   __restore_rt()       000018000 ? 7FFFB345C730 ?
                                                   000000001 ?
                                                   FFFFFFFFFFFFFFFF ?
                                                   FFFFFFFFFFD23940 ?
                                                   000000000 ?
sskgpwwait()+259     call     semtimedop()         000018000 ? 7FFFB345C730 ?
                                                   000000001 ? 7FFFB345C6D8 ?
                                                   FFFFFFFFFFD23940 ?
                                                   000000000 ?
skgpwwait()+151      call     sskgpwwait()         7FFFB345CB94 ? 00A9A15C0 ?
                                                   07848C4D8 ? 0002DC6C0 ?
                                                   7FFF00000000 ? 000000000 ?
ksliwat()+1816       call     skgpwwait()          7FFFB345CB94 ? 00A9A15C0 ?
                                                   000000000 ? 0002DC6C0 ?
                                                   000000000 ? 000000000 ?
kslwaitctx()+157     call     ksliwat()            078666348 ? 078666348 ?
                                                   005F5DFA7 ? 000000000 ?
                                                   100000000 ? 000000000 ?
kslwait()+136        call     kslwaitctx()         7FFFB345CE30 ? 000000000 ?
                                                   005F5DFA7 ? 000000000 ?
                                                   100000000 ? 000000000 ?
psdwat()+107         call     kslwait()            005F5DFA7 ? 000000167 ?
                                                   000000000 ? 005F5DFA7 ?
                                                   000000000 ? 000000000 ?
pevm_icd_call_commo  call     psdwat()             005F5DFA7 ? 000000167 ?
n()+421                                            000000000 ? 005F5DFA7 ?
                                                   000000000 ? 000000000 ?
pfrinstr_ICAL()+164  call     pevm_icd_call_commo  7FFFB345E0F0 ? 000000000 ?
                              n()                  000000001 ? 000000004 ?
                                                   7FAF00000001 ? 000000000 ?
pfrrun_no_tool()+63  call     pfrinstr_ICAL()      7FAF7E8A3500 ? 06A03B9AA ?
                                                   7FAF7E8A3570 ? 000000004 ?
                                                   7FAF00000001 ? 000000000 ?
pfrrun()+1025        call     pfrrun_no_tool()     7FAF7E8A3500 ? 06A03B9AA ?
                                                   7FAF7E8A3570 ? 000000004 ?
                                                   7FAF00000001 ? 000000000 ?
plsql_run()+769      call     pfrrun()             7FAF7E8A3500 ? 000000000 ?
                                                   7FAF7E8A3570 ? 7FFFB345E0F0 ?
                                                   7FAF00000001 ? 070C18646 ?
peicnt()+296         call     plsql_run()          7FAF7E8A3500 ? 000000001 ?
                                                   000000000 ? 7FFFB345E0F0 ?
                                                   7FAF00000001 ? 000000000 ?
kkxexe()+520         call     peicnt()             7FFFB345E0F0 ? 7FAF7E8A3500 ?
                                                   7FAF7E8C4028 ? 7FFFB345E0F0 ?
                                                   7FAF7E8C1FD0 ? 000000000 ?
opiexe()+14796       call     kkxexe()             7FAF7E8A5128 ? 7FAF7E8A3500 ?
                                                   000000000 ? 7FFFB345E0F0 ?
                                                   7FAF7E8C1FD0 ? 000000000 ?
kpoal8()+2283        call     opiexe()             000000049 ? 000000003 ?
                                                   7FFFB345F678 ? 7FFFB345E0F0 ?
                                                   7FAF7E8C1FD0 ? 000000000 ?
opiodr()+1149        call     kpoal8()             00000005E ? 00000001C ?
                                                   7FFFB3462750 ? 7FFFB345E0F0 ?
                                                   7FAF7E8C1FD0 ? 5E00000001 ?
ttcpip()+1251        call     opiodr()             00000005E ? 00000001C ?
                                                   7FFFB3462750 ? 000000000 ?
                                                   008C5E7F0 ? 5E00000001 ?
opitsk()+1628        call     ttcpip()             00A9B0890 ? 0086BA768 ?
                                                   7FFFB3462750 ? 000000000 ?
                                                   7FFFB34621B0 ? 7FFFB3462954 ?
opiino()+953         call     opitsk()             00A9B0890 ? 000000000 ?
                                                   7FFFB3462750 ? 000000000 ?
                                                   7FFFB34621B0 ? 7FFFB3462954 ?
opiodr()+1149        call     opiino()             00000003C ? 000000004 ?
                                                   7FFFB3463E48 ? 000000000 ?
                                                   7FFFB34621B0 ? 7FFFB3462954 ?
opidrv()+565         call     opiodr()             00000003C ? 000000004 ?
                                                   7FFFB3463E48 ? 000000000 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
sou2o()+98           call     opidrv()             00000003C ? 000000004 ?
                                                   7FFFB3463E48 ? 000000000 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
opimai_real()+128    call     sou2o()              7FFFB3463E20 ? 00000003C ?
                                                   000000004 ? 7FFFB3463E48 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
ssthrdmain()+209     call     opimai_real()        000000002 ? 7FFFB3464010 ?
                                                   000000004 ? 7FFFB3463E48 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
main()+196           call     ssthrdmain()         000000002 ? 7FFFB3464010 ?
                                                   000000001 ? 000000000 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
__libc_start_main()  call     main()               000000002 ? 7FFFB34641B8 ?
+253                                               000000001 ? 000000000 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?
_start()+36          call     __libc_start_main()  0009D3D74 ? 000000002 ?
                                                   7FFFB34641A8 ? 000000000 ?
                                                   008C5E2A0 ? 7FFFB3462954 ?

I know, it’s a bit much to deal with at first glance. But take it a piece at a time, reading in conjunction with Metalink note 175982.1, and it all begins to make sense. Actually, I’m guessing some of these, because the Metalink note doesn’t give all the answers, but it seems to make sense.

Let’s start at the bottom, which is the beginning of the execution of the C program named ‘oracle’. The first two lines are the GLIBC execution wrappers that eventually call Oracle’s main() statement which starts at line 110 of the listing. The next several lines above are the initialisation of Oracle and the (O)racle (P)rogramatic (I)nterface, which is the internal API used in Oracle. TTCPIP is the Two-Task-Communication interface with my SQL*Plus process, but the first interesting line is at line 74, kkxexe(), which is the (k)ernel (k)ompile e(x)ecute) (exe)cute statement call, which is the start of the actual SQL processing. In this case it is actually a simple PL/SQL call to DBMS_LOCK.SLEEP(100000), and the entry to the Oracle Wait interface can be seen at line 50 with the kslwait() call, eventually ending up with a semtimedop() call, which is the way the sleep call has been implemented. Incidentally, the preceding sskgpwwait() call at line 37 is Oracle’s platform-specific code which, in this case, is Red Hat Linux.  At line 11 we see the call to ksedst(), which probably stands for (k)ernel (s)ervices (e)rrorstack (d)ump (s)tack (t)race, which is the part of code actually producing the stack trace we are reading.

So what does all this actually mean in terms of seeing what our Oracle session is up to? Well, we can see what is called, by whom, and how long each call took. We don’t need to know all the function names and what they are responsible for, but we can come up with a selection of interesting ones that we would like to pay attention to. Thinking in terms of SQL execution we can see, for example, a nested loop rowsource function calling out to an index fetch rowsource function, which in turn calls out to a table fetch function.

Even just a few consecutive stack traces of a running query will make it very plain which rowsource Oracle is spending the most time within, simply because of the laws of probability. If you look once a second for a minute, and every stack you see has the ‘index fetch’ rowsource as the current function, then you probably spent most of the last minute doing that kind of processing. But we can do better than that… we can start to look at quantifying where the time is spent, and also to supplement that information with other interesting aspects of the operating system operation, aspects which are somewhat  concealed from Oracle and yet can detrimentally affect its performance.

Join me back for part 3b for the conclusion of this series of articles!