Oakies Blog Aggregator

DBID Is Not Definitive When Used As An Identifier

Our Audit Trail toolkit PFCLATK has some brief documentation on the page that's linked here but in summary it is a comprehensive toolkit that allows quick and easy deployment of an audit trail into a customer's database. We are currently....[Read More]

Posted by Pete On 12/03/19 At 09:20 PM

sys_op_lbid

I’ve made use of the function a few times in the past, for example in this posting on the dangers of using reverse key indexes, but every time I’ve mentioned it I’ve only been interested in the “leaf blocks per key” option. There are actually four different variations of the function, relevant to different types of index and controlled by setting a flag parameter to one of 4 different values.

The call to sys_op_lbid() takes 3 parameters: an index (or index [sub]partition) object id, a flag value, and a table “rowid”, where the flag value can be one of L, R, O, or G. The variations of the call are as follows:

  • L – the function will return the row directory address (i.e. something that looks like a rowid) of the first index entry in the leaf block that holds the index entry for the referenced table rowid. The effect of this is that the number of distinct values returned by calling the function for every row in the table is equal to the number of index leaf blocks which currently hold an active entry (see the sketch after this list).
  • R – Relevant only to bitmap indexes; the function will return the row directory address of the bitmap index entry for the referenced table rowid. The effect of this is that the number of distinct values returned by calling the function for every row in the table is equal to the number of index entries in the bitmap index.
  • O – Relevant only to the primary key index of an index organized table with an overflow. The function is used with a non-key column instead of a rowid and returns a rowid that corresponds to the row directory entry in the overflow segment. An interesting detail of the overflow entries is that there is an “nrid” (next rowid) pointer in the primary key index entry that does not get deleted when all the columns in the related overflow entry are set null – so you can delete all the data from the overflow (set every overflow column in every row to null) and the primary key clustering factor would not change.
  • G – Relevant only to secondary indexes on an index organized table. Like the L and R options this function takes a rowid (which is a special case for IOTs) as one of its inputs and uses the block guess from the secondary index to construct a row directory entry for the first entry in the primary key leaf block that corresponds to that block guess. This serves two purposes – it allows Oracle to calculate the clustering factor of the secondary index (as you walk the secondary index in order how much do you jump around the leaf blocks of the primary key), and it allows Oracle to produce the pct_direct_access figure for the secondary index by joining the secondary index to the primary key index on primary key, and comparing the ‘G’ result for the secondary with the ‘L’ result from the primary, which gives a count of the number of times the guess is correct.
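For example, a minimal sketch of the ‘L’ option (the object id 12345 and the index hint are hypothetical – substitute the object_id of your own index, e.g. t1_i1 from the script below): counting the distinct results over every row of the table gives the number of leaf blocks that currently hold an active entry.

select
        /*+ index(t, t1_i1) */
        count(distinct sys_op_lbid(12345, 'L', t.rowid))  as leaf_blocks
from
        t1 t
where
        v3 is not null
;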

These observations can be confirmed by gathering stats on different structures with trace enabled, and doing a couple of block dumps. For reference the following is just a simple script to create an index organized table with overflow and secondary index:


rem
rem     Script:         sys_op_lbid_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Dec 2018
rem

create table t1(
        id      constraint t1_pk primary key,
        v1      ,
        v2      ,
        v3      ,
        padding 
)
organization index
pctthreshold 2
overflow
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum,
        lpad(rownum,30),
        lpad(rownum,30),
        lpad(rownum,40),
        rpad('x',100,'x')
from
        generator       v1,
        generator       v2
where
        rownum <= 1e4 -- > comment to avoid WordPress format issue
;

create index t1_i1 on t1(v3);

alter session set sql_trace = true;

begin
        dbms_stats.gather_table_stats(
                ownname     => null,
                tabname     => 'T1',
                method_opt  => 'for all columns size 1'
        );
end;
/

alter session set sql_trace = false;

select
        object_id, data_object_id, object_name
from
        user_objects
order by
        object_id
;

The significance of the query for object_id and data_object_id shows up in the trace file (and subsequent dumps) when Oracle uses one or other of the values in its SQL and rowid construction.

Here are the interesting SQL statements generated as the stats are gathered – but cosmetically altered to be reader-friendly. In order they are:

  1. Stats for primary key of IOT: using the ‘L’ option for counting leaf blocks and the ‘O’ option for the clustering factor into overflow segment.
  2. Stats for secondary index of IOT: using the ‘L’ option for counting leaf blocks and the ‘G’ option for the clustering factor into the primary key index
  3. Calculate pct_direct_access: the ‘L’ option gives the actual leaf block in the primary key index, the ‘G’ option gives the leaf block guessed by the secondary index

select 
        /*+ index(t,t1_pk) */ 
        count(*) as nrw,
        count(distinct sys_op_lbid(351334,'L',t.rowid)) as nlb,
        null as ndk,
        sys_op_countchg(sys_op_lbid(351334,'O',V1),1) as clf
from
        t1 t 
where 
        id is not null
;


select 
        /*+ index(t,t1_i1) */ 
        count(*) as nrw,
        count(distinct sys_op_lbid(351335,'L',t.rowid)) as nlb,
        null as ndk,
        sys_op_countchg(sys_op_lbid(351335,'G',t.rowid),1) as clf
from
        t1 t 
where 
        v3 is not null
;


select
        case when count(*) = 0
                then 100
                else round(
                        count(
                                case when substr(gdba,7,9)=substr(lbid,7,9)
                                        then 1
                                        else null
                                end
                        )/count(*)*100
                )
        end
from    (
        select
                /*+
                        ordered
                        use_hash(i.t1 t2)
                        index_ffs(t2,t1_pk)
                */
                sys_op_lbid(351334,'L',t2.rowid) lbid,
                gdba
        from (
                select
                        /*+ index_ffs(t1,t1_i1) */
                        sys_op_lbid(351335,'G',t1.rowid) gdba,
                        t1.ID
                from
                        t1 t1
                ) i,
        t1 t2
        where
                i.id = t2.id
        )
;

The strange substr(,7,9) that appears in the join between the primary key index and the secondary index is needed because the ‘G’ option uses the object_id of the table to turn an absolute block guess into a rowid while the ‘L’ option is using the data_object_id of the primary key index to turn its block address into a rowid. Characters 7 to 15 of an extended rowid hold the relative file and block number, skipping the 6-character data object number, so the comparison matches on file and block while ignoring the object id. (This means there may be variants of this SQL for IOTs using partitioning.)

 

Hardening and Securing The Oracle Database Training in London

I posted last week that I will teach my two-day class "How to Perform a Security Audit of an Oracle Database" with Oracle University in London on the 29th and 30th April 2019. We have now added....[Read More]

Posted by Pete On 11/03/19 At 11:52 AM

Oracle Magazine

Generally my blog is just snippets of tech content that take my interest as I encounter them (most commonly when looking at AskTOM). If I think they’ll be useful, I’ll just plonk them out right there and then. If you prefer your content in longer (and more structured) form, I also publish longer-form articles in Oracle Magazine every couple of months. Below is a consolidated list of my articles. I’ll try to keep this list updated as I add new ones.

Old Dog, New Tricks, Part 2
Here’s a new SQL syntax for hierarchy processing.

Improved Defaults in Oracle Database 12c
The new DEFAULT clause provides better values for getting started and better performance.

Excellent Extensions
New features in Oracle Database 12c Release 2 make external tables even more flexible.

Tighter PL/SQL and SQL Integration
PL/SQL functions perform better in SQL in Oracle Database 12c.

All Aboard the SQL*Loader Express
A new processing mode takes the hassle out of dealing with flat file loading.

Long and Overflowing
LISTAGG in Oracle Database 12c Release 2 solves the problem of excessively long lists.

Assume the Best; Plan for the Worst
Here’s a technique for delivering better performance through optimistic code programming.

A Fresh Look at Auditing Row Changes
Triggers can provide auditing information, but there’s a future in flashback.

Better Tools for Better Data
New functions in Oracle Database 12c Release 2 solve data validation challenges.

Unintended Side Effects
Ensure that the code you write does not create problems elsewhere in your applications.

Write in a Read-Only Database
Run reports and DML against your standby database with Oracle Active Data Guard.

Open for Exchange
FOR EXCHANGE in Oracle Database 12c Release 2 takes the detective work out of partition exchanges.

A Higher-Level Perspective on SQL Tuning
The commonly missed first steps of tuning a SQL statement

Are We All on the Same Page?
Pagination of data can make (or break) your database.

Old Dog, New Tricks
Take advantage of SQL extensions for hierarchy processing.

SQL prevents database corruption and injection, except in ridiculous movie hacker scenes.

SQL is the Structured Query Language used to define and manipulate data in most of the databases in the world, including the most critical ones (banks, hospitals, airlines, secret services…). That can give the impression that with SQL you can do whatever you want, bypassing all application controls, as if it were a backdoor to your database, wide open on the network.

Superman III “override all security” command

Programmers always laugh when seeing ridiculous hacking scenes in movies. In 2016 there was the “use SQL to corrupt their database” line in Jason Bourne (nothing to do with JSON or /bin/sh, by the way, it’s a movie) and recently in the Star Trek: Discovery series the hacking ‘audit’ was explained as “The probe used multiple SQL injections”. I’ve put the links at the end of this post.

I just want to explain that SQL is not the problem in this area, but can be the solution. SQL is not only a powerful language to manipulate data: it also has all the features needed to encapsulate and protect access to the database.

Actually, SQL is the implementation of Edgar F. Codd’s rule number 5, the comprehensive data sublanguage rule, which requires a language that supports data definition, data manipulation, integrity constraints, authorization, and use both interactively and within programs:

Pro SQL Server Relational Database Design and Implementation
By Louis Davidson, Jessica Moss

When one says that SQL is DDL (Data Definition Language) and DML (Data Manipulation Language) but forgets the DCL (Data Control Language), it suggests that we can do anything, like dropping or changing data, and ignores all the ‘authorization’ part. And DDL does not only CREATE and DROP tables: we can (and should) create views to implement the encapsulation of data access, like the ‘private’ and ‘public’ declarations we find in other languages. And stored procedures when we need a procedural language for more complex rules.

If we can “use SQL to corrupt their database”, that’s not because of SQL. That’s because of the developers who did not care about security and maybe didn’t know that:

  • schemas should be used to isolate the different layers — internal ones vs. the ones exposed to the users
  • views must be used to hide the internal private columns or rows and expose only the public ones. Views are not only for queries (SELECT): modifications (INSERT, UPDATE, DELETE) can also be made through views.
  • grant/revoke table access must be done according to the visibility specification. Never do a GRANT ALL PRIVILEGES TO PUBLIC thinking that the front-end will guard all access.
  • stored procedures should be used to expose the application services, rather than giving the front-end direct DML access on tables or views (a minimal sketch of these points follows this list)
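As a minimal sketch of these points (all schema, table and procedure names here are hypothetical), the internal table lives in a private schema, a view exposes only the public columns, and the front-end account is granted access only to the view and a stored procedure:

-- internal data lives in app_data, which the front-end cannot log into
create view app_api.students_public as
  select student_id, name, enrolment_date from app_data.students;

grant select on app_api.students_public to front_end_user;

-- the only way the front-end can modify the data
create or replace procedure app_api.enrol_student(p_name in varchar2) as
begin
  insert into app_data.students(student_id, name, enrolment_date)
  values (app_data.students_seq.nextval, p_name, sysdate);
end;
/
grant execute on app_api.enrol_student to front_end_user;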

Here is a clear definition, by Philipp Salvisberg, of the right design of a database application, where data is not exposed directly but through a secured application API defined in the database:

The Pink Database Paradigm (PinkDB) - Philipp Salvisberg's Blog

With this, it is impossible to “use SQL to corrupt their database” because the SQL statements are not written by the user, but coded by the developer who implemented this API, with all access rules enforced. But then, what about hackers trying to change the API behavior with Star Trek’s “multiple SQL injections”?

SQL Injection

The idea is illustrated in the following, where a user can pass some SQL through the application API to tweak a SELECT statement and change it into a DROP TABLE one:

xkcd: Exploits of a Mom

But this, again, is not a weakness of SQL but a wrong design and a misunderstanding of Codd’s Rule 5, which states that there are two ways to manipulate data: “interactive and by program”. If the user is able to do this SQL injection, that’s because the developer has used the wrong one. Probably out of laziness, he used the ad-hoc query language instead of the programmer’s one.

Interactive

Yes, SQL can be used by end-users, and this is why it is a 4th generation declarative language, looking like an English sentence describing the result we want. Here, users can do data manipulation without the need to learn a procedural language. The security, in this case, relies entirely upon the DDL (view definitions) and DCL (privileges granted) after the end-user authentication (login with his username). The user has not been granted the DROP TABLE privilege, and will never be able to drop a table.

But that does not hold when the application connects to the database with a generic account, as when one program runs to serve multiple users and transactions, because this account probably has more privileges, to cope with the multiple user profiles.

By program

As Rule 5 mentions, the SQL language can also be used by programs. There, it is not the user who writes the SQL statement. The SQL statement is coded in advance by the application developer. It is parsed and compiled, and then it is executed, many times, for different end-users, with different parameters. In this case, SQL is not a simple text command passed as a character string to the database engine. Here, SQL defines a cursor that is prepared from a SQL text containing some variables, and the cursor is executed, passing values to those variables. There is no way to ‘inject’ SQL here because the user can inject only values for the parameters: he cannot inject code.

Do you take user input and build from it a string to pass to Runtime.exec() in Java or os.system() in Python? This is exactly what you do with something like:

Statement stmt = con.createStatement();
ResultSet rs=stmt.executeQuery(
"select * from students where name='"+argv[1]+"'"
);

This is subject to SQL injection because you can pass a value that will change the WHERE clause predicate.

The right way to program database access is to prepare a statement (once), and then execute it:

PreparedStatement stmt = con.prepareStatement(
"select * from students where name = ?"
);
stmt.setString(1, argv[1]);
ResultSet rs = stmt.executeQuery();

Here there’s no way to pass anything other than a String value for the name column.

This is mandatory for security, but also for performance. Can you imagine compiling a different program for each user input?
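A quick way to see both effects (a hedged sketch; the SQL_TEXT filter depends on your actual statement) is to look in V$SQL after a few executions: the concatenated version produces one cursor per distinct input value, while the prepared version shows a single cursor with a growing execution count.

select sql_id, executions, sql_text
from   v$sql
where  sql_text like 'select * from students where name%'
order by executions desc;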

In summary, are you an end-user running ad-hoc queries, or a professional programmer building a secure and efficient application? SQL is the right language for both, but you must pick the right approach. Otherwise your code will be as ridiculous as these movie hacking scenes:

“Use SQL to corrupt their database” (Jason Bourne, 2016)


“The probe used multiple SQL injections” (Star Trek: Discovery, 2019)

Using SSMS with Power BI

I’m curious by nature and many have shown interest when I connect Power BI to the SQL Profiler to collect performance information, but if you can do that, what happens when you connect it to SQL Server Management Studio (SSMS)?

That can seem quite foreign, but if you can connect it to the SQL Profiler, you can connect it to SSMS. Why you can do this becomes clear when you begin to look underneath the covers of the PBIX file and the processes that run from your desktop.

The PBIX File

All Power BI files end with .pbix. As I and others have written about, you can make a copy of the file (because we DBAs always like to work from copies vs. the originals), then rename the file from .pbix to .zip. You can then unzip the file and see the base components that make up a Power BI report and its visuals:

[Screenshot: contents of the unzipped PBIX file]

You now have a breakdown of the DataModel, DataMashup, the Diagram Layouts, the state, the Metadata, Security, Settings and Version. It’s kind of cool to know you can dissect the file once unzipped and make changes to specific features this way, as folks in the community have documented.

The Desktop Processes and Ports

While the Power BI report is open, you can see what is running by using the netstat -b -n command from a command prompt run as administrator:

[Screenshot: netstat -b -n output showing pbidesktop.exe and msmdsrv.exe]

You’ll notice that pbidesktop.exe is using port 54125, with multiple child threads. You’ll also notice there is an executable sourced from the pbidesktop.exe process called msmdsrv.exe. That’s the executable for an Analysis Server: your data model is being run by a trimmed-down Analysis Server inside Power BI. If you have the Analysis Server port, you can connect to it with SSMS. In the example above, the port is 54125 and, as the desktop is running on your PC, “localhost” is the server name. Open up SSMS, set the server type to Analysis Server, and type in the following, using an Active Directory login to connect:

[Screenshot: SSMS connection dialog for the local Analysis Server on localhost:54125]

You will connect to the trimmed down Analysis Server behind the PBI Desktop report you have running on your screen and have limited interactive options. If you need proof, here’s the expanded list from SSMS:

[Screenshot: expanded object list in SSMS]

Notice that it matches my list of tables from the Power BI interface for the same report:

[Screenshot: table list in the Power BI Desktop interface]

If we expand SSMS interface for the Analysis Server data model further, we can match the columns, calculated columns and measures, as well:

[Screenshot: columns, calculated columns and measures in SSMS]

There are limitations. In a full Analysis Server data model, you would be able to execute SQL, MDX, XML and DAX queries. This appears to be disabled from the backend of the PBI data model, which means you can only run DAX and MDX, at least, from the UI of Power BI.

[Screenshot: SSMS query toolbar with the Execute button missing]

As you can see above, the Execute button is missing from the toolbar, no matter whether you open an XMLA, DAX or MDX window, and for a SQL query you can’t connect to a database engine. The database isn’t a full SQL Server, and doesn’t even appear to be a Windows Internal Database (WID).

What you can do from here is connect to the SQL Profiler from the SSMS UI and trace performance for the Power BI report (data loads, data refreshes, DAX, MDX, etc.) along with durations for distributed work to remote data sources.

Well, I hope my little run through Power BI with SSMS was interesting and have a great weekend.




Copyright © DBAKevlar [Using SSMS with Power BI], All Right Reserved. 2019.

Effective PeopleSoft Performance Monitoring

This advice note describes how to configure PeopleSoft systems on Oracle so that performance metrics are collected that are useful for performance monitoring and subsequent performance tuning.

Contents

  • Oracle RDBMS Instrumentation
    • On-Line Component/Page Information
    • Application Engine Step Information
    • Cobol/nVision Instrumentation
    • nVision Layout Instrumentation
    • 2nd Database Connection Instrumentation
  • PeopleTools Performance Metrics
    • Cobol and Application Engine Batch Timings
    • PeopleSoft Performance Monitor

Summary of Recommendations

  • Set EnableAEMonitoring=1 in all Application Server and process scheduler domains in order to enable PeopleSoft instrumentation on-line, in the integration broker, and in Application Engine programs.
  • Implement the PSFTAPI package and trigger described below to set module and action at the start of all processes.
  • Implement Fine-Grained Audit policy and handler to instrument nVision processes if required.
  • Reduce AWR snapshot frequency from the default of 1 hour to every 15 minutes.
  • Enable Application Engine batch timings to at least database and preferably also file by setting AETrace=1152 in all Application Server and Process Scheduler domain configuration files.
  • Enable the Cobol statement timings report (written to the log file) by setting TraceSQL=128 in all Process Scheduler domain configuration files.
  • Performance Monitor is complex to set up and the delivered analytics are limited.  Nonetheless, the information that can be obtained from the data collected can be worth the effort of configuration to address on-line configuration and performance issues.

Oracle RDBMS Instrumentation

Oracle provides the dbms_application_info package to set certain attributes on the current database session.  These attributes are visible in some of the dynamic performance views, are picked up by Active Session History (ASH), and can also be seen in AWR reports, Enterprise Manager screens, SQL trace files, and other performance utilities such as EDB360.  The package was first documented in Oracle 7.3.3.

"Application developers can use the DBMS_APPLICATION_INFO package to record the name of the executing module or transaction in the database for use later when tracking the performance of various modules… System administrators can also use this information to track resource usage by module. When an application registers with the database, its name and actions are recorded in the V$SESSION and V$SQLAREA views. Your applications should set the name of the module and name of the action automatically each time a user enters that module. The module name could be the name of a … code segment in an … application. The action name should usually be the name or description of the current transaction within a module."

Oracle 7 Tuning, release 7.3.3 ©Oracle 1997, Chapter 23 Registering Applications

See also One of my Favourite Database Things: DBMS_APPLICATION_INFO
If module and action are consistently set to meaningful values as the application executes, it is then possible to determine where SQL statements originated and how different parts of the application are performing.
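For example (a minimal sketch; the module and action values are only illustrative), a session can instrument itself and the attributes are immediately visible in V$SESSION and in subsequent ASH samples:

begin
  dbms_application_info.set_module(module_name => 'PAYROLL', action_name => 'CALC_RUN');
end;
/

select sid, module, action, client_identifier
from   v$session
where  module = 'PAYROLL';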
ASH is separately licenced as part of the Diagnostics Pack, which is only available on the Enterprise Edition of the database.  However, most PeopleSoft customers running on Oracle do so on Enterprise Edition and are licenced for the Diagnostics Pack.
PeopleSoft has set client_info since PeopleTools 7.53 so that the PeopleSoft operator ID can be associated with the database session, mainly to allow auditing to be done with database triggers.  That attribute is not persisted to the ASH data; however, client_id is also set to the operator ID, and this is collected by ASH.
PeopleTools has set module and action since PeopleTools 8.50, though the exact values used have changed sometimes with the PeopleTools version.
A new application server/process scheduler domain parameter, EnableAEMonitoring, was introduced in PeopleTools 8.54 to control this behaviour, and it is not enabled by default.  If monitoring is not enabled, module defaults to the program name and action remains blank.
Recommendation: Set EnableAEMonitoring=1 in all application server and process scheduler domains in order to enable PeopleSoft instrumentation on-line, in the integration broker, and in Application Engine programs.

On-Line Component/Page Information

In the online application, module and action are set to the component and page name respectively.  In a search dialogue, Action is set to 'xyzzy'.
In the Integration Broker module and action are set to service name and queue name.

Application Engine Step Information

In Application Engine, module is set to a string that includes the name of the Application Engine main program that was called and the sessionid_num for the current process instance number recorded on the process scheduler request table PSPRCSQUE.  For example: PSAE.PSPMCSOSUM.1448
Later on, it may be necessary to use regular expressions in SQL to process this string before profiling the ASH data.
Action is set to string concatenated from the Application Engine program, section, step name, and step type.  For example: PSPMCSOSUM.GETCNT.CNT.P
The program name may be different from that shown in module if one AE program calls another.
Note: Application Engine still doesn't reset ACTION on DO SELECT steps
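As mentioned above, regular expressions can be used to split the module string before profiling the ASH data. A minimal sketch against the AWR ASH history (assuming module strings of the form PSAE.&lt;program&gt;.&lt;sessionidnum&gt; described above):

select regexp_substr(module, '[^.]+', 1, 2)            as ae_program
,      to_number(regexp_substr(module, '[^.]+', 1, 3)) as sessionid_num
,      count(*)                                        as ash_samples
from   dba_hist_active_sess_history
where  module like 'PSAE.%'
group by regexp_substr(module, '[^.]+', 1, 2)
,        to_number(regexp_substr(module, '[^.]+', 1, 3))
order by ash_samples desc;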

Cobol/nVision Instrumentation

Cobol, nVision, and SQR do not set module and action.  Instead, they can be set at the start of every program initiated by Process Scheduler using a trigger on PSPRCSRQST.  The first thing a process does when it is initiated is to set the run status on its scheduler request record to 7, indicating that it is processing.  This is visible in the Process Monitor component.  A trigger on this transition can set module and action for the session to the program and process instance number on that row of data.
This technique was used prior to PeopleSoft adding instrumentation to PeopleTools.  It can still be applied to all processes, including Application Engine because any PeopleSoft instrumentation will simply overwrite the value set by the trigger.
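A minimal sketch of such a trigger follows (the PSFTAPI package referenced below does rather more; the column names are the standard ones on PSPRCSRQST, and SYSADM is assumed to be the PeopleSoft owner):

create or replace trigger sysadm.set_module_action
after update of runstatus on sysadm.psprcsrqst
for each row
when (new.runstatus = '7' and old.runstatus != '7')
begin
  dbms_application_info.set_module(
    module_name => :new.prcsname,
    action_name => 'PI='||:new.prcsinstance);
end;
/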

Recommendation: Implement the PSFTAPI package and trigger described above in order to set module and action at the start of all processes.
The same technique was also used prior to the introduction of ASH to enable Oracle SQL Trace if a particular run control was specified, and is still occasionally useful.

nVision Layout Instrumentation

One of the challenges of tuning and monitoring nVision is being able to identify each report being run.  nVision reports always run under one of the same process names: NVSRUN for a single report, RPTBOOK for a report book of many reports, or DRILLDWN for an nVision drill-down query.  Knowing the process instance is useful because then we can look up the operator and run control ID.
However, it would also be useful to know the report ID being run.  When each individual nVision report starts it queries the runtime parameters from the PS_NVS_REPORT PeopleTools table.  There is no update, so it is not possible to capture this with a DML trigger.  Instead, it is possible to define a fine-grained audit policy on the query and set module within a PL/SQL handler package that is invoked by the fine-grained audit.

Recommendation: Implement Fine-Grained Audit policy and handler to instrument nVision processes if required

2nd Database Connection Information

PeopleSoft programs use a second database connection to increment sequence numbers to minimise the row level locking on such tables.  It is like an AUTONOMOUS_TRANSACTION in PL/SQL.  There is no PeopleSoft instrumentation on this session.  It is possible to use an AFTER LOGON trigger to set client_info, module and action.

Oracle Automatic Workload Repository (AWR) Snapshots

PeopleSoft generates a lot of non-shareable SQL.

  • Dynamically generated SQL, often in PeopleCode, concatenates fragments of SQL with the values of bind variables, so the bind variables become literals in the final SQL statement.  Statements with different literal values are considered to be different statements with different SQL_IDs.
  • Similarly, dynamic Cobol statements result in literal values in the SQL statement.
  • %BIND() variables in Application Engine will also become literal values in the SQL statement unless the ReUseStatement attribute is set on the AE step; however, this attribute cannot be set on steps where dynamic fragments of code are introduced with %BIND(…,NOQUOTES).
  • Application Engine programs that use temporary records can use different non-shared instances of the record in different executions, and this also results in different statements with different SQL_IDs.
  • See also Performance Benefits of ReUse Statement Flag in Application Engine

Consequently, the library cache is not used particularly efficiently by PeopleSoft, and dynamic SQL statements are often quickly aged out of it.  AWR snapshots can only capture the SQL that is in the library cache at the time of the snapshot.  If the SQL statement, or at least a statement with the same force matching signature or plan hash value, cannot be found in the AWR, it cannot be identified or analysed.  Therefore, it is advantageous to increase the snapshot frequency on PeopleSoft systems.
Recommendation: Reduce AWR snapshot frequency from the default of 1 hour to every 15 minutes.  This change results in only a modest increase in processing and space overhead for AWR, but it is worth it for the additional information that is captured.
This advice also applies to Statspack, which may be used if you are not licenced for the Diagnostics Pack.
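A minimal sketch of the change (run as a suitably privileged user; the retention is left at its current setting):

begin
  dbms_workload_repository.modify_snapshot_settings(interval => 15);
end;
/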

PeopleTools Performance Metrics

Batch Timings

PeopleSoft Application Engine and Cobol programs can emit batch timings reports on successful completion.

Application Engine

Application Engine batch timings are controlled by the AETrace flag in the Process Scheduler domain configuration file and for on-line AE programs in the Application Server domain configuration files.

  • AETrace=128: the batch timings report is written to the AE trace file
  • AETrace=1024: batch timings are written to the PS_BAT_TIMINGS% tables in the database

The overhead of batch timings is negligible while the program is running because it is accounted in memory and only written to file or database when the process completes successfully.
Recommendation: Enable Application Engine batch timings to at least database and preferably also file by setting AETrace=1152 in all Application Server and Process Scheduler domain configuration files.
The trace setting in the process scheduler configuration can be overridden by setting process-specific command line parameter overrides in the process definition.  This is often done to set other trace settings, and it is also common to see these unintentionally left in place longer than necessary.  If trace is set in this way it should always also set the batch timings flags.
See PeopleBooks -> Development Tools -> Application Engine -> Enabling Application Engine Tracing

Cobol

PeopleSoft Cobol programs can only write batch timings reports to file and not to the database.  This is controlled by a different parameter.

  • TraceSQL = 128: Enable Cobol statement timings report 

Recommendation: Enable the Cobol statement timings report (written to the log file) by setting TraceSQL=128 in all Process Scheduler domain configuration files.
This trace setting can also be overridden by setting process specific command line parameter overrides in the process definition.  If trace is set in this way it should always also set the batch timings flags.

PeopleSoft Performance Monitor

This provides information about the performance of the PIA including response times for the online transactions.  Metrics are stored in a separate monitoring PeopleSoft system to minimize the effect of measurement intrusion.  It optionally samples the state of each web server, application server and process scheduler collecting operating system and Tuxedo metrics.  It also has a PIA session trace capability.
The sampled data includes the number of busy application server processes and length of inbound Tuxedo service queues.  This data can be used to validate the sizing of the application servers.
Recommendation: Performance Monitor is complex to set up and the delivered analytics are limited.  Nonetheless, the information that can be obtained from the data collected can be worth the effort of configuration to address on-line configuration and performance issues.

Stop The DBA Reading Data in Subtle Ways

The Problem: Dan asked me a question about whether the DBA can be stopped from accessing views such as V$SQL or V$SQL_BIND_CAPTURE with Database Vault because these views can be used to read data from the SGA. I have covered....[Read More]

Posted by Pete On 08/03/19 At 03:41 PM

Append hint

One of the questions that came up in the CBO Panel Session at the UKOUG Tech 2018 conference was about the /*+ append */ hint – specifically how to make sure it was ignored when it came from a 3rd party tool that was used to load data into the database. The presence of the hint resulted in increasing amounts of space in the table being “lost”: as older data was deleted by the application the freed space was never reused because the inserts always went above the table’s highwater mark; and it wasn’t possible to change the application code.

The first suggestion aired was to create an SQL Patch to associate the hint /*+ ignore_optim_embedded_hints */ with the SQL in the hope that this would make Oracle ignore the append hint. This won’t work, of course, because the append hint is not an optimizer hint, it’s a “behaviour” hint.

There are, however, various things that will make the append hint invalid – sometimes to the great surprise of the people using it. The three things I can think of at present are:

  • row-level triggers
  • enabled foreign key constraints
  • non-unique indexes enforcing unique constraints

It’s been some time since I last ran a detailed check so I’m not going to guarantee that the following claims are still true – if it matters to you then it’s easy enough to create a little test of (say) 10,000 rows inserted into a non-empty, indexed table.

  • Row level triggers turn array processing into single row processing, so there’s a significant increase in redo generation.
  • Non-unique indexes enforcing unique constraints have (small but) potentially undesirable effects on the optimizer and on run-time and may turn array processing into single row processing.
  • Foreign key constraints require checking, which may have some impact, but they don’t turn array processing into single row processing.

Of the three options the foreign key constraint seemed to me to be the best strategy to disable the hint with minimal side effects, so my answer was:

“Create a new table with no data and a primary key constraint; add an invisible column to the target table, index it (online) and add a foreign key constraint from the column to the new table”.

My thinking on this was that the foreign key will disable the append hint. The column will always be null which means it will always satisfy the foreign key constraint without being checked and it won’t introduce any index maintenance overheads. Net effect: /*+ append */ is disabled with no extra overheads.
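A minimal sketch of that suggestion (all object names here are hypothetical):

create table append_blocker (
        id              number(1)       constraint ab_pk primary key
);

alter table target_table add (ab_id number invisible);

create index target_ab_i1 on target_table(ab_id) online;

alter table target_table add constraint target_ab_fk
        foreign key (ab_id) references append_blocker(id);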

Footnote

The append hint is also ignored if the table is an Index Organized Table (IOT), but changing a heap table to an IOT is probably not a sensible choice if all you want to do is disable the hint – the potential for unexpected client side anomalies is too high, and if the table is heavily indexed the processing overhead for the secondary indexes could be fairly significant.

If I recall correctly the person asking the question said that the “do nothing” trigger option sounded like something they would be most comfortable with and they’d live with the overheads. I understand the feeling – an invisible column with an invisible index and extra foreign key constraint sounds too close to the boundary where mixing and matching simple features ends up hitting some unexpected behaviour (i.e. a bug).

Update 14th March 2019

A note on Twitter has reminded me that distributed transactions introduce another limitation.  Inserting across a database link will work only if the insert is into a local table; the first insert below will append, the second will not.


insert /*+ append */ into local_table select * from table@remote_location;

insert /*+ append */ into table@remote_location select * from local_table;

There are a couple more limitations given in the comments – disable table locking, and adding a security policy (VPD / RLS / FGAC / OLS).

Comparison between #Oracle and #Exasol

After having worked with both databases for quite some time, this is what I consider to be the key differences between Oracle and Exasol. Of course the two have much in common: Both are relational databases with a transaction management system that supports the ACID model and both follow the ANSI SQL standard – both with some enhancements. Coming from Oracle as I do, much in Exasol looks quite familiar. But let’s focus on the differences:

Strengths

Oracle is leading technology for Online Transaction Processing (OLTP). If you have a high data volume with many users doing concurrent changes, this is where Oracle shines particularly.

Exasol is leading technology for analytical workloads. If you want to do real-time ad hoc reporting on high data volume, this is where Exasol shines particularly.

Architecture

Data Format & In-Memory processing

Oracle uses a row-oriented data format, which is well suited for OLTP but not so much for analytical workloads. That’s why Hybrid Columnar Compression (only available on Engineered Systems or on Oracle proprietary storage) and the In-Memory Column Store (an extra-charged option) have been added in recent years.

Exasol natively uses a compressed columnar data format and processes this format in memory. That is very good for analytical queries but bad for OLTP, because one session that does DML on a table locks that table against DML from other sessions. Read-consistent SELECTs are possible for those other sessions, though.

Oracle was designed for OLTP at times when memory was scarce and expensive. Exasol was designed to process analytical workloads in memory.

Clustering

Oracle started as a non-clustered (single instance) system. Real Application Clusters (RAC) were added much later. The majority of Oracle installations are still non-clustered; RAC (an extra-charged option) is the exception rather than the rule. Most RAC installations are 2-node clusters with availability as the prime reason, scalability being rather a side aspect.

Exasol was designed from the start to run on clustered commodity Intel servers. Prime reasons were MPP performance and scalability with availability being rather a side aspect.

Data Distribution

This doesn’t matter for most Oracle installations, only for RAC. Here, Oracle uses a shared disk architecture while Exasol uses a shared nothing architecture, which is optimal for performance because every Exasol cluster node can operate on a different part of the data in parallel. The drawback is that after adding nodes to an Exasol cluster, the data has to be re-distributed.

With Exadata, Oracle tries to compensate the performance disadvantage of the shared disk architecture by enabling the storage servers to filter data locally for analytical workloads. This approach leads to better performance than Oracle can deliver on other (non-proprietary) platforms.

Availability & Recoverability

Clearly, Oracle is better in this area. A non-clustered Oracle database running in archive log mode will enable you to recover every single committed transaction you did since you took the last backup. With Exasol, you can only restore the last backup and all changes since then are lost. You can safeguard an Oracle database against site failure with a standby database at large distance without performance impact. Exasol doesn’t have that. With RAC, you can protect an Oracle database against node failure. The database stays up (the Global Resource Directory is frozen for a couple of seconds, though) upon node failure with no data loss.

If an Exasol cluster node fails, this leads to a database restart. That means no availability for a couple of seconds, and all sessions get disconnected, but also no data loss. Optionally, Exasol can be configured as a Synchronous Dual Data Center – similar to Oracle’s Extended RAC.

Complexity & Manageability

I realized that there’s a big difference between Exasol and Oracle in this area when I was teaching an Exasol Admin class recently: Some seasoned Oracle DBAs in the audience kept asking questions like “We can do this and that in Oracle, how does that work with Exasol?” (e.g. creating Materialized Views or Bitmap Indexes or an extra Keep Cache) and my answer was always like “We don’t need that with Exasol to get good performance”.

Let’s face it, an Oracle database is probably one of the most complex commercial software products ever developed. You need years of experience to administer an Oracle database with confidence. See this recent Oracle Database Administration manual to get an impression. It has 1690 pages! And that’s not yet Real Application Clusters, which is additionally 492 pages. Over 2100 pages of documentation to dig through, and after having worked with Oracle for over 20 years, I can proudly say that I actually know most of it.

In comparison, Exasol is very easy to use and to manage, because the system takes care of itself largely. Which is why our Admin class can have a duration of only two days and attendees feel empowered to manage Exasol afterwards.

That was intentionally so from the start: Exasol customers are not supposed to study the database for years (or pay someone who did) in order to get great performance. Oracle realized that being complex and difficult to manage is an obstacle and came out with the Autonomous Database – but that is only available in the proprietary Oracle Cloud.

Performance

Using comparable hardware and processing the same (analytical) workload, Exasol outperforms any competitor. That includes Oracle on Exadata. Our Presales consultants regard Exadata as a sitting duck, waiting to get shot on a POC. I was personally shocked to learn that, after drinking the Oracle Kool-Aid myself for years.

In my opinion, these two points are most important: Exasol is faster and at the same time much easier to manage! I mean anything useless could be easy to manage, so that’s not an asset on its own. But together with delivering striking performance, that’s really a big deal.

Licensing

This is and has always been a pain point for Oracle customers: the licensing of an Oracle database is so complex and fine-grained that you always wonder “Am I allowed to do this without violating my license? Do we really need these features that we paid for? Are we safe if Oracle does a License Audit?” With Exasol, all features are always included, and the two most popular license types are easy to understand: you pay either for the data volume loaded into the cluster or for the amount of memory assigned to the database. No sleepless nights because of that!

Cloud

This topic becomes increasingly important as many of our new customers want to deploy Exasol in the cloud. And you may have noticed that Oracle pushes going cloud seriously over the last years.

Exasol runs with all features enabled in the cloud: you can choose between Amazon Web Services (AWS), Microsoft Azure and ExaCloud.

AWS

This is presently the most popular way our customers run Exasol in the cloud. See here for more details.

MS Azure

Microsoft’s cloud can also be used to run Exasol, which gives you the option to choose between two major public cloud platforms. See here for more details.

ExaCloud

Hosted and managed by Exasol, ExaCloud is a full database-as-a-service offering. See here for more details.

Hybrid Exasol deployments that combine cloud with on-prem can also be used, just depending on customer requirements.

Oracle offers RAC only on the Oracle Cloud platform, not on other public clouds. Various other features are also restricted to being available only in Oracle’s own cloud. The licensing model has been tweaked to generally favor the use of Oracle’s own cloud over other public clouds.

Customer Experience

Customers love Exasol, as the recent Dresner report confirms. We get a perfect recommendation score. I can also tell that from personal encounters: Literally every customer I met is pleased with our product and our services!

Conclusion

Oracle is great for OLTP and okay for analytical workloads – especially if you pay extra for things like Partitioning, RAC, In-Memory Column Store and Exadata. Then the performance you get for your analytical workload might suit your present demand.

Exasol is totally bad for OLTP but best in the world for analytical workloads. Do you think your data volume and your analytic demands will grow?