Oakies Blog Aggregator

Oracle on Azure- Options, Options, Options

I’ve been very busy, allocating 60% of my time to Oracle on Azure migrations.  The biggest challenge right now isn’t getting Oracle onto Azure, but keeping my allocation to only 60%.

I love Oracle technology - yes, a Microsoft employee who LOVES Microsoft and the Azure cloud also LOVES Oracle.  I won’t apologize, and I know I’m not the only one.  Every company I’ve worked for, even Oracle itself, required my multi-platform skill set.  I’ve never worked anywhere that had ONLY Oracle, only Linux, only Microsoft, only SQL Server, etc.  I see it only as an opportunity for all of us, and I really don’t have the time to worry about individuals who are upset by it.

The More Things Change…

For a DBA, running Oracle on Azure VMs isn’t that different from running Oracle on VMs on-premises.  The DBAs and developers I work with still have their jobs, still work with their favorite tools, and also get the chance to learn new skills such as cloud administration.


Adding a viable cloud such as Azure brings infrastructure, features, and applications together into a full ecosystem, removing demands such as running a data center, paying for hardware that will only age, and wrestling with scaling.

When Complex, Automate

To help simplify the process, I am constantly authoring scripts to deploy from the Azure cloud.  I write all my scripts in BASH, so if a customer wants to deploy an Oracle VM, I make it really simple with one script that asks which version of Oracle is desired; the script sets everything else up so the DBA can get to the important part of creating a database.

I love that I can work with customers and open up this new world for them.  I really, really do.  Many customers love their Oracle databases and if they want to keep them and have a desire to move to Azure, I can migrate their Oracle databases over with everything else.  I don’t feel pressure to migrate them off of Oracle.

I’ll be meeting with five customers in the next 24 hrs, in person, to present on all the Oracle on Azure options we can support, including the Oracle and Azure cloud partnership, Oracle on Azure VMs, options for Oracle applications and middleware, and architecture changes when moving to the cloud.  It’s going to be AWESOME.




Which Version Number Corresponds to Which PeopleTools Object?

Recently somebody asked me "Our “CRM” values in PSLOCK and PSVERSION are growing tremendously and we don’t know why. We will be opening a case with Oracle but … do you know what that “CRM” value is related to? We are not using the CRM product in our PeopleSoft installation."
There is some documentation on the Oracle Support website, but it is not exhaustive and may not be completely up to date.

The page in the PTRef utility that describes the relationship of version numbers to PeopleTools tables is one of the few static pages in the tool.  I have now updated it with the information in the above Oracle support notes, but there are other version numbers.
In the previous blog post, I showed how to increment version numbers before updating PeopleTools objects.  I knew RDM (the Record Definition Manager) is the OBJECTTYPENAME for PSRECDEFN because I worked that out by tracing Application Designer while it saved a record change.  That remains the only completely reliable way to determine the relationship.
However, version number matching is also a useful technique; even when it does not provide a completely definitive answer, it dramatically cuts down the amount of work that is then necessary.
I have written a little PL/SQL, delivered with PTRef, that extracts the maximum version number for each PeopleTools table in PeopleSoft (other than the PS%DEL tables) and stores it in a working storage table (I used the PLAN_TABLE because it is always there on an Oracle database).  Then you can compare the version number on PSLOCK/PSVERSION with the maximum version on the PeopleTools object.
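A minimal sketch of that idea (this is not the exact PTRef code, and it assumes the PeopleTools tables are in the current schema):

DECLARE
  l_max NUMBER;
  l_cnt NUMBER;
BEGIN
  -- every PeopleTools table with a VERSION column, except the PS%DEL tables
  FOR r IN (SELECT c.table_name
            FROM   user_tab_columns c
            WHERE  c.column_name = 'VERSION'
            AND    c.table_name LIKE 'PS%'
            AND    c.table_name NOT LIKE 'PS%DEL') LOOP
    EXECUTE IMMEDIATE 'SELECT MAX(version), COUNT(*) FROM '||r.table_name
      INTO l_max, l_cnt;
    -- stash the result in PLAN_TABLE: position = max version, cardinality = row count
    INSERT INTO plan_table (statement_id, object_name, position, cardinality)
    VALUES ('VERSION', r.table_name, l_max, l_cnt);
  END LOOP;
  COMMIT;
END;
/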
If the version number is 1, you can’t do the match because the version number has never been incremented, at least since it was last reset by the VERSION application engine.
If a version number is only updated occasionally you may have some trouble with duplicate matches. In my example, three tables have a maximum version number of 80, while only one object type name has a version of 80.

RECNAME                   MAX        CNT
------------------ ---------- ----------
PSAEAPPLDEFN               80       3504
PSMSGNODEDEFN              80         78
PSOPRVERDFN                80       1468

I would guess that OPRVM matches PSOPRVERDFN, and the above support notes confirm this, but otherwise, you would have to check that manually with trace.

OBJECTTY    VERSION
-------- ----------
OPRVM            80

Higher version numbers are easier to match because they are less likely to have duplicate matches.
So to return to the original question, what is CRM?  In my sample system, version numbers CRM and SCATM are both 3.

OBJECTTY    VERSION
-------- ----------
CRM               3
SCATM             3

However, only PSCONTDEFN has a maximum version of 3.

RECNAME                   MAX        CNT
------------------ ---------- ----------
PSCONTDEFN                  3       7567

Again not a perfect match, but Oracle Support Note 664848.1 confirms that CRM corresponds to PSCONTDEFN.  CRM stands for Content Registry Manager.
So the question now becomes what is updating the content definitions, and hence increasing the version number?  It turned out to be an entity registry synchronisation process that was run daily.
It is perfectly legitimate for many updated rows on a PeopleTools table to be given the same version number.  The version number does not need to be incremented again for each row being updated, but then the row-level locks on PSVERSION and PSLOCK created by their updates must be held until the update on the PeopleTools table is committed.  That can increase contention on the version number update.  The alternative is to commit after each update and then increment the version numbers again.  Many PeopleSoft processes do exactly that, and it can, in turn, lead to a massive increase in some version numbers, as illustrated below.
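As an illustration, the two strategies look something like this for the CRM/PSCONTDEFN example above (the WHERE clause on PSCONTDEFN is hypothetical, just to show the shape of the pattern):

-- Strategy 1: one increment covers a whole batch of rows; the row-level
-- locks on PSVERSION and PSLOCK are held until the single commit.
UPDATE psversion  SET version = version + 1 WHERE objecttypename IN ('SYS','CRM');
UPDATE pslock     SET version = version + 1 WHERE objecttypename IN ('SYS','CRM');
UPDATE pscontdefn SET version =
       (SELECT version FROM psversion WHERE objecttypename = 'CRM')
WHERE  contname LIKE 'MY_CONTENT%';  -- many rows, all given the same version
COMMIT;
-- Strategy 2: repeat the increments and commit once per updated row instead;
-- contention is lower, but the version numbers advance far faster.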

Reading the Active Session History Compressed Export File in eDB360/SQLd360 as an External Table

I am a regular user of SQLDB360 (the single distribution for Carlos Sierra's eDB360 and Mauro Pagano's SQLd360 tools) for remote performance analysis.  eDB360 reports on the whole database; SQLd360 reports on a single SQL ID.  eDB360 also runs SQLd360 reports for the top SQL statements.  Both tools extract ASH data to a flat file.  Originally, it was intended that these were loaded with the eAdam utility, but eDB360 no longer executes the eAdam scripts itself.
The eDB360/SQLd360 output zip files contain an export of the ASH data that is spooled from the database in SQL*Plus, compressed with gzip, and put into a tarball.  You should find a single .tar file that contains several gzipped files.  In eDB360, dba_hist_active_sess_history.txt.gz is simply the result of:

SELECT * FROM dba_hist_active_sess_history

In SQLd360 it will be for just the specified SQL_ID:

SELECT * FROM dba_hist_active_sess_history
WHERE sql_id = '…'

The exact structure of the file varies with the version of Oracle, as more columns have been added to the view in successive releases.  There used to be an eAdam structure control file from which the DDL for a repository could be generated, but it is no longer produced.  However, it is easy to deal with this manually, because many of the reports query DBA_HIST_ACTIVE_SESS_HISTORY and you will find a description of the view in each report.
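If you have a database of the same Oracle version to hand, a rough sketch for generating the column list from the data dictionary might be (the TIMESTAMP, RAW and VARCHAR2 lengths still need checking against the DDL shown later):

SELECT ','||REPLACE(column_name,'#')||' '||data_type||
       CASE WHEN data_type LIKE 'VARCHAR%' THEN '('||char_length||')'
            WHEN data_type = 'RAW'         THEN '('||data_length||')'
       END column_def
FROM   dba_tab_columns
WHERE  table_name = 'DBA_HIST_ACTIVE_SESS_HISTORY'
ORDER  BY column_id;

The REPLACE() strips the # characters from the column names, matching point 3 in the list below.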

Loading Compressed File as an External Table

Two database directories must be created in the database used to read the data file.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">GRANT CREATE ANY DIRECTORY TO scott;
CREATE OR REPLACE DIRECTORY edb360 as '/media/sf temp';
CREATE OR REPLACE DIRECTORY exec_dir AS '/usr/bin';
GRANT EXECUTE ON DIRECTORY exec_dir TO scott;
  • Directory edb360 points to where the data file is located.  
  • The external table will load the compressed zip file directly, uncompressing it on the fly, without writing an uncompressed version of the file to disk, using the preprocessor facility.  Directory exec_dir is the location of the Unix zcat executable. 
    • N.B. It is important that this is the actual directory where the executable resides, not the linked /bin directory.

Now you can create an external table.  I have created it in the SCOTT schema rather than SYS.
There are a few things to bear in mind:

  1. DISABLE_DIRECTORY_LINK_CHECK is used to suppress the check that the data file being loaded is not referenced via a directory link.  In my case, this is necessary because I am referencing a directory on a host machine from within a Linux VM running in VirtualBox.  However, this does not suppress the check for files executed by the preprocessor; exec_dir must not point to a linked directory.
  2. The compressed data file is pre-processed (effectively piped) through the zcat command.  This is like importing a compressed dump file through a Unix FIFO.
  3. The # character has had to be removed from all column names.
  4. The format of date fields in the data file must be specified explicitly to convert them properly.
  5. The name of the compressed file to be imported is specified in the LOCATION clause.
  6. The reject limit has been deliberately set to 100, a small non-zero value.  The file is exported from DBA_HIST_ACTIVE_SESS_HISTORY.  The actual columns change from version to version, so if the definition of the external table does not match the definition in the database you may get errors.  Limiting the rejections to 100 means that the .log and .bad files are easier to manage while resolving the error.  When the definitions match, no rows should be rejected.
DROP TABLE scott.ash_load PURGE;

CREATE TABLE scott.ash_load
(SNAP_ID NUMBER
,DBID NUMBER
,INSTANCE_NUMBER NUMBER
,SAMPLE_ID NUMBER
,SAMPLE_TIME TIMESTAMP(3)
,SESSION_ID NUMBER
,SESSION_SERIAL NUMBER
,SESSION_TYPE VARCHAR2(10)
,FLAGS NUMBER
,USER_ID NUMBER
,SQL_ID VARCHAR2(13)
,IS_SQLID_CURRENT VARCHAR2(1)
,SQL_CHILD_NUMBER NUMBER
,SQL_OPCODE NUMBER
,SQL_OPNAME VARCHAR2(64)
,FORCE_MATCHING_SIGNATURE NUMBER
,TOP_LEVEL_SQL_ID VARCHAR2(13)
,TOP_LEVEL_SQL_OPCODE NUMBER
,SQL_PLAN_HASH_VALUE NUMBER
,SQL_PLAN_LINE_ID NUMBER
,SQL_PLAN_OPERATION VARCHAR2(64)
,SQL_PLAN_OPTIONS VARCHAR2(64)
,SQL_EXEC_ID NUMBER
,SQL_EXEC_START DATE
,PLSQL_ENTRY_OBJECT_ID NUMBER
,PLSQL_ENTRY_SUBPROGRAM_ID NUMBER
,PLSQL_OBJECT_ID NUMBER
,PLSQL_SUBPROGRAM_ID NUMBER
,QC_INSTANCE_ID NUMBER
,QC_SESSION_ID NUMBER
,QC_SESSION_SERIAL NUMBER
,PX_FLAGS NUMBER
,EVENT VARCHAR2(64)
,EVENT_ID NUMBER
,SEQ NUMBER
,P1TEXT VARCHAR2(64)
,P1 NUMBER
,P2TEXT VARCHAR2(64)
,P2 NUMBER
,P3TEXT VARCHAR2(64)
,P3 NUMBER
,WAIT_CLASS VARCHAR2(64)
,WAIT_CLASS_ID NUMBER
,WAIT_TIME NUMBER
,SESSION_STATE VARCHAR2(7)
,TIME_WAITED NUMBER
,BLOCKING_SESSION_STATUS VARCHAR2(11)
,BLOCKING_SESSION NUMBER
,BLOCKING_SESSION_SERIAL NUMBER
,BLOCKING_INST_ID NUMBER
,BLOCKING_HANGCHAIN_INFO VARCHAR2(1)
,CURRENT_OBJ NUMBER
,CURRENT_FILE NUMBER
,CURRENT_BLOCK NUMBER
,CURRENT_ROW NUMBER
,TOP_LEVEL_CALL NUMBER
,TOP_LEVEL_CALL_NAME VARCHAR2(64)
,CONSUMER_GROUP_ID NUMBER
,XID RAW(8)
,REMOTE_INSTANCE NUMBER
,TIME_MODEL NUMBER
,IN_CONNECTION_MGMT VARCHAR2(1)
,IN_PARSE VARCHAR2(1)
,IN_HARD_PARSE VARCHAR2(1)
,IN_SQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_EXECUTION VARCHAR2(1)
,IN_PLSQL_RPC VARCHAR2(1)
,IN_PLSQL_COMPILATION VARCHAR2(1)
,IN_JAVA_EXECUTION VARCHAR2(1)
,IN_BIND VARCHAR2(1)
,IN_CURSOR_CLOSE VARCHAR2(1)
,IN_SEQUENCE_LOAD VARCHAR2(1)
,CAPTURE_OVERHEAD VARCHAR2(1)
,REPLAY_OVERHEAD VARCHAR2(1)
,IS_CAPTURED VARCHAR2(1)
,IS_REPLAYED VARCHAR2(1)
,SERVICE_HASH NUMBER
,PROGRAM VARCHAR2(64)
,MODULE VARCHAR2(64)
,ACTION VARCHAR2(64)
,CLIENT_ID VARCHAR2(64)
,MACHINE VARCHAR2(64)
,PORT NUMBER
,ECID VARCHAR2(64)
,TM_DELTA_TIME NUMBER
,TM_DELTA_CPU_TIME NUMBER
,TM_DELTA_DB_TIME NUMBER
,DELTA_TIME NUMBER
,DELTA_READ_IO_REQUESTS NUMBER
,DELTA_WRITE_IO_REQUESTS NUMBER
,DELTA_READ_IO_BYTES NUMBER
,DELTA_WRITE_IO_BYTES NUMBER
,DELTA_INTERCONNECT_IO_BYTES NUMBER
,PGA_ALLOCATED NUMBER
,TEMP_SPACE_ALLOCATED NUMBER
)
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY edb360
ACCESS PARAMETERS
(RECORDS DELIMITED BY newline
DISABLE_DIRECTORY_LINK_CHECK
PREPROCESSOR exec_dir:'zcat'
FIELDS TERMINATED BY '<,>'
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
NULLIF = BLANKS
(SNAP_ID,DBID,INSTANCE_NUMBER,SAMPLE_ID
,SAMPLE_TIME CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff"
,SESSION_ID,SESSION_SERIAL,SESSION_TYPE
,FLAGS,USER_ID
,SQL_ID
,IS_SQLID_CURRENT
,SQL_CHILD_NUMBER
,SQL_OPCODE,SQL_OPNAME
,FORCE_MATCHING_SIGNATURE,TOP_LEVEL_SQL_ID,TOP_LEVEL_SQL_OPCODE
,SQL_PLAN_HASH_VALUE,SQL_PLAN_LINE_ID
,SQL_PLAN_OPERATION,SQL_PLAN_OPTIONS
,SQL_EXEC_ID
,SQL_EXEC_START CHAR(80) date_format TIMESTAMP MASK "YYYY-MM-DD/HH24:mi:ss.ff"
,PLSQL_ENTRY_OBJECT_ID,PLSQL_ENTRY_SUBPROGRAM_ID
,PLSQL_OBJECT_ID,PLSQL_SUBPROGRAM_ID
,QC_INSTANCE_ID,QC_SESSION_ID,QC_SESSION_SERIAL,PX_FLAGS
,EVENT,EVENT_ID,SEQ
,P1TEXT,P1,P2TEXT,P2,P3TEXT,P3
,WAIT_CLASS,WAIT_CLASS_ID,WAIT_TIME
,SESSION_STATE,TIME_WAITED
,BLOCKING_SESSION_STATUS,BLOCKING_SESSION,BLOCKING_SESSION_SERIAL
,BLOCKING_INST_ID,BLOCKING_HANGCHAIN_INFO
,CURRENT_OBJ,CURRENT_FILE,CURRENT_BLOCK,CURRENT_ROW
,TOP_LEVEL_CALL,TOP_LEVEL_CALL_NAME
,CONSUMER_GROUP_ID,XID,REMOTE_INSTANCE,TIME_MODEL
,IN_CONNECTION_MGMT,IN_PARSE,IN_HARD_PARSE,IN_SQL_EXECUTION
,IN_PLSQL_EXECUTION,IN_PLSQL_RPC,IN_PLSQL_COMPILATION
,IN_JAVA_EXECUTION,IN_BIND,IN_CURSOR_CLOSE,IN_SEQUENCE_LOAD
,CAPTURE_OVERHEAD,REPLAY_OVERHEAD,IS_CAPTURED,IS_REPLAYED
,SERVICE_HASH,PROGRAM,MODULE,ACTION,CLIENT_ID,MACHINE,PORT,ECID
,TM_DELTA_TIME,TM_DELTA_CPU_TIME,TM_DELTA_DB_TIME
,DELTA_TIME
,DELTA_READ_IO_REQUESTS,DELTA_WRITE_IO_REQUESTS
,DELTA_READ_IO_BYTES,DELTA_WRITE_IO_BYTES,DELTA_INTERCONNECT_IO_BYTES
,PGA_ALLOCATED,TEMP_SPACE_ALLOCATED
))
LOCATION ('dba_hist_active_sess_history.txt.gz')
) REJECT LIMIT 100
/

Now the external table can be queried in exactly the same way that you would query DBA_HIST_ACTIVE_SESS_HISTORY in order to profile DB time, although bear in mind that an external table can only be full-scanned.

set timi on lines 200 pages 99
COLUMN force_matching_signature FORMAT 99999999999999999999
COLUMN min(sample_time) FORMAT a26
COLUMN max(sample_time) FORMAT a26
SELECT min(sample_time), max(sample_time), COUNT(*) samples
FROM scott.ash_load
/

MIN(SAMPLE_TIME)           MAX(SAMPLE_TIME)              SAMPLES
-------------------------- -------------------------- ----------
21-JUN-19 05.00.36.115 AM  21-JUL-19 01.00.00.823 PM     1213333
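For example, a simple profile of the samples by session state and event might look like this (a sketch; the FETCH FIRST clause assumes the reading database is 12c or later):

SELECT session_state, event, COUNT(*) samples
FROM   scott.ash_load
GROUP  BY session_state, event
ORDER  BY samples DESC
FETCH  FIRST 10 ROWS ONLY;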

Understanding PeopleTools Object Version Numbers

I was recently asked a question about PeopleSoft version numbers, but before I address that directly, I think it would be useful to explain their general purpose.

Caching

The PeopleSoft data model and application are mostly stored in the database in PeopleTools tables.  These tables are queried as the application executes.  For example, when you open a component, the component and pages, including all the PeopleCode, the definition of any records used, and so on have to be loaded into the component buffer.  Ultimately this information comes from the PeopleTools tables.  To save the overhead of repeatedly querying these tables, PeopleSoft caches this data locally in physical files in the application server and process scheduler domains.  The application servers also cache some of this information in memory to save visiting the local physical cache.  Application Designer also maintains a physical cache.
Over time, as the application executes, the cache files build up.  Occasionally it is necessary to delete the cache files, and then it becomes clear just how significant the overhead of the PeopleTools queries is, as a period of poor performance is seen while the application builds up fresh cache files.
Physical cache files are created in directories in the application server and process scheduler Tuxedo domains.  By default, each process maintains its own private cache.  Separate directories of cache files are created for each type of PeopleSoft server process in each domain.    Pairs of cache files are created in each directory for each object type as needed.  There is also a CACHE.LOK file in each directory that is used to ensure that only one process is accessing that cache directory concurrently.
It is possible to run with a shared physical cache, but then it is read-only and must be pre-generated.  It is very rare to see this implemented, because everyone expects to continuously deliver changes over time, and if you had a shared cache you would have to deliver an updated set of shared cache files to every domain every time you delivered a new PeopleTools object.
The cache files come in pairs.  The name of the cache files is the Object Type Name.  This corresponds to the OBJECTTYPENAME on the PSLOCK and PSVERSION tables.  The .DAT file contains the data to be cached.  The .KEY file is an index for the .DAT file, and it also holds the version number of the cached object.

-rw------- 1 psadm2 oracle 5228492 Jun 12 06:37 RDM.DAT
-rw------- 1 psadm2 oracle   69120 Jun 12 06:37 RDM.KEY
-rw------- 1 psadm2 oracle       0 Oct 26  2015 ROLM.DAT
-rw------- 1 psadm2 oracle   24192 Oct 26  2015 ROLM.KEY
-rw------- 1 psadm2 oracle       0 Oct 26  2015 RSM.DAT
-rw------- 1 psadm2 oracle   24192 Oct 26  2015 RSM.KEY

Version Numbers

Version numbers track when a cached PeopleTools object has been changed, whether by Application Designer, by a change in configuration, or by the application.  The version numbers are sequences generated from two PeopleTools tables, PSLOCK and PSVERSION, that hold the highest version number for each type of object.  These two tables have the same structure.
SQL> desc psversion
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 OBJECTTYPENAME                            NOT NULL VARCHAR2(8 CHAR)
 VERSION                                   NOT NULL NUMBER(38)

There are now over 100 different version numbers, each with a specific object type name, tracking a specific PeopleTools object.  There is a global version number, with the object type name SYS, that is incremented whenever any other version number is incremented.
I have no idea why two identical tables of version numbers were created.  I can see no reason for this, but it has been like this since the version numbers were changed (if I remember correctly) in PeopleTools 7.  In early versions of PeopleTools, not all version numbers were on both tables, but in at least PeopleTools 8.55 only one object type appears on PSVERSION and not PSLOCK.
When an object is changed, the object and global version numbers are incremented, and the incremented object version number is recorded on the object in the PeopleTools table.  The version number on the object is also stored in the physical cache files when the object is cached.  If the version on the database is higher than that in the cache file, then the PeopleSoft process knows it must query the latest version from the PeopleTools table and update the cache file.
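As a sketch, you can see the database side of that comparison for the RDM/PSRECDEFN example with a query like this (assuming you can query the PeopleTools tables directly):

SELECT (SELECT version FROM psversion WHERE objecttypename = 'RDM') rdm_version
,      (SELECT MAX(version) FROM psrecdefn) max_psrecdefn_version
FROM   dual;

A process whose cached RDM version is lower than rdm_version knows it must re-query PSRECDEFN and refresh its cache file.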

How to Update Version Numbers

It is not generally recommended, nor strictly speaking supported, to update PeopleTools tables directly with SQL.  Apart from the risk of updating them incorrectly, or to invalid values, you also need to ensure that the changes are picked up by PeopleTools processes and that they do not simply continue to read the cached data.  However, occasionally it is the pragmatic way to do something.
Here is an example from Chapter 5 of PeopleSoft for the Oracle DBA that shows how to maintain version numbers so the change is picked up by PeopleTools processes.  I want to mark alternate search key indexes as unique where there is a unique key on the record: because the unique key is a subset of their columns, they are guaranteed to be unique.  Then Application Designer will build the indexes as unique.
UPDATE psversion SET version = version + 1
WHERE objecttypename IN('SYS','RDM');

UPDATE pslock SET version = version + 1
WHERE objecttypename IN('SYS','RDM');

UPDATE psrecdefn
SET version = (
SELECT version FROM psversion WHERE objecttypename = 'RDM')
WHERE recname = '';

UPDATE psindexdefn a
SET a.uniqueflag = 1
WHERE a.uniqueflag = 0
AND a.indextype = 3
AND EXISTS(
SELECT 'x'
FROM psindexdefn k
WHERE k.recname = a.recname
AND k.indexid = '_'
AND k.indextype = 1
AND k.uniqueflag = 1)
AND a.recname = '';

I am updating a PeopleTools object (PSINDEXDEFN) that doesn't have a version number, but its parent is PSRECDEFN that does have a version number.  I happen to know that object type RDM (the Record Definition Manager) generates the version number for PSRECDEFN.  I found that out by tracing Application Designer while it saved a record change.  That is the only completely reliable method to determine which sequence is used for which record.  However, I will discuss another less onerous matching method in a subsequent blog post.

I must increment the RDM and SYS version numbers and write the new RDM version number to the updated rows on PSRECDEFN.  The next time a PeopleSoft process needs to read a record definition it will check the version numbers.  The increment of the SYS object tells PeopleSoft that an object number has changed, and it will then detect that the RDM version number has changed, so it has to reload and cache objects with version numbers greater than the last cached version number for that object.

Solving accountability for ssh/linux servers without creating personal accounts

One of the continuing issues that I see is accountability (who did what on what server) on Linux or other operating systems that are using ssh.

The traditional solution for this problem is personal accounts, which provides logon and logoff registration.
Personal accounts often have the least amount of rights, which means typical administration tasks cannot be done with them; in order to actually do something you need to sudo to an application account or root, which is logged by sudo (on enterprise Linuxes to /var/log/secure). So some important facts are logged (logon/logoff/switching users), but a lot of important things are not logged at all, or can only be indirectly derived by combining operating system audit data with application-specific logging.

With cloud, everything has gotten a lot more dynamic, and it’s really common for me to see client cloud environments where there aren’t personal accounts at all: everybody just uses the common cloud user (ec2-user, opc, etc.). Needless to say, this is an error and an issue if you want to build a professional setup. Maybe it’s me and I am unlucky, but I have a sneaking suspicion this is actually quite common.

First things first: if you are serious about security and want to actually track what is happening, you MUST have central logging. Otherwise, if a user can switch to root, they can stop the logging and overwrite or change the evidence. Of course, even with central logging a user who can switch to root can still stop the logging, but then at least you should have proof of who did that, because the logging on another machine cannot be changed. Without central logging you might as well not take any action at all, because you can’t be sure what you see is true.

Personal accounts: I think in almost any situation that is even a tiny bit dynamic, creating and removing users locally on each server is not a sustainable practice; even if you can get it up to date, you need to continuously maintain it. Some cloud and some on-premises situations have LDAP. By centralising the user administration, user management becomes manageable. However, the amount of audit information you get from it is limited (the aforementioned logon, logoff and switch user administration). It allows you to grant access to an individual server per user. One downside is that you need to add operating system packages to provide the LDAP authentication functionality, and obviously you explicitly need to configure each server to authenticate users with an LDAP server.

There is another, far less intrusive way that uses native secure shell daemon functionality (so no extra installs): authenticating CA (certificate authority) signed user keys. To use this, you need an additional key pair that serves as the CA keys; put the CA public key on each server and point the secure shell daemon to this key as the CA key. This serves as an additional authentication method, so all existing secure shell daemon functionality, like username/password authentication and public key authentication using the authorized_keys file, is still valid. This authentication mechanism is also self-contained: there is no communication needed to an external authentication service (like an LDAP server); it uses a key that is placed on the server. The downside is that it’s an all-or-nothing setting: you get access to all hosts which have a certain CA key set, and you can’t disable a certain host for a certain user.

Authentication using CA-signed keys is based on the user’s signed key instead of a named account. This is a huge difference from using LDAP/named accounts, where the authentication is done based on the actual account name. If you let users sign their keys with the CA key themselves, you need to keep track of the identity of who signed each key, because that declares the actual identity, not the account that logged on to a server.

This functionality is provided by native ssh key management software. If you want to authenticate users based on CA-signed keys, you need to securely store and protect the CA private key, and keep an impeccable registration of who signed which key, because that declares the actual user identity. For the user there isn’t a lot of difference: their private key is still the ultimate declaration of their true identity (as you will see, this mechanism requires a user to provide their public and private keys).

Luckily, there is a tool that provides the key signing functionality together with auditing (the registration), and secure authentication of a user: Hashicorp vault.

My advice is to let users use certificates for authentication to vault rather than passwords, in addition to their ssh key pair. You could probably write an entire blog post about that, but passwords are cumbersome.

1. User side initial setup.
A user needs two key pairs. The first, for authenticating to vault, is a pem encoded pair:

$ openssl req -newkey rsa:2048 -new -nodes -x509 -days 365 -keyout fritshoogland_key.pem -out fritshoogland_pub.pem

This will ask some questions, like country, state, locality, etc.
It’s important that a user fills out their true name for ‘common name’ and a password for ‘password’.
As an administrator you can check the contents of the public key with ‘openssl x509 -in PUBKEY.pem -text’.

The other one is the regular ssh keypair:

$ ssh-keygen -t rsa

It will ask for a password, and save the files in ~/.ssh, ‘id_rsa’ for the private key, and ‘id_rsa.pub’ for the public key.

It is important to realise that the ultimate identity depends on the private keys of both key pairs, which should be kept securely by the user and never handed over to anyone else. This setup, and any normal usage, only ever needs the public keys to be provided publicly, which is the function of a public key, hence the name.

2. vault side initial setup.
There are two mechanisms that are going to be used inside vault: an authentication method based on certificates, and a secrets engine for ssh.
The certificate authentication facility needs to be set up using a vault root/admin account. The first thing is to set up the certificate authentication method:

$ vault auth enable cert

It only needs to be done once for a given path, and the authentication mechanism can optionally be mounted at a different path by adding ‘-path=’ if you want to separate these certificates from others, because this allows the same authentication mechanism to be used more than once.

The next thing that needs to be setup is the ssh secrets service (this too can be mounted at a different path by adding ‘-path’):

$ vault secrets enable ssh

Now we can generate the CA key pair and extract the CA public key, which is used on the servers to validate the user keys signed with it:

$ vault write ssh/config/ca generate_signing_key=true
$ vault read -field=public_key ssh/config/ca > trusted-user-ca-key.pem

The next thing we need is a signing role that will be used to sign a user key with the CA. This allows you to specify any properties you want the user CA signed key to hold, like ttl (time to live) and allowed_users:

echo "{
  \"allow_user_certificates\": true,
  \"allowed_users\": \"vagrant\",
  \"default_extensions\": [
    {
      \"permit-pty\": \"\"
    }
  ],
  \"key_type\": \"ca\",
  \"default_user\": \"vagrant\",
  \"ttl\": \"24h\"
}" | vault write ssh-client/roles/sign-role -

There are a couple of limitations set here:
Line 3: this specifies that only a user named ‘vagrant’ is allowed (which is the default user in my lab, like ec2-user, opc, etc.).
Line 6: permit-pty is needed to let the session get access to the terminal.
Line 11: the TTL, time to live, of the generated signed key is 24 hours. This means a new signed key needs to be obtained every day. Because users can do this themselves, this doesn’t generate an administrative burden, and it allows you to shut out any user of this signed-key authentication method within a day.

The last thing in this step is to create a policy that allows access to the sign role. This policy must be granted to a user (certificate) so it can actually perform the CA key signing. In order to deny access, you can simply remove this policy from a certificate, which then disables the ability to perform the CA key signing.

echo "path \"ssh/sign/sign-role\" {
  capabilities = [\"create\", \"update\"]
} " | vault policy write policy_ssh-client -

3. ssh daemon setup to allow CA signed key authentication
The changes to the servers that must allow CA signed authentication are really modest.

First of all, the CA key that is obtained in step 2 must be transferred to the server. Currently I am doing this in the /etc/ssh directory, where some other keys are stored too, together with the other ssh settings.

The second thing that needs to be done is changing the ssh daemon configuration file, /etc/ssh/sshd_config, to include the setting TrustedUserCAKeys, pointing at the trusted-user-ca-key.pem file:

TrustedUserCAKeys /etc/ssh/trusted-user-ca-key.pem

After this change, the ssh daemon must be restarted to pick up the configuration change (systemctl restart sshd). This should not interrupt current sessions.

4. Enable a user by uploading their key into vault and assigning the ssh policy
The next step is for the vault administrator to upload the certificate public key of a user that must be granted access, and bind it to the policy. This task does two things at the same time: a) upload the key (certificate=) and b) attach the policy policy_ssh-client (policies=):

$ vault write auth/cert/certs/frits.hoogland display_name="Frits Hoogland" certificate=@fritshoogland_pub.pem policies="policy_ssh-client"

Please mind only the public key is needed.
Also mind that at this point no access is possible for the user yet: vault has been configured to authenticate the user by the key pair for which the public key fritshoogland_pub.pem was uploaded, but no user public key has been signed yet.

5. Let the user sign their key with the CA key using vault
In order to sign their key, a user can use their certificate pair for authentication, and let vault sign their ssh public key:

$ vault write -field=signed_key ssh/sign/sign-role public_key=@$HOME/.ssh/id_rsa.pub -client-cert=fritshoogland_pub.pem -client-key=fritshoogland_key.pem > ~/.ssh/signed-cert.pub

Now that a signed key has been created, it can be used to log on to any server that has the signing CA key set. To do this, simply specify ‘-i’ (for identity) a second time (after the private key):

$ ssh -i ~/.ssh/id_rsa -i ~/.ssh/signed-cert.pub vagrant@192.168.66.51
Last login: Sun Oct 13 13:09:03 2019 from 192.168.66.50

As you can see, the amount of work for a user needing access using this mechanism is really modest: it’s a single command to obtain or refresh the signed key. With signed keys that last 24 hours, you need to obtain a new signed key every day.

The audit log of vault will tell which certificate authenticated and shows the public ssh key that is signed, and the response, which contains the signed key and the serial of the signed key (use the ‘jq’ executable to format the json audit log; cat /var/log/vault/audit.log | jq):

{
  "time": "2019-10-13T13:05:36.524628543Z",
  "type": "response",
  "auth": {
    "client_token": "s.yr6lGhRxwK8ySc4cgEC6HBIi",
    "accessor": "w4Q5bVyabJ9c3hfgeG1gdyuS",
    "display_name": "cert-Test User",
    "policies": [
      "default",
      "policy_ssh-client"
    ],
    "token_policies": [
      "default",
      "policy_ssh-client"
    ],
    "metadata": {
      "authority_key_id": "f5:02:c8:54:6b:bd:36:66:1f:55:d2:4d:60:a8:0c:d0:19:32:e0:bb",
      "cert_name": "test_user",
      "common_name": "testuser",
      "serial_number": "12954687727334453172",
      "subject_key_id": "f5:02:c8:54:6b:bd:36:66:1f:55:d2:4d:60:a8:0c:d0:19:32:e0:bb"
    },
    "entity_id": "64cd1dd1-f94e-6370-8f8d-bc9ae68babf3",
    "token_type": "service"
  },
  "request": {
    "id": "c28e810a-8a52-1760-b346-1c4bd44e3800",
    "operation": "update",
    "client_token": "s.yr6lGhRxwK8ySc4cgEC6HBIi",
    "client_token_accessor": "w4Q5bVyabJ9c3hfgeG1gdyuS",
    "namespace": {
      "id": "root"
    },
    "path": "ssh-client/sign/sign-role",
    "data": {
      "-client-cert": "test_pub.pem",
      "-client-key": "test_key.pem",
      "public_key": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCue3QBya7lZLt2JxOwMTPQQF0NrV/ahCNXr/NX0iFkC6PDtSQZ00YN34JXviR8IL4NHuvW/MMXGJFMHk9Y6sXgY6coIkhan/DVhJlt+fSzUEdUXAWygR4Nsq8Bmk8a3YEU5lBjOfdHGHLM42lG3FmZpTdNDLMGaLkAvFjsYqklsT4mEfxjkBHeY3fbt5zKoPkNnLS3m/O4oFO0uwT6Qo8CjlN6lFibiFNpUC2t+2b0knfrFZn0Jc5u+4JdJoFMvh8xGckCL85r2hFSS85ENHEwLq3kKMb2V7AOW06RaneFC5LDp93q31aUWi8nc2xtKMTQDzf/qUcpgB+xhKW0Bejl test@localhost.localdomain\n"
    },
    "remote_address": "127.0.0.1"
  },
  "response": {
    "data": {
      "serial_number": "f11b7664d9c1ed4b",
      "signed_key": "ssh-rsa-cert-v01@openssh.com AAAAHHNzaC1yc2EtY2VydC12MDFAb3BlbnNzaC5jb20AAAAgmqm82e96fHybofd18OK9X42OSX9Y8sBReFxNtU0gSdoAAAADAQABAAABAQCue3QBya7lZLt2JxOwMTPQQF0NrV/ahCNXr/NX0iFkC6PDtSQZ00YN34JXviR8IL4NHuvW/MMXGJFMHk9Y6sXgY6coIkhan/DVhJlt+fSzUEdUXAWygR4Nsq8Bmk8a3YEU5lBjOfdHGHLM42lG3FmZpTdNDLMGaLkAvFjsYqklsT4mEfxjkBHeY3fbt5zKoPkNnLS3m/O4oFO0uwT6Qo8CjlN6lFibiFNpUC2t+2b0knfrFZn0Jc5u+4JdJoFMvh8xGckCL85r2hFSS85ENHEwLq3kKMb2V7AOW06RaneFC5LDp93q31aUWi8nc2xtKMTQDzf/qUcpgB+xhKW0Bejl8Rt2ZNnB7UsAAAABAAAAVXZhdWx0LWNlcnQtVGVzdCBVc2VyLWFlYTkwNGFiOTE3YjNlZGM1MzJjMDBhNDU5NDc1NDFmYTE5NmIwNjA5ZjJkZDdmNmY1MDI4YWJhZTBmODgwMWQAAAALAAAAB3ZhZ3JhbnQAAAAAXaMhAgAAAABdpHKgAAAAAAAAABIAAAAKcGVybWl0LXB0eQAAAAAAAAAAAAACFwAAAAdzc2gtcnNhAAAAAwEAAQAAAgEArPomMYoG/HajnbzfLVdFOjGP64lXS1+wdnG97C0glHHyvP7E8kcK8Iqt7PbCTY7hbpajF2Z/PTqAgp2DNtdvEftD4HKxUF7Qpa40fToBcj0SVcA/Ht4qfJ1dH8fOdnOCOL9/wUgUmOQlprwYbvvPkjX9Rg4kYkoxkBrT1ZLg5+0QTmly/44ZVphrsqyXk3vOydcnyK8MTd5IVZ0hLNTNx/cDBeCnLwBkg2cs1us5b6uRqbUchqjNP61eHyPCEJykhsFpRSFCdqVHU9gkynj00/6JWuNRBtP0DFiJPJ8oUI9BLUBeBQn5jEgfhh8obnZ2ih/M7LOHF1cYggAVSgtG+XQ+jOzYr39pvLbABYncXCATSW2M62F6bnFFMcCixx5vBwvhAXMiOJpENjmfmmaCpa17t4SmSs284taNmPa5Upq17zyy/QBofeCrz35qBuGfAlO7G9jGP+/tkTOv2lbMw+BJGRobaH/1uypkX7NpuG8rEDdht5xm9pr1Xwdb1iD632nKEkLDtOjrH2X/PR9k9EWPbBhF3HtAPzv/esKYVa7Cm6heLEVPRn5ZsBXJA2+4j3Kq2cwDVp5DFpEttAf4trMx3S4AP9rHwS02Y5zWdRT16HdYAwjfpY0+m0kAdNdKwStdxTTG4wgzjYN8dZSOZP2UIe1LA5bFm1HbBieCjysAAAIPAAAAB3NzaC1yc2EAAAIAqrEuTPFL9ULr/46Qx7kPCY292yxgqilLqlcqEmJ8fwFiukRk/w4wk+qgnLLAc72aPD4tUqjpw0xB3QaCmdh0YK6TAjQ+RC8hXYKz6OHZMlbFBSXRwa7poKWSs4YVu+M+WtQ9oibYpHrZcEj5N3mO+XHP/mbrEa/jZi9yvqwI4bJ4OY1ktH6bihLd7q79pejWq5c1+1ppswSyO2tcyJShGYb8V/UPmIRgqo2OvMjnrgTtF7VnPZHh+H/kSlB+6PiTgdQQDVSf72cUi9hGcgXCas71bAFeamq/fvoeB2dKfl7ZrhjGKE5Mx90G8gVGulz8a2kbMOgP3bjNvKlc6DfKiJuHbpfxyNn/9P/cvYFYONMFxup3H/3/rDVs4a4M+Qp9nDmalGe+4muwdlLdt6Y/dkG3WAbJvKvVPpXjFca234Y2gSAv3lJVqURHbaxkE1fus3gCmNtjRNcHA/rGQW/vnEvVjXLfRBHRyAdT+TY38iewG1tk7iEx6EKx77PgFtgMO01pQLYe94VfG2ynuBOlUfIDms/gm6jwVfo/PUR1hP/Q5vTMCNBt1RwgEwa0EWI7LhNApVE66FXDAJ6a4aUvulXc8KdWODytJzaHMhM5mpn88xNFeH7SK0aeEs4C4Fu4XQVrobm8eE0Xz9K7faRXCpdNtrtRvh8joU0H+GfVsHE=\n"
    }
  }
}

The important part here, for linking the information in /var/log/secure to the signed certificate, is the serial number.
In this case, the serial is f11b7664d9c1ed4b. You need to convert that hexadecimal number to decimal:

echo "ibase=16; F11B7664D9C1ED4B" | bc
17373610163033992523

(please mind I needed to install bc on Centos7, and the hexadecimal number must be specified in uppercase)
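Alternatively, for an Oracle-minded audience, the same conversion can be done in the database with a hexadecimal format mask:

SELECT TO_NUMBER('F11B7664D9C1ED4B', 'XXXXXXXXXXXXXXXX') serial FROM dual;

This returns 17373610163033992523, matching the bc output above.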

Now if we move over to the server where this key was used, we can simply search for ‘17373610163033992523’ in the /var/log/secure file:

# grep 17373610163033992523 /var/log/secure
Oct 13 13:09:03 localhost sshd[4959]: Accepted publickey for vagrant from 192.168.66.50 port 50936 ssh2: RSA-CERT ID vault-cert-Test User-aea904ab917b3edc532c00a45947541fa196b0609f2dd7f6f5028abae0f8801d (serial 17373610163033992523) CA RSA SHA256:Jcv3wpnbWWRlHDCRNqm6jfhB9qKnvCByBRIR4wr7CLI
Oct 13 13:09:09 localhost sshd[4986]: Accepted publickey for vagrant from 192.168.66.50 port 50938 ssh2: RSA-CERT ID vault-cert-Test User-aea904ab917b3edc532c00a45947541fa196b0609f2dd7f6f5028abae0f8801d (serial 17373610163033992523) CA RSA SHA256:Jcv3wpnbWWRlHDCRNqm6jfhB9qKnvCByBRIR4wr7CLI

The second session (at 13:09:09) used sshd process 4986. We can use the linux auditing facility to display all use of that process id:

# ausearch -p 4986 -i
type=USER_AUTH msg=audit(10/13/2019 13:09:09.152:905) : pid=4986 uid=root auid=unset ses=unset subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=key algo=ssh-rsa-cert-v01@openssh.com size=2048 fp=none rport=50938 acct=vagrant exe=/usr/sbin/sshd hostname=? addr=192.168.66.50 terminal=? res=success'
----
type=USER_ACCT msg=audit(10/13/2019 13:09:09.154:906) : pid=4986 uid=root auid=unset ses=unset subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:accounting grantors=pam_unix,pam_localuser acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:09:09.160:907) : pid=4986 uid=root auid=unset ses=unset subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=session fp=? direction=both spid=4987 suid=sshd rport=50938 laddr=192.168.66.51 lport=22  exe=/usr/sbin/sshd hostname=? addr=192.168.66.50 terminal=? res=success'
----
type=USER_AUTH msg=audit(10/13/2019 13:09:09.160:908) : pid=4986 uid=root auid=unset ses=unset subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=success acct=vagrant exe=/usr/sbin/sshd hostname=? addr=192.168.66.50 terminal=ssh res=success'
----
type=CRED_ACQ msg=audit(10/13/2019 13:09:09.160:909) : pid=4986 uid=root auid=unset ses=unset subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_env,pam_unix acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=LOGIN msg=audit(10/13/2019 13:09:09.160:910) : pid=4986 uid=root subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 old-auid=unset auid=vagrant tty=(none) old-ses=4294967295 ses=29 res=yes
----
type=USER_ROLE_CHANGE msg=audit(10/13/2019 13:09:09.276:911) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='pam: default-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 selected-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=USER_START msg=audit(10/13/2019 13:09:09.296:912) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:session_open grantors=pam_selinux,pam_loginuid,pam_selinux,pam_namespace,pam_keyinit,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_lastlog acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=USER_LOGIN msg=audit(10/13/2019 13:09:09.348:917) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=/dev/pts/1 res=success'

What you see here is at first the secure shell daemon working on the authentication of the process, before it actually provides shell access. This is noticeable by the session id, ses, which is unset. Once the user has authenticated, the session is given a true audit session number, which here is 29. Now we can look at everything this session did, including switching users, using this session id:

# ausearch --session 29 -i
----
type=LOGIN msg=audit(10/13/2019 13:09:09.160:910) : pid=4986 uid=root subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 old-auid=unset auid=vagrant tty=(none) old-ses=4294967295 ses=29 res=yes
----
type=USER_ROLE_CHANGE msg=audit(10/13/2019 13:09:09.276:911) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='pam: default-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 selected-context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=USER_START msg=audit(10/13/2019 13:09:09.296:912) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:session_open grantors=pam_selinux,pam_loginuid,pam_selinux,pam_namespace,pam_keyinit,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_lastlog acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:09:09.296:913) : pid=4989 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:22:dd:a8:71:eb:4d:44:5f:61:6a:4e:eb:55:9b:b5:f1:3c:bb:44:d2:3f:56:9d:a5:f8:3a:74:69:e4:02:4b:01 direction=? spid=4989 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:09:09.296:914) : pid=4989 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:77:fc:eb:26:0c:37:76:9d:b6:89:98:ca:5a:25:ad:d2:b9:c0:0b:01:4f:fb:e1:0d:a8:b8:45:41:56:68:ee:49 direction=? spid=4989 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:09:09.296:915) : pid=4989 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:9e:58:bc:ae:c9:68:4c:91:cc:a9:65:0a:a2:cd:e5:a8:62:40:14:22:79:80:52:da:0f:cd:78:87:f1:6c:d6:7f direction=? spid=4989 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRED_ACQ msg=audit(10/13/2019 13:09:09.296:916) : pid=4989 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_env,pam_unix acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=USER_LOGIN msg=audit(10/13/2019 13:09:09.348:917) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=/dev/pts/1 res=success'
----
type=USER_START msg=audit(10/13/2019 13:09:09.348:918) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=/dev/pts/1 res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:09:09.355:919) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:9e:58:bc:ae:c9:68:4c:91:cc:a9:65:0a:a2:cd:e5:a8:62:40:14:22:79:80:52:da:0f:cd:78:87:f1:6c:d6:7f direction=? spid=4990 suid=vagrant  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=USER_END msg=audit(10/13/2019 13:58:38.973:922) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=vagrant exe=/usr/sbin/sshd hostname=? addr=? terminal=/dev/pts/1 res=success'
----
type=USER_LOGOUT msg=audit(10/13/2019 13:58:38.973:923) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=login id=vagrant exe=/usr/sbin/sshd hostname=? addr=? terminal=/dev/pts/1 res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:58:38.982:924) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:9e:58:bc:ae:c9:68:4c:91:cc:a9:65:0a:a2:cd:e5:a8:62:40:14:22:79:80:52:da:0f:cd:78:87:f1:6c:d6:7f direction=? spid=4989 suid=vagrant  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:58:38.983:925) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=session fp=? direction=both spid=4989 suid=vagrant rport=50938 laddr=192.168.66.51 lport=22  exe=/usr/sbin/sshd hostname=? addr=192.168.66.50 terminal=? res=success'
----
type=USER_END msg=audit(10/13/2019 13:58:38.992:926) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_selinux,pam_loginuid,pam_selinux,pam_namespace,pam_keyinit,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_lastlog acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=CRED_DISP msg=audit(10/13/2019 13:58:38.992:927) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_env,pam_unix acct=vagrant exe=/usr/sbin/sshd hostname=192.168.66.50 addr=192.168.66.50 terminal=ssh res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:58:38.992:928) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:22:dd:a8:71:eb:4d:44:5f:61:6a:4e:eb:55:9b:b5:f1:3c:bb:44:d2:3f:56:9d:a5:f8:3a:74:69:e4:02:4b:01 direction=? spid=4986 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:58:38.992:929) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:77:fc:eb:26:0c:37:76:9d:b6:89:98:ca:5a:25:ad:d2:b9:c0:0b:01:4f:fb:e1:0d:a8:b8:45:41:56:68:ee:49 direction=? spid=4986 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'
----
type=CRYPTO_KEY_USER msg=audit(10/13/2019 13:58:38.992:930) : pid=4986 uid=root auid=vagrant ses=29 subj=system_u:system_r:sshd_t:s0-s0:c0.c1023 msg='op=destroy kind=server fp=SHA256:9e:58:bc:ae:c9:68:4c:91:cc:a9:65:0a:a2:cd:e5:a8:62:40:14:22:79:80:52:da:0f:cd:78:87:f1:6c:d6:7f direction=? spid=4986 suid=root  exe=/usr/sbin/sshd hostname=? addr=? terminal=? res=success'

This shows the information that the linux audit facility logged for this session.

Conclusion
This is a solution for accountability in dynamic environments, where traditional solutions like LDAP would fit less easily.
It works by using native secure shell daemon functionality and an additional key pair that acts as the “authority”, hence the name “certificate authority” (CA); its public key is set in the secure shell daemon as the CA key.
The essence is that an additional key, called a signed key, is produced using the CA key. This key is linked with the CA public key, and therefore is authenticated by it. A single person should be the only owner of this signed key, because the signed key is what determines the identity.

Vault acts as a server where the signing, and the administration of the signing, is audited and automated, so users can self-service the generation of CA signed keys. The only things an administrator has to do are upload a certificate and grant the ssh policy, and remove the policy from the certificate of a person that should no longer get access.

Because authentication and ssh functionality can be mounted multiple times in vault, you can create multiple groups of certificates which can use multiple CAs (using multiple mounts of the ssh functionality).

There is more functionality that can be set, allowed, disallowed etc. with CA signed keys.

Video : Oracle REST Data Services (ORDS) : Create Basic RESTful Web Services Using PL/SQL

Today’s video is a brief run through creating RESTful web services using Oracle REST Data Services (ORDS) and PL/SQL.

This is based on the following article, but the article has a load more examples and variations compared to the video.

I don’t mention handling complex payloads or status information, but you can find that here.

You can see all my other ORDS related content here.

The star of today’s video is Alan Arentsen. Finding a clip without him giggling or laughing was kind-of tough…

SELECT ANY DICTIONARY - What Privileges Does it Have - SELECT_CATALOG_ROLE

There have been a few blog posts over the years discussing the difference between SELECT ANY DICTIONARY and the SELECT_CATALOG_ROLE. Hemant posted in 2014 about the difference between SELECT ANY DICTIONARY and SELECT_CATALOG_ROLE. This post was a… [Read More]

Posted by Pete On 11/10/19 At 01:59 PM

v$session

Here’s an odd, and unpleasant, detail about querying v$session in the “most obvious” way. (And if you were wondering what made me resurrect and complete a draft on “my session id” a couple of days ago, this posting is the reason). Specifically if you want to select some information for your own session from v$session the query you’re likely to use in any recent version of Oracle will probably be of the form:


select {list for columns} from v$session where sid = to_number(sys_context('userenv','sid'));

Unfortunately that one little statement hides two anomalies – which you can see in the execution plan. Here’s a demonstration cut from an SQL*Plus session running under 19.3.0.0:


SQL> select * from table(dbms_xplan.display_cursor);

SQL_ID  gcfrzq9knynj3, child number 0
-------------------------------------
select program from V$session where sid = sys_context('userenv','sid')

Plan hash value: 2422122865

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  MERGE JOIN CARTESIAN     |                 |     1 |    33 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    12 |     0   (0)|
|*  3 |    FIXED TABLE FULL       | X$KSLWT         |     1 |     8 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSLED (ind:2) |     1 |     4 |     0   (0)|
|   5 |   BUFFER SORT             |                 |     1 |    21 |     0   (0)|
|*  6 |    FIXED TABLE FULL       | X$KSUSE         |     1 |    21 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("W"."KSLWTSID"=TO_NUMBER(SYS_CONTEXT('userenv','sid')))
   4 - filter("W"."KSLWTEVT"="E"."INDX")
   6 - filter((BITAND("S"."KSUSEFLG",1)<>0 AND BITAND("S"."KSSPAFLG",1)<>0 AND 
              "S"."INDX"=TO_NUMBER(SYS_CONTEXT('userenv','sid'))
              AND INTERNAL_FUNCTION("S"."CON_ID") AND "S"."INST_ID"=USERENV('INSTANCE')))

As you can see, v$session is a join of 3 separate structures – x$kslwt (v$session_wait), x$ksled (v$event_name), and x$ksuse (the original v$session as it was some time around 8i), and the plan shows two “full tablescans” and a Cartesian merge join. Tablescans and Cartesian merge joins are not necessarily bad – especially where small tables and tiny numbers of rows are concerned – but they do merit at least a brief glance.

x$ksuse is a C structure in the fixed SGA and that structure is a segmented array (which seems to be chunks of 126 entries in 19.3, and chunks of 209 entries in 12.2 – but that’s fairly irrelevant). The SID is simply the index into the array counting from 1, so if you have a query with a predicate like ‘SID = 99’ Oracle can work out the address of the 99th entry in the array and access it very quickly – which is why the SID column is reported as a “fixed index” column in the view v$indexed_fixed_column.
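For example, you can list which fixed-object columns have such “indexes”:

SELECT table_name, index_number, column_name
FROM   v$indexed_fixed_column
WHERE  table_name IN ('X$KSUSE', 'X$KSLWT', 'X$KSLED')
ORDER  BY table_name, index_number;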

But we have two problems immediately visible:

  1. the optimizer is not using the “index” to access x$ksuse despite the fact that we’re giving it exactly the value we want to use (and we can see a suitable predicate at operation 6 in the plan)
  2. the optimizer has decided to start executing the query at the x$kslwt table

Before looking at why things have gone wrong, let’s check the execution plan to see what would have happened if we’d copied the value from the sys_context() call into a bind variable and queried using the bind variable – which we’ll keep as a character type to make it a fair comparison:

SQL_ID  cm3ub1tctpdyt, child number 0
-------------------------------------
select program from v$session where sid = to_number(:v1)

Plan hash value: 1627146547

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  MERGE JOIN CARTESIAN     |                 |     1 |    32 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    12 |     0   (0)|
|*  3 |    FIXED TABLE FIXED INDEX| X$KSLWT (ind:1) |     1 |     8 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSLED (ind:2) |     1 |     4 |     0   (0)|
|   5 |   BUFFER SORT             |                 |     1 |    20 |     0   (0)|
|*  6 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    20 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("W"."KSLWTSID"=TO_NUMBER(:V1))
   4 - filter("W"."KSLWTEVT"="E"."INDX")
   6 - filter(("S"."INDX"=TO_NUMBER(:V1) AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))


When we have a (character) bind variable instead of a sys_context() value the optimizer manages to use the “fixed indexes”, but it still starts executing at x$kslwt, and still does a Cartesian merge join. The plan would be the same if the bind variable were a numeric type, and we’d still get the same plan if we replaced the bind variable with a literal number.

So problem number 1 is that Oracle only seems able to use the fixed index path for literal values and simple bind variables (plus a few “simple” functions). It doesn’t seem to use the fixed indexes for most functions that return a value (even deterministic ones), and the sys_context() function is a particular example of this.
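If you want to reproduce the effect, a minimal sketch – the literal 99 is purely illustrative:

explain plan for
select program from v$session where sid = 99;

select * from table(dbms_xplan.display);
-- expect FIXED TABLE FIXED INDEX on x$ksuse

explain plan for
select program from v$session where sid = sys_context('userenv','sid');

select * from table(dbms_xplan.display);
-- expect FIXED TABLE FULL on x$ksuse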

Transitivity

Problem number 2 comes from a side-effect of something that I first described about 13 years ago – transitive closure. Take a look at the predicates in both the execution plans above. Where’s the join condition between x$ksuse and x$kslwt? There should be one, because the underlying SQL defining [g]v$session has the following joins:

from
      x$ksuse s,
      x$ksled e,
      x$kslwt w
where
      bitand(s.ksspaflg,1)!=0
and   bitand(s.ksuseflg,1)!=0
and   s.indx=w.kslwtsid       -- this is the SID column for v$session and v$session_wait
and   w.kslwtevt=e.indx
 

What’s happened here is that the optimizer has used transitive closure: “if a = b and b = c then a = c” to clone the predicate “s.indx = to_number(sys_context(…))” to “w.kslwtsid = to_number(sys_context(…))”. But at the same time the optimizer has eliminated the predicate “s.indx = w.kslwtsid”, which it shouldn’t do because this is 12.2.0.1 and ever since 10g we’ve had the parameter _optimizer_transitivity_retain = true, yet SYS appears to be ignoring the parameter setting.

So we no longer have a join condition between x$ksuse and x$kslwt – which means there has to be a Cartesian merge join between them and the only question is whether this should take place before or after the join between x$kslwt and x$ksled. In fact, the order doesn’t really matter because there will be only one row identified in x$kslwt and one row in x$ksuse, and the join to x$ksled is simply a lookup (by undeclarable unique key) to translate an id into a name and it will take place only once whatever we do about the other two structures.

But there is a catch – especially if your sessions parameter is 25,000 (which it shouldn’t be) and the number of connected sessions is currently 20,000 (which it shouldn’t be) – the predicate against x$ksuse does a huge amount of work as it walks the entire array testing every row (and it doesn’t even do the indx test first – it does a couple of bitand() operations). Even then this wouldn’t be a disaster – we’re only talking a couple of hundredths of a second of CPU – until you find the applications that run this query a huge number of times.

We would prefer to avoid two full tablescans since the arrays could be quite large, and of the two it’s the tablescan of x$ksuse that is going to be the greater threat; so is there a way to bypass the threat?  Once we’ve identified the optimizer anomaly we’ve got a pointer to a solution. Transitivity is going wrong, so let’s attack the transitivity. Checking the hidden parameters we can find a parameter: _optimizer_generate_transitive_pred which defaults to true, so let’s set it to false for the query and check the plan:

select  /*+ opt_param('_optimizer_generate_transitive_pred','FALSE') */
        program
from    v$session
where   sid = sys_context('userenv','sid')

Plan hash value: 3425234845

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  NESTED LOOPS             |                 |     1 |    32 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     1 |    28 |     0   (0)|
|   3 |    FIXED TABLE FULL       | X$KSLWT         |    47 |   376 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    20 |     0   (0)|
|*  5 |   FIXED TABLE FIXED INDEX | X$KSLED (ind:2) |     1 |     4 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(("S"."INDX"="W"."KSLWTSID" AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."INDX"=TO_NUMBER(SYS_CONTEXT('user
              env','sid')) AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))
   5 - filter("W"."KSLWTEVT"="E"."INDX")


Although it’s not nice to insert hidden parameters into the optimizer activity, we do have a result. We don’t have any filtering on x$kslwt – fortunately this seems to be limited in size (but see footnote) to the number of current sessions (unlike x$ksuse, which has an array size defined by the sessions parameter or derived from the processes parameter). For each row in x$kslwt we do an access into x$ksuse using the “index” (note that we don’t see access predicates for the fixed tables; we just have to note that the operation says FIXED INDEX and spot the “index-related” predicate in the filter predicate list), so this strategy has reduced the number of times we check the complex predicate on x$ksuse rows.

It’s still far from ideal, though. What we’d really like to do is access x$kslwt by index using the known value from sys_context('userenv','sid'). As it stands, the path we get from using a hidden parameter (which isn’t listed as legal for the opt_param() hint) is the same plan that we would get from an unhinted query that searched for audsid = sys_context('userenv','sessionid').

SQL_ID  7f3f9b9f32u7z, child number 0
-------------------------------------
select  program from  v$session where  audsid =
sys_context('userenv','sessionid')

Plan hash value: 3425234845

----------------------------------------------------------------------------------
| Id  | Operation                 | Name            | Rows  | Bytes | Cost (%CPU)|
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                 |       |       |     1 (100)|
|   1 |  NESTED LOOPS             |                 |     2 |    70 |     0   (0)|
|   2 |   NESTED LOOPS            |                 |     2 |    62 |     0   (0)|
|   3 |    FIXED TABLE FULL       | X$KSLWT         |    47 |   376 |     0   (0)|
|*  4 |    FIXED TABLE FIXED INDEX| X$KSUSE (ind:1) |     1 |    23 |     0   (0)|
|*  5 |   FIXED TABLE FIXED INDEX | X$KSLED (ind:2) |     1 |     4 |     0   (0)|
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(("S"."INDX"="W"."KSLWTSID" AND BITAND("S"."KSSPAFLG",1)<>0
              AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."KSUUDSES"=TO_NUMBER(SYS_CONTEXT('
              userenv','sessionid')) AND INTERNAL_FUNCTION("S"."CON_ID") AND
              "S"."INST_ID"=USERENV('INSTANCE')))
   5 - filter("W"."KSLWTEVT"="E"."INDX")


The bottom line, then, seems to be that if you need a query by SID against v$session to be as efficient as possible then your best bet is to load a numeric variable with the value of sys_context('userenv','sid') and then select where “sid = :bindvariable”. Otherwise query by audsid, or use a hidden parameter to affect the optimizer.
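In SQL*Plus that strategy would look something like this minimal sketch:

variable my_sid number

begin
        :my_sid := to_number(sys_context('userenv','sid'));
end;
/

select program from v$session where sid = :my_sid;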

Until the SYS schema follows the _optimizer_transitivity_retain parameter, or treats sys_context() the same way it treats a bind variable, there is always going to be some unnecessary work when querying v$session, and that excess will grow with either the number of connected sessions (if you optimize the query) or with the value of the sessions parameter (if you don’t).

Footnote

In (much) older versions of Oracle v$session_wait sat on top of x$ksusecst, which was part of the same C structure as x$ksuse. In newer versions of Oracle x$kslwt is a structure that is created on demand in the PGA/UGA – I hope that there’s a short cut that allows Oracle to find the waiting elements in x$ksuse[cst] efficiently, rather than requiring a walk through the whole thing, otherwise a tablescan of the (nominally smaller) x$kslwt structure will be at least as expensive as a tablescan of the x$ksuse structure.

Update (just a few minutes after posting)

Bob Bryla has pointed out in a tweet that there are many “bugs” not fixed until 19.1 for which the workaround is to set “_optimizer_transitivity_retain” to false. So maybe this isn’t an example of SYS doing something particularly strange – it may be part of a general reworking of the mechanism that still has a couple of undesirable side effects.

Bob’s comment prompted me to clone the x$ tables into real tables in a non-SYS schema and model the fixed indexes with primary keys, and I found that the resulting plan (though very efficient) still discarded the join predicate. So we may be seeing the side effects of a code enhancement relating to generating predicates that produce unique key access paths. (A “contrary” test, the one in the 2013 article I linked to, still retains the join predicate for the query that has non-unique indexes.)
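A model along those lines could be built as follows – a sketch only (the table and constraint names are mine, and you need access to the x$ structures, so the CTAS steps have to run as SYS):

create table my_ksuse as select * from x$ksuse;
create table my_kslwt as select * from x$kslwt;
create table my_ksled as select * from x$ksled;

alter table my_ksuse add constraint my_ksuse_pk primary key(indx);
alter table my_kslwt add constraint my_kslwt_pk primary key(kslwtsid);
alter table my_ksled add constraint my_ksled_pk primary key(indx);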

 

Hashicorp vault and ansible: using certificate based authentication for playbooks

In First steps with Hashicorp Vault and Ansible I explained how to set up Hashicorp Vault for use with Ansible.

The authentication of the playbook with Hashicorp Vault was done in two ways:
– using a username and password in the playbook itself (which I discourage, because the credentials are then readable in the playbook).
– using an “authentication token” in the playbook.

The “authentication token” is obtained from vault using a username and password, so specifying it in a playbook only spills the token, not the credentials. Please mind an authentication token expires after a specified time, so it needs to be created and provided just before execution, and once expired it is not usable anymore.
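For example, a short-lived token could be minted on the vault server just before a run – a sketch, with an illustrative TTL and the policy from the demo setup:

$ vault token create -ttl=60s -policy=test_read_kv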

There is an additional method for the playbook to authenticate itself to vault, which I spoiled in the title of this blogpost: using a certificate. You could argue that a certificate is just a longer version of a token.

In my opinion the principal difference is that a certificate is a file, which can be produced upfront, made available before running the playbook(s), and removed afterwards, while the authentication token is provided in the playbook itself.

This is how certificate based authentication is enabled on vault:

Set environment variables on the vault server to talk to it.

$ . ./vault.env
$ . ./set_root_token.sh

Create a certificate.

$ openssl req -newkey rsa:2048 -new -nodes -x509 -days 365 -keyout demo_key.pem -out demo_cert.pem

Enable certificate based authentication and upload the certificate, and bind it to policies.

$ vault auth enable cert
$ vault write auth/cert/certs/demo display_name=demo policies="test_read_kv-v2,test_read_kv" certificate=@demo_cert.pem

(please mind this assumes you created the demo setup as described in First steps with Hashicorp Vault and Ansible, otherwise you won’t have the policies test_read_kv and test_read_kv-v2 available to bind to the certificate)
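With the certificate loaded you can verify that it works by logging in with it from the command line – a quick sanity check, assuming the vault CLI can reach the server over TLS:

$ vault login -method=cert -client-cert=demo_cert.pem -client-key=demo_key.pem name=demo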

Now with the certificate available and loaded in vault, we can point to it in a playbook, which then uses the certificate to authenticate itself with vault for obtaining the secrets.

This is the playbook: demo playbook with certificate based authentication for vault
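For reference, the relevant part of such a playbook looks roughly like the sketch below. This is not the linked playbook itself: the secret path and the lookup option names are assumptions (the option names have changed between Ansible versions), so check the documentation of your hashi_vault version.

---
- hosts: localhost
  vars:
    # kv-v2/data/test is an illustrative secret path; cert_auth_public_key
    # and cert_auth_private_key are the option names in recent versions of
    # the hashi_vault lookup and may differ in older releases.
    demo_secret: "{{ lookup('hashi_vault', 'secret=kv-v2/data/test auth_method=cert cert_auth_public_key=demo_cert.pem cert_auth_private_key=demo_key.pem url=https://localhost:8200') }}"
  tasks:
  - name: show secret fetched using certificate authentication
    debug:
      msg: "{{ demo_secret }}"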

It’s important to understand that the lookup is done on the Ansible master node, not on the managed node(s), so you only need to place the certificate (temporarily) there, and you only need to open up the vault port to the Ansible master node. This is also true for other authentication mechanisms used with the hashi_vault lookup plugin.

OGB Appreciation Day 2019 : It’s a Wrap (#ThanksOGB)

Yesterday was the Oracle community OGB Appreciation Day 2019.

I would like to say a big thank you to everyone who took the time to join in. Here is the list of posts I saw in chronological order. If I missed you out, give me a shout and I’ll add you.