Oakies Blog Aggregator

The AskTOM multimedia experience!

Most people know AskTOM as the go-to resource for getting answers to the most common and sometimes the most tricky questions on the Oracle Database. AskTOM runs on the robust Application Express architecture, which is a large part of the reason that it has been in operation for over 20 years without any problems in terms of upgrades, obsolescence etc. It just keeps….on….working.

However, you might not be aware of an initiative we launched a few years back called Office Hours. We realised that within the Oracle organization, there is a plethora of subject matter experts that you would benefit from having access to on a regular basis. Some of our product managers already have outward facing roles, so you’ll already know them, but other experts in the organisation were not getting the chance to share their knowledge with the people that deserve it most – you, the developer and DBA community.

So Office Hours was launched. Monthly webinars, Q&A discussions held over Zoom spanning multiple timezones (ie, in your office hours hence the name), multiple topics and multiple experts within Oracle to ensure that you maximize the benefits you can get from the Oracle Database and related technologies.

Nearly three years on, and voila! We are fast approaching 500(!) of these sessions, so the value contained in this content is vast. With the power of the new faceted search facility in the most recent Application Express versions, today we launched the new Video section for AskTOM. Simply click on the Video tab to get access to the complete list of AskTOM Office Hours videos. The joy of faceted search in the APEX low-code environment is simple:

0% code, 100% results, 1000% user experience!

Enjoy!


Tuning Dynamically Generated SQL from PeopleSoft COBOL and Application Engine

When working on a performance issue, you may reach the point where you want to introduce one or more hints into a SQL statement.  It is often tempting to simply add the hints to the source code because doing so appears to be simpler.  That is often the case with Application Engine programs; however, it is much more difficult in PeopleSoft COBOL programs.

A strong argument against such a code change is that having made it, you also have to functionally test the change and push it through the entire release management process to get it into production.  Then, should you ever want to change or remove the hints, you have to go through the entire process again.

Oracle provides several mechanisms to define a database object containing a hint or set of hints and to apply them to matching SQL statements.  These mechanisms work best with SQL that uses bind variables rather than literals.  If an otherwise identical SQL statement has different literal values then it has a different SQL_ID and is treated by the database as a new statement.  SQL Plan Baselines, SQL Patches and SQL Profiles match to the specific text of a SQL statement.  Different literal values will prevent matching and defeat these mechanisms.  These techniques must still be tested and migrated in a controlled manner, but they have no functional impact and so only testing of performance is needed.

SQL Profiles can also perform forced matching, where statements that are similar except for literal values are matched.  However, note that they also require licensing of the Tuning Pack.
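
As a quick illustration of force matching (a sketch that can be run against any database where DBMS_SQLTUNE is available; the statement text is invented for the example):

SET SERVEROUTPUT ON
DECLARE
  l_sql CONSTANT VARCHAR2(100) := 'SELECT * FROM ps_ledger WHERE fiscal_year = ';
BEGIN
  dbms_output.put_line('exact 2020: '||DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql||'2020', FALSE));
  dbms_output.put_line('exact 2021: '||DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql||'2021', FALSE));
  dbms_output.put_line('force 2020: '||DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql||'2020', TRUE));
  dbms_output.put_line('force 2021: '||DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(l_sql||'2021', TRUE));
END;
/

The two exact signatures differ, but the two force-matched signatures are identical, which is why a single force-matched SQL Profile can cover both statements.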

Some parts of PeopleTools and the PeopleSoft applications are better than others at producing sharable SQL with bind variables.  Most of the SQL generated by the component processor uses bind variables.  In Application Engine, if the ReUse Statement property is not set, which it is not by default, the %BIND fields are substituted with their literal values in the SQL statement.  However, if the property is set then %BIND fields become bind variables.  Over the years much more PeopleSoft code has been delivered with this attribute enabled.  Doing so has significant performance benefits (see Performance Benefits of ReUse Statement Flag in Application Engine).  
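
To illustrate with a hypothetical Application Engine step (the table name and step SQL are invented for this example, not taken from a delivered program), suppose the step contains:

UPDATE PS_MY_TMP_TBL SET PROCESS_FLAG = 'Y' WHERE PROCESS_INSTANCE = %Bind(PROCESS_INSTANCE)

Without ReUse Statement, the %Bind is resolved to a literal, so every process instance produces a new SQL_ID:

UPDATE PS_MY_TMP_TBL SET PROCESS_FLAG = 'Y' WHERE PROCESS_INSTANCE = 0002858795

With ReUse Statement, the database sees a bind variable and the statement is shared between executions:

UPDATE PS_MY_TMP_TBL SET PROCESS_FLAG = 'Y' WHERE PROCESS_INSTANCE = :1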

Where, under normal circumstances, I might use a baseline or patch to inject a hint or profile of hints into a particular SQL statement (i.e. where I am dealing with a single SQL_ID), if the statement has literal values that change, then each statement has a different SQL_ID.  I have experimented with setting CURSOR_SHARING to FORCE at session-level for a specific scheduled process, but I have always had very poor experiences with that approach.  It invariably causes more problems than it solves.  Instead, I use force matched SQL Profiles.

The PeopleTools documentation sets out situations where ReUse Statement cannot be set.  This includes dynamically generated code where %BIND(…,NOQUOTES) is used to embed a piece of SQL held in a variable.  This is a very common technique in PeopleSoft; often dynamically generated code is driven by the application configuration.

We also see a similar design in PeopleSoft's COBOL programs.  Static statements are loaded from the stored SQL statements table (PS_SQLSTMT_TBL) and do use bind variables, but dynamic statements are assembled at runtime from static fragments in the COBOL code and any variable values are embedded as literals rather than using bind variables.

Forced matching will allow a SQL profile to match a statement that is the same except for different literal values.   However, dynamic SQL statements can change in ways that are beyond that, including:

  • Different instances of working storage tables can be used by different instances of the same process.
  • Different columns can be used in select and group by clauses.
  • Different criteria can be introduced. 
  • A different number of terms in an IN() condition.

Occasionally, and I really mean very occasionally, when I have exhausted other alternatives, I have dynamically created groups of SQL Profiles (still with forced matching) to cover every permutation of the variations of the dynamic SQL statement.

Example

Here is a dynamic statement from such a COBOL process, FSPCCURR.  This General Ledger Revaluation process adjusts the base currency value of the account balances by creating adjusting entries for the accounts being revalued. It creates corresponding entries for any gain or loss that results from the revaluation.  It was captured by AWR, and I have extracted it with DBMS_XPLAN.DISPLAY_AWR.
The ledger table (PS_LEDGER) is joined to a working storage table of tree selectors (PS_TREE_SEL10_R001) and working storage table (PS_CURR_WRK_RT001) and the result is put into another working storage table (PS_CURR_WRK_TBL001).
INSERT INTO PS_CURR_WRK_TBL001 (ACCOUNT,ACCOUNTING_PERIOD,AFFILIATE,AFFI
LIATE_INTRA1,AFFILIATE_INTRA2,ALTACCT,BOOK_CODE,BUDGET_REF,CHARTFIELD1,C
HARTFIELD2,CHARTFIELD3,CLASS_FLD,CURRENCY_CD,DATE_CODE,DEPTID,FISCAL_YEA
R,FUND_CODE,GL_ADJUST_TYPE,LEDGER,OPERATING_UNIT,POSTED_BASE_AMT,POSTED_
TOTAL_AMT,POSTED_TOTAL_CR,POSTED_TOTAL_DR,POSTED_TRAN_AMT,POSTED_TRAN_CR
,POSTED_TRAN_DR,PROCESS_INSTANCE,PRODUCT,PROGRAM_CODE,PROJECT_ID,STATIST
ICS_CODE,RATE_DIV,RATE_MULT,GAINLOSS_ACCT,RESTATED_AMT,REVAL_ADJUSTMENT,
TARGET_CURRENCY) SELECT A.ACCOUNT,003,AFFILIATE,' ',' ','
',A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIELD2,CHARTFIELD3,'
',A.CURRENCY_CD,DATE_CODE,DEPTID,2020,FUND_CODE,GL_ADJUST_TYPE,'X_UKCORE
',OPERATING_UNIT,SUM(A.POSTED_BASE_AMT),SUM(POSTED_TOTAL_AMT),0,0,0,0,0,
0002858795,PRODUCT,PROGRAM_CODE,PROJECT_ID,' ',R.RATE_DIV,R.RATE_MULT,'
',ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02
),ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02 )
- SUM(A.POSTED_BASE_AMT),'USD' FROM PS_LEDGER A , PS_TREE_SEL10_R001 B
, PS_CURR_WRK_RT001 R WHERE A.LEDGER='X_UKCORE' AND A.FISCAL_YEAR =
2020 AND A.ACCOUNTING_PERIOD IN ( 0, 1, 2, 3) AND B
.PROCESS_INSTANCE=0002858795 AND B .CHARTFIELD='ACCOUNT' AND
A.ACCOUNT>=B .RANGE_FROM_10 AND A.ACCOUNT<=B .RANGE_TO_10 AND
A.BUSINESS_UNIT='96013' AND A.CURRENCY_CD <> 'GBP' AND FROM_CUR =
A.CURRENCY_CD AND TO_CUR = 'GBP' AND R.PROCESS_INSTANCE = 0002858795
GROUP BY A.ACCOUNT,AFFILIATE,A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIE
LD2,CHARTFIELD3,A.CURRENCY_CD,DATE_CODE,DEPTID,FUND_CODE,GL_ADJUST_TYPE,
OPERATING_UNIT,PRODUCT,PROGRAM_CODE,PROJECT_ID,RATE_DIV,RATE_MULT

However, there are several copies of each of these working storage tables, and different concurrent instances of this process may be allocated different copies.

There is also an IN() condition that lists the accounting periods to be processed.
So the statement can vary. Here is another version of what is essentially the same statement, with different literal values and different tables, and for a different accounting period (this time period 5). The parts in bold are the ones that vary from statement to statement but are not literal values.
INSERT INTO PS_CURR_WRK_TBL001 (ACCOUNT,ACCOUNTING_PERIOD,AFFILIATE,AFFI
LIATE_INTRA1,AFFILIATE_INTRA2,ALTACCT,BOOK_CODE,BUDGET_REF,CHARTFIELD1,C
HARTFIELD2,CHARTFIELD3,CLASS_FLD,CURRENCY_CD,DATE_CODE,DEPTID,FISCAL_YEA
R,FUND_CODE,GL_ADJUST_TYPE,LEDGER,OPERATING_UNIT,POSTED_BASE_AMT,POSTED_
TOTAL_AMT,POSTED_TOTAL_CR,POSTED_TOTAL_DR,POSTED_TRAN_AMT,POSTED_TRAN_CR
,POSTED_TRAN_DR,PROCESS_INSTANCE,PRODUCT,PROGRAM_CODE,PROJECT_ID,STATIST
ICS_CODE,RATE_DIV,RATE_MULT,GAINLOSS_ACCT,RESTATED_AMT,REVAL_ADJUSTMENT,
TARGET_CURRENCY) SELECT A.ACCOUNT,005,AFFILIATE,' ',' ','
',A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIELD2,CHARTFIELD3,'
',A.CURRENCY_CD,DATE_CODE,DEPTID,2020,FUND_CODE,GL_ADJUST_TYPE,'XXXX',OP
ERATING_UNIT,SUM(A.POSTED_BASE_AMT),SUM(POSTED_TOTAL_AMT),0,0,0,0,0,0002
991789,PRODUCT,PROGRAM_CODE,PROJECT_ID,' ',R.RATE_DIV,R.RATE_MULT,'
',ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02
),ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02 )
- SUM(A.POSTED_BASE_AMT),'AUD' FROM PS_LEDGER A , PS_TREE_SEL10_R B ,
PS_CURR_WRK_RT002 R WHERE A.LEDGER='XXXX' AND A.FISCAL_YEAR = 2020 AND
A.ACCOUNTING_PERIOD IN ( 0, 1, 2, 3, 4, 5) AND B
.PROCESS_INSTANCE=0002991789 AND B .CHARTFIELD='ACCOUNT' AND
A.ACCOUNT>=B .RANGE_FROM_10 AND A.ACCOUNT<=B .RANGE_TO_10 AND
A.BUSINESS_UNIT='13051' AND A.CURRENCY_CD <> 'AUD' AND FROM_CUR =
A.CURRENCY_CD AND TO_CUR = 'AUD' AND R.PROCESS_INSTANCE = 0002991789
GROUP BY A.ACCOUNT,AFFILIATE,A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIE
LD2,CHARTFIELD3,A.CURRENCY_CD,DATE_CODE,DEPTID,FUND_CODE,GL_ADJUST_TYPE,
OPERATING_UNIT,PRODUCT,PROGRAM_CODE,PROJECT_ID,RATE_DIV,RATE_MULT

If I want to use SQL Profiles to introduce hints to control the execution plan, then I will need a different profile for every possible permutation.

I start by using Carlos Sierra's coe_xfr_sql_profile.sql script.  This is a part of Oracle's SQLTEXPLAIN (SQLT) tool. It generates a SQL script that generates a SQL profile to reproduce a given execution plan for a given SQL statement that was captured by AWR.
From there it is not a big jump to add a SQL statement that generates all the permutations of the variations in the SQL (other than bind variables) and to create a profile inside a loop.  The exact details will vary depending on the behaviour of the program.  However, in this particular example I need:
  • Different SQL profiles will be needed for each accounting period because there will be a different list of accounting periods in the IN() condition.  Subquery factors n and n1 produce a list of accounting periods.
WITH n AS (SELECT rownum n FROM dual CONNECT BY level<=12
), n1 AS (
SELECT n1.n period, LISTAGG(n2.n,', ') WITHIN GROUP (ORDER BY n2.n) periods
FROM n n1, n n2
WHERE n2.n <= n1.n
AND n1.n <= 12
GROUP BY n1.n
)
SELECT * FROM n1
/

PERIOD PERIODS
---------- ----------------------------------------
1 1
2 1, 2
3 1, 2, 3
4 1, 2, 3, 4
5 1, 2, 3, 4, 5
6 1, 2, 3, 4, 5, 6
7 1, 2, 3, 4, 5, 6, 7
8 1, 2, 3, 4, 5, 6, 7, 8
9 1, 2, 3, 4, 5, 6, 7, 8, 9
10 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
11 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
12 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
  • Lists of the various tables used for working storage can be queried from the PeopleSoft data dictionary, PSRECDEFN.  I can see that there are 5 versions of the current work table that the process can choose from.  Note that these are ordinary tables, so there are 5 different records in PSRECDEFN. 
SELECT TO_NUMBER(substr(recname,13)) seq
, decode(sqltablename,' ','PS_'||recname,sqltablename) table_name
FROM psrecdefn
WHERE recname like 'CURR_WRK_TBL%'
/

SEQ TABLE_NAME
---------- ------------------
PS_CURR_WRK_TBL
1 PS_CURR_WRK_TBL001
2 PS_CURR_WRK_TBL002
3 PS_CURR_WRK_TBL003
4 PS_CURR_WRK_TBL004

  • However, if I was working on a temporary record used in an Application Engine program, I would need to look up the number of instances of that record.  
    • The number of non-shared temporary record tables is the sum of all the instances defined on each Application Engine program to which the record is allocated, plus the number of global instances, up to a maximum of 99.  Instance 0 is the shared instance.  The number can be altered in development and the additional tables built by Application Designer.  This can require additional SQL Profiles to be built.
WITH n AS (
SELECT rownum-1 n FROM dual CONNECT BY LEVEL<=100
), c AS (
SELECT c.recname
, n.n instance
, DECODE(r.sqltablename, ' ', 'PS_'||r.recname,r.sqltablename)||DECODE(n.n,0,'',n.n) table_name
FROM n
, pstemptblcntvw c
, psrecdefn r
, psoptions o
WHERE r.recname = c.recname
AND n.n <= c.temptblinstances+o.temptblinstances
)
SELECT instance, table_name
FROM c
WHERE recname = 'WRK_XREF_CAL'
/

INSTANCE TABLE_NAME
---------- ----------------------------------------------------------
0 PS_WRK_XREF_CAL
1 PS_WRK_XREF_CAL1
2 PS_WRK_XREF_CAL2
3 PS_WRK_XREF_CAL3
4 PS_WRK_XREF_CAL4
5 PS_WRK_XREF_CAL5
6 PS_WRK_XREF_CAL6
  • In this particular example, I know that every permutation of all three tables could occur in every accounting period, so I simply Cartesian join all the subquery factors.
    • In other cases, only some permutations may occur.  This must be handled in the code that is written.  Literal values do not need to be considered because the profile will be created with force matching.
    • In Application Engine, although you often see the same instance of different temporary records used in the same process, there is nothing to prevent different instances of different records being used, and so all permutations must be considered.
  • I will also concatenate the ID for each table and the accounting period to produce an ID string that I can use in the name of the SQL profile.
'TS'||ts.seq||'_RT'||rt.seq||'_TBL'||wk.seq||'_'||n1.period
The profile generated by coe_xfr_sql_profile.sql will contain the complete profile of hints for the SQL statement captured by AWR.  That is officially the only way to guarantee a particular execution plan.  However, as in this example, I could specify just the hints that I want to introduce into the statement, effectively treating the SQL Profile as if it were a SQL Patch.
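
For comparison, this is roughly what the equivalent SQL Patch looks like on 12.2 and later, where DBMS_SQLDIAG.CREATE_SQL_PATCH is a documented public function (a sketch only; the statement text is abbreviated, and earlier releases used the undocumented DBMS_SQLDIAG_INTERNAL.I_CREATE_PATCH procedure instead):

DECLARE
  l_name VARCHAR2(128);
BEGIN
  l_name := DBMS_SQLDIAG.CREATE_SQL_PATCH(
    sql_text  => 'INSERT INTO PS_CURR_WRK_TBL001 (ACCOUNT,...',  -- the exact text of one statement
    hint_text => 'LEADING(@"SEL$1" "B"@"SEL$1" "A"@"SEL$1" "R"@"SEL$1")',
    name      => 'FSPCCURR_patch');
END;
/
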
Here is the modified script.  Note the sections in bold.
  • The SQL statement in the FOR clause returns all the permutations of the variations in the SQL statement in an implicit cursor.  
  • Table names are concatenated into the SQL text from the columns in the implicit cursor.
  • Single quotation marks are doubled so that the string contains the single quotation mark.
  • It is important not to add or remove any spaces when introducing these changes.
  • Profiles are dropped and created inside the loop.  Force Matching is enabled.
REM coe_xfr_sql_profile_FSPCCURR_tree_rate_curr_wrk.sql
SPO coe_xfr_sql_profile_FSPCCURR_tree_rate_curr_wrk.log;

WHENEVER SQLERROR CONTINUE
REM WHENEVER SQLERROR EXIT SQL.SQLCODE;
VAR signature NUMBER;

DECLARE
sql_txt CLOB;
h SYS.SQLPROF_ATTR;
e_no_sql_profile EXCEPTION;
PRAGMA EXCEPTION_INIT(e_no_sql_profile, -13833);
BEGIN
FOR i IN(
WITH n AS (SELECT rownum n FROM dual CONNECT BY level<=12
), n1 AS (
SELECT n1.n period, LISTAGG(n2.n,', ') WITHIN GROUP (ORDER BY n2.n) periods
FROM n n1, n n2
WHERE n2.n <= n1.n
GROUP BY n1.n
), ts AS (
SELECT TO_NUMBER(substr(recname,13)) seq
, decode(sqltablename,' ','PS_'||recname,sqltablename) table_name
FROM psrecdefn
WHERE recname like 'TREE_SEL10_R%'
), rt AS (
SELECT TO_NUMBER(substr(recname,12)) seq
, decode(sqltablename,' ','PS_'||recname,sqltablename) table_name
FROM psrecdefn
WHERE recname like 'CURR_WRK_RT%'
), wk AS (
SELECT TO_NUMBER(substr(recname,13)) seq
, decode(sqltablename,' ','PS_'||recname,sqltablename) table_name
FROM psrecdefn
WHERE recname like 'CURR_WRK_TBL%'
)
SELECT 'TS'||ts.seq||'_RT'||rt.seq||'_TBL'||wk.seq||'_'||n1.period id
, ts.table_name ts_table_name
, rt.table_name rt_table_name
, wk.table_name wk_table_name
, n1.period, n1.periods
FROM n1, ts, rt, wk
) LOOP
sql_txt := 'INSERT INTO '||i.wk_table_name||' (ACCOUNT,ACCOUNTING_PERIOD,AFFILIATE,AFFILIATE_INTRA1,AFFILIATE_INTRA2,ALTACCT,
BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIELD2,CHARTFIELD3,CLASS_FLD,CURRENCY_CD,DATE_CODE,DEPTID,FISCAL_YEAR,
FUND_CODE,GL_ADJUST_TYPE,LEDGER,OPERATING_UNIT,POSTED_BASE_AMT,POSTED_TOTAL_AMT,POSTED_TOTAL_CR,POSTED_TOTAL_DR,
POSTED_TRAN_AMT,POSTED_TRAN_CR,POSTED_TRAN_DR,PROCESS_INSTANCE,PRODUCT,PROGRAM_CODE,PROJECT_ID,STATISTICS_CODE,
RATE_DIV,RATE_MULT,GAINLOSS_ACCT,RESTATED_AMT,REVAL_ADJUSTMENT,TARGET_CURRENCY)
SELECT A.ACCOUNT,011,AFFILIATE,
'' '','' '','' '',A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIELD2,CHARTFIELD3,'' '',A.CURRENCY_CD,DATE_CODE,DEPTID,2016,
FUND_CODE,GL_ADJUST_TYPE,''X_UKCORE'',OPERATING_UNIT,SUM(A.POSTED_BASE_AMT),SUM(POSTED_TOTAL_AMT),0,0,0,0,0,
0001234567,PRODUCT,PROGRAM_CODE,PROJECT_ID,'' '',R.RATE_DIV,R.RATE_MULT,'' '',
ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02
),ROUND(SUM(((((A.POSTED_TOTAL_AMT) / (RATE_DIV))) * (RATE_MULT))),02 )
- SUM(A.POSTED_BASE_AMT),''GBP'' FROM PS_LEDGER A , '||i.ts_table_name||' B
, '||i.rt_table_name||' R WHERE A.LEDGER=''X_UKCORE'' AND A.FISCAL_YEAR =
2016 AND A.ACCOUNTING_PERIOD IN ( 0, '||i.periods||')
AND B .PROCESS_INSTANCE=0001234567 AND B .CHARTFIELD=''ACCOUNT'' AND
A.ACCOUNT>=B .RANGE_FROM_10 AND A.ACCOUNT<=B .RANGE_TO_10 AND
A.BUSINESS_UNIT=''12345'' AND A.CURRENCY_CD <> ''GBP''
AND FROM_CUR = A.CURRENCY_CD
AND TO_CUR = ''GBP''
AND R.PROCESS_INSTANCE = 0001234567
GROUP BY A.ACCOUNT,AFFILIATE,A.BOOK_CODE,BUDGET_REF,CHARTFIELD1,CHARTFIELD2,CHARTFIELD3,A.CURRENCY_CD,DATE_CODE,
DEPTID,FUND_CODE,GL_ADJUST_TYPE,OPERATING_UNIT,PRODUCT,PROGRAM_CODE,PROJECT_ID,RATE_DIV,RATE_MULT';
h := SYS.SQLPROF_ATTR(
q'[BEGIN_OUTLINE_DATA]',
q'[IGNORE_OPTIM_EMBEDDED_HINTS]',
q'[ALL_ROWS]',
q'[LEADING(@"SEL$1" "B"@"SEL$1" "A"@"SEL$1" "R"@"SEL$1")]',
q'[USE_NL(@"SEL$1" "A"@"SEL$1")]',
q'[SWAP_JOIN_INPUTS(@"SEL$1" "R"@"SEL$1")]',
q'[END_OUTLINE_DATA]');
:signature := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(sql_txt);
dbms_output.put_line(i.id||':'||:signature);

BEGIN
DBMS_SQLTUNE.drop_SQL_PROFILE (name => 'FSPCCURR_'||i.id);
EXCEPTION
WHEN e_no_sql_profile THEN NULL;
END;

IF 1=1 THEN
DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
sql_text => sql_txt,
profile => h,
name => 'FSPCCURR_'||i.id,
description => 'coe FSPCCURR '||i.id||' @ '||:signature||'',
category => 'DEFAULT',
validate => TRUE,
replace => TRUE,
force_match => TRUE /* TRUE:FORCE (match even when different literals in SQL).
FALSE:EXACT (similar to CURSOR_SHARING) */ );

END IF;

END LOOP;
END;
/

column name format a30
select name, status, sql_text
from dba_sql_profiles
where name like '%FSPCCURR%'
order by 1
/
SPO OFF;

When I implemented this particular example, it created 900 SQL profiles.  Each one has a different force matching signature.  Having this many SQL Profiles is not a problem for the database because they are looked up by the signature. 
TS1_RT4_TBL4_12:1278395827722096644
TS2_RT4_TBL4_12:4126874017730826984

PL/SQL procedure successfully completed.

However, managing that many profiles could become a cause for concern for the DBA.  Therefore, I think it is important that they have a sensible naming convention so that it is obvious to what they relate.
NAME                           STATUS   SQL_TEXT
------------------------------ -------- --------------------------------------------------------------------------------

FSPCCURR_TS1_RT4_TBL4_12 ENABLED INSERT INTO PS_CURR_WRK_TBL004 (ACCOUNT,ACCOUNTING_PERIOD,AFFILIATE,AFFILIATE_IN

FSPCCURR_TS2_RT4_TBL4_12 ENABLED INSERT INTO PS_CURR_WRK_TBL004 (ACCOUNT,ACCOUNTING_PERIOD,AFFILIATE,AFFILIATE_IN


900 rows selected.
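
A sensible prefix also makes housekeeping straightforward.  For example, the whole family of generated profiles can be removed again with a short anonymous block (a sketch, assuming the FSPCCURR_ naming convention used above):

BEGIN
  FOR i IN (SELECT name FROM dba_sql_profiles WHERE name LIKE 'FSPCCURR%') LOOP
    DBMS_SQLTUNE.DROP_SQL_PROFILE(name => i.name);
  END LOOP;
END;
/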

Further reading

Run With Us! Join the 2020 wwwp5K Movement

If you’re like us, you’re eager to send 2020 off to the dustbin of history. So grab your running/walking/yoga shoes and join us as we resurrect the historic #wwwp5K and celebrate reaching the 2020 finish line! As an added incentive and in the spirit of the season, we’ve also created a special wwwp5K Givz page, where participants can make a donation to three of our favorite charities: Black Girls Code, Internet Archive, and the WordPress Foundation. Automattic will match every dollar donated to any organization through the Givz page, up to $50,000.

What’s a 5K?

A 5K is the equivalent of about 3.1 miles. The virtual run will work on the honor system, but if you want to be accurate, apps like Strava, Garmin Connect, Runkeeper, Fitbit, and many others can help you measure the right distance.

Sounds awesome! How do I participate?

The virtual wwwp5K officially kicks off tomorrow, December 1, and will be open through December 31st. You can run, skip, walk, hop, walk backwards, or even swim the equivalent distance in an indoor pool — as long as you’re practicing appropriate safety precautions given local conditions and staying healthy, your activity counts.

Everyone is welcome! WordPress fans, friends, and family, as well as Automatticians around the world.

When you’re done, don’t forget to post a selfie on your WordPress site and tag it with “wwwp5k” so that we can share the love and others can read about your experience. Of course, you can also blog about your journey preparing for the wwwp5K, but most of all, we’d love to see your smiling face and happy shoes as you complete the 5K.

Is there swag?

What would a virtual run be without swag with a custom logo? To commemorate the 2020 run, we’ve created a limited edition technical shirt featuring the official wwwp5K Wapuu!

They’ll be available for purchase in the WordPress Swag Store starting tomorrow, while supplies last, so don’t forget to place your order.

Will you be joining us? Let us know in the comments!

In-row LOBs

If you’re like me there are probably lots of little things that you know but find terribly easy to forget. Here’s one of my bêtes noires – starting with a little quiz:

rem
rem     Script:         lob_in_row.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2020
rem     Purpose:
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem             11.2.0.4
rem

create table t1 (
        v1      varchar2(20),
        b1      clob,
        s1      clob
)
lob(b1) store as basicfile  b_lob,
lob(s1) store as securefile s_lob
;

insert into t1 values(
        rpad('x',20,'x'),
        rpad('x',20,'x'),
        rpad('x',20,'x')
)
/

commit;

execute dbms_stats.gather_table_stats(user,'t1');

select  column_name, avg_col_len
from    user_tab_columns
where   table_name = 'T1'
/

select  avg_row_len
from    user_tables
where   table_name = 'T1'
/

column rel_file_no new_value m_file_no
column block_no    new_value m_block_no

select
        dbms_rowid.rowid_relative_fno(rowid)    rel_file_no,
        dbms_rowid.rowid_block_number(rowid)    block_no
from
        t1
;

alter system flush buffer_cache;
alter system dump datafile &m_file_no block &m_block_no;

I’ve created a table with a varchar2() column, a basicfile CLOB, and a securefile CLOB, and inserted the same value into all three. By default the CLOB columns will be defined as ‘enable storage in row’ and since the inserted value is very short it will be stored “in row” for both these CLOBs. The question is:

Which column takes the most space in the data block?

To answer this question we need only gather stats and query user_tab_columns and/or user_tables – except the results may be a little surprising, so my script also uses the dbms_rowid package to find the file and block number where the row has been inserted, flushes the buffer cache to make sure that we don’t get confused by older versions of the block (you probably shouldn’t run this script on a big busy system) then dumps the block into the session trace file.

You’ll see why it’s necessary to look at the block dumps when I show you the results from gathering stats. The three sets of results below come from 11.2.0.4, 12.2.0.1, and 19.3.0.0 in that order. A key feature the databases have in common is that they were all created with a multi-byte character set:


11.2.0.4
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                           123
S1                           117 

AVG_ROW_LEN
-----------
        261

12.2.0.1
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                           171
S1                           165

AVG_ROW_LEN
-----------
        357

19.3.0.0
========
COLUMN_NAME          AVG_COL_LEN
-------------------- -----------
V1                            21
B1                            77
S1                            71

AVG_ROW_LEN
-----------
        169

There are three things that the three versions agree on.

  • First, the row length is the sum of the column lengths.
  • Secondly, the average length of the varchar2() column is 21 – remember that the avg_col_len includes the length byte(s) when we collect stats with the dbms_stats package.
  • Finally, the difference between the basicfile and securefile LOBs is 6.

But are the actual values reported for the LOB lengths in any way truthful, or are we seeing reporting errors (that vary with version)?

That’s why we need to look at the block dump, and this is what we find as the interesting bit of the dump – which is basically the same in all three cases:


block_row_dump:
tab 0, row 0, @0x1eec
tl: 172 fb: --H-FL-- lb: 0x1  cc: 3

col  0: [20]  78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78 78

col  1: [76]
 00 54 00 01 02 0c 80 00 00 02 00 00 00 01 00 02 75 f6 ff fb 00 38 09 00 00
 00 00 00 00 28 00 00 00 00 00 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78
 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00
 78

col  2: [70]
 00 54 00 01 02 0c 80 80 00 02 00 00 00 01 00 02 75 f6 ff fc 00 32 48 90 00
 2c 00 00 28 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78

end_of_block_dump

When Oracle 19c reported the avg_col_len (and maybe the same would be true of 18c) it was correctly reporting the space taken up in the row by the entries for the two LOBs (and their overheads) that  I had inserted.  The earlier versions of Oracle are doing something a little odd.

A key thing you’ll notice, of course, is that the varchar2() column is storing my input string as one byte per character, but because I used a multibyte characterset for my database characterset in all three cases Oracle has stored the LOB equivalents of the string using a two-byte fixed-width characterset that has effectively doubled the storage (and then has to include the “Lob Locator” overhead – hence the 20 character string turning into 70+ bytes).
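
A quick bit of arithmetic ties the block dump to the 19c statistics (using the locator overheads that Footnote 2 below reports for an empty_clob(): roughly 36 bytes for a basicfile and 30 for a securefile):

20 characters x 2 bytes per character = 40 bytes of in-row LOB data
40 + 36 bytes of basicfile locator/inode overhead    = 76 bytes (col 1 in the block dump)
40 + 30 bytes of securefile locator/header overhead  = 70 bytes (col 2 in the block dump)

Add the length byte that avg_col_len includes and you get the 77 and 71 reported by 19c.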

This highlights two important issues. First, trying to work out how to “tune” your table storage based on the avg_col_len / avg_row_len is a lot harder to do if you’ve got LOB columns in the table, and your method of estimating storage will have to change to suit the version of Oracle. Secondly, and much more importantly, if you’re thinking of changing a varchar2() column (such as “description” or “comments”) from a character data type to a CLOB to allow for a small number of cases where people want to supply longer bits of text, then if you’re using a multibyte character set there are two awful consequences:

  • the storage requirement for the column will more than double
  • the LOB will go out of line when the in-row storage requirement exceeds 4,000 bytes – which means when the original string content exceeds something between 1,938 and 1,985 characters depending on your version of Oracle and whether you are using basicfile or securefile LOBs.  (And a “long” input string will exceed a single block in the LOB segment when it goes over roughly 4,000 characters.)

So if you want to handle “long comments” in multibyte charactersets you might want to consider making your code more complex so that up to 4,000 bytes you store the data as a varchar2(), and only use a CLOB column when the data goes over that limit.
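
One possible shape of such a design (a sketch only, with invented table and column names): keep a varchar2(4000) for the common case, spill into a CLOB only when the text is too long, and hide the split behind a view.

create table t2 (
        id        number primary key,
        short_txt varchar2(4000),   -- populated when the text fits in 4,000 bytes
        long_txt  clob              -- populated only when it does not
);

create or replace view t2_v as
select  id, nvl(to_clob(short_txt), long_txt) txt
from    t2
;

The application decides which column to populate on insert or update; readers query only the view.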

Footnote

If you’re wondering  what all the overheads are for the in-row LOB you’ll find that the 11g block dump gives you some extra details (though the extra details disappeared from my 12c and 19c block dumps).  Here’s the content that appeared immediately after the col1 and col2 data respectively in the 11g block dump.

LOB
Locator:
  Length:        84(76)
  Version:        1
  Byte Length:    2
  LobID: 00.00.00.01.00.02.75.f6.ff.fb
  Flags[ 0x02 0x0c 0x80 0x00 ]:
    Type: CLOB
    Storage: BasicFile
    Enable Storage in Row
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: VaringWidthReadWrite
  Inode:
    Size:     56
    Flag:     0x09 [ Valid DataInRow ]
    Future:   0x00 (should be '0x00')
    Blocks:   0
    Bytes:    40
    Version:  00000.0000000001
    Inline data[40]
Dump of memory from 0x00007FB4244BED8D to 0x00007FB4244BEDB5
7FB4244BED80                            00780001              [..x.]
7FB4244BED90 00780078 00780078 00780078 00780078  [x.x.x.x.x.x.x.x.]
        Repeat 1 times
7FB4244BEDB0 00780078 54004678                    [x.x.xF.T]



LOB
Locator:
  Length:        84(70)
  Version:        1
  Byte Length:    2
  LobID: 00.00.00.01.00.02.75.f6.ff.fc
  Flags[ 0x02 0x0c 0x80 0x80 ]:
    Type: CLOB
    Storage: SecureFile
    Characterset Format: IMPLICIT
    Partitioned Table: No
    Options: VaringWidthReadWrite
  SecureFile Header:
    Length:   50
    Old Flag: 0x48 [ DataInRow SecureFile ]
    Flag 0:   0x90 [ INODE Valid ]
    Layers:
      Lengths Array: INODE:44
      INODE:
        00 00 28 01 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
        00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78 00 78
        00 78 00 78

Footnote 2

Oracle allows for an empty_clob() to populate a CLOB column with a zero length CLOB, and an empty_clob() is not the same as a null.

Many years ago, when I wrote “Practical Oracle 8i” I pointed out that if you were using a LOB column it would be better to use empty_[bc]lob() to represent an absence of data to avoid the need for checking the column in two different ways for “no data”.

There are two details to remember about this, though. First, empty_clob() takes 36 bytes of storage for a basicfile and 30 bytes for a securefile (reported at 37/31 in 19c, 135/131 in 12cR2 and 103/97 in 11gR2). You may decide that’s an expensive way of saying “nothing to see here”.

Secondly, while PL/SQL will allow you to compare a PL/SQL clob variable with empty_clob() you cannot use a simple “column = empty_clob()” predicate in SQL and will need something like:

where length(s1) = 0
where dbms_lob.getlength(s1) = 0
where dbms_lob.compare(s1, empty_clob()) = 0

As a final closing comment about the oddity of avg_col_len and LOBs – my first test to check the lengths reported for an empty_clob() also inserted NULL into the v1 column. In this special case (all the columns in the row were empty_clob() or null) 11g reported the avg_col_len of s1 and b1 as zero!

DynamoDB Scan (and why 128.5 RCU?)

By Franck Pachot

.
In the previous post I described the PartiQL SELECT for DynamoDB and mentioned that a SELECT without a WHERE clause on the partition key may result in a Scan, but the result is automatically paginated. This pagination, and the cost of a Scan, is something that may not be very clear from the documentation, and I’ll show it here on the regular DynamoDB API. Because it is not very clear, I think, many people in the AWS community fear that, with this new PartiQL API, there is a risk of full scanning tables and consuming expensive RCUs. I was also misled, when I started to look at DynamoDB, by the AWS CLI “--no-paginate” option, as well as its “Consumed Capacity” always showing 128.5 even for very large scans. So those examples should, hopefully, clear up some doubts.

I have created a HASH/RANGE partitioned table and filled it with a few thousand items:


aws dynamodb create-table --attribute-definitions \
  AttributeName=MyKeyPart,AttributeType=N \
  AttributeName=MyKeySort,AttributeType=N \
 --key-schema \
  AttributeName=MyKeyPart,KeyType=HASH \
  AttributeName=MyKeySort,KeyType=RANGE \
 --billing-mode PROVISIONED \
 --provisioned-throughput ReadCapacityUnits=25,WriteCapacityUnits=25 \
 --table-name Demo

for i in {1..5000} ; do aws dynamodb put-item --table-name Demo --item '{"MyKeyPart":{"N":"'$(( $RANDOM /1000 ))'"},"MyKeySort":{"N":"'$SECONDS'"},"MyUnstructuredData":{"S":"'$(printf %-1000s | tr ' ' x)'"}}' ; done

Here is what those items look like:

I have created large items with a 1000 bytes “MyUnstructuredData” attribute. According to https://zaccharles.github.io/dynamodb-calculator/ an item size is around 1042 bytes. And that’s exactly the size I see here (5209783/5000=1041.96) from the console “Items summary” (I waited a few hours to get it updated in the screenshot above). This means that around 1000 items can fit on a 1MB page. We will see why I’m mentioning 1MB here: the title says 128.5 RCU and that’s the consumed capacity when reading 1MB with eventual consistency (0.5 RCU per 4KB read is 128 RCU per 1MB). Basically, this post will try to explain why we see a 128.5 consumed capacity at maximum when scanning any large table:


[opc@a aws]$ aws dynamodb scan --table-name Demo --select=COUNT --return-consumed-capacity TOTAL\
 --no-consistent-read --output table

----------------------------------
|              Scan              |
+----------+---------------------+
|   Count  |    ScannedCount     |
+----------+---------------------+
|  5000    |  5000               |
+----------+---------------------+
||       ConsumedCapacity       ||
|+----------------+-------------+|
||  CapacityUnits |  TableName  ||
|+----------------+-------------+|
||  128.5         |  Demo       ||
|+----------------+-------------+|
[opc@a aws]$
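
The 128 part of that number is simply the arithmetic for eventually consistent reads of a full 1MB page:

1 MB / 4 KB per read unit                        = 256 read units
256 read units x 0.5 RCU (eventually consistent) = 128 RCU per 1 MB scanned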

TL;DR: this number is wrong

Video : Invisible Indexes in Oracle Database 11g Onward

In today’s video we’ll discuss Invisible Indexes, introduced in Oracle 11g.

The video is based on this article.

The star of today’s video is Chris Muir, who was instrumental in me becoming a presenter. He invited me down to Australia to speak at a couple of AUSOUG events, which were the first proper conferences I ever presented at.

Cheers

Tim…

The post Video : Invisible Indexes in Oracle Database 11g Onward first appeared on The ORACLE-BASE Blog.



Cross-cloud PMM: which TCP ports to open

By Franck Pachot

.
I recently installed Percona Monitoring & Management on AWS (free tier) and here is how to monitor an instance on another cloud (OCI), in order to show which TCP port must be opened.

PMM server

I installed PMM from the AWS Marketplace, following those instructions: https://www.percona.com/doc/percona-monitoring-and-management/deploy/server/ami.html. I’ll not reproduce the instructions, just some screenshots I took during the install:
I have opened the HTTPS port in order to access the console, and also configure the clients which will also connect through HTTPS (but I’m not using a signed certificate).
Once installed, two targets are visible: the PMM server host (Linux) and database (PostgreSQL):
Note that I didn’t secure HTTPS here and I’ll have to accept insecure SSL.

PMM client

I’ll monitor an Autonomous Linux instance that I have on Oracle Cloud (Free Tier). Autonomous Linux is based on OEL, which is based on RHEL (see https://blog.dbi-services.com/al7/) and is called “autonomous” because it updates the kernel without the need to reboot. I install the PMM Client RPM:


[opc@al ~]$ sudo yum -y install https://repo.percona.com/yum/percona-release-latest.noarch.rpm


Loaded plugins: langpacks
percona-release-latest.noarch.rpm                                                                        |  19 kB  00:00:00
Examining /var/tmp/yum-root-YgvokG/percona-release-latest.noarch.rpm: percona-release-1.0-25.noarch
Marking /var/tmp/yum-root-YgvokG/percona-release-latest.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package percona-release.noarch 0:1.0-25 will be installed
--> Finished Dependency Resolution
al7/x86_64                                                                                               | 2.8 kB  00:00:00
al7/x86_64/primary_db                                                                                    |  21 MB  00:00:00
epel-apache-maven/7Server/x86_64                                                                         | 3.3 kB  00:00:00
ol7_UEKR5/x86_64                                                                                         | 2.5 kB  00:00:00
ol7_latest/x86_64                                                                                        | 2.7 kB  00:00:00
ol7_x86_64_userspace_ksplice                                                                             | 2.8 kB  00:00:00

Dependencies Resolved

================================================================================================================================
 Package                        Arch                  Version               Repository                                     Size
================================================================================================================================
Installing:
 percona-release                noarch                1.0-25                /percona-release-latest.noarch                 31 k

Transaction Summary
================================================================================================================================
Install  1 Package

Total size: 31 k
Installed size: 31 k
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded

Running transaction
  Installing : percona-release-1.0-25.noarch                                                                                1/1
* Enabling the Percona Original repository
 All done!
* Enabling the Percona Release repository
 All done!
The percona-release package now contains a percona-release script that can enable additional repositories for our newer products.

For example, to enable the Percona Server 8.0 repository use:

  percona-release setup ps80

Note: To avoid conflicts with older product versions, the percona-release setup command may disable our original repository for some products.

For more information, please visit:
  https://www.percona.com/doc/percona-repo-config/percona-release.html

  Verifying  : percona-release-1.0-25.noarch                                                                                1/1

Installed:
  percona-release.noarch 0:1.0-25


This package helps to enable additional repositories. Here, I need the PMM 2 Client:


[opc@al ~]$ sudo percona-release enable pmm2-client


* Enabling the PMM2 Client repository
 All done!

Once enabled, it is easy to install it with YUM:


[opc@al ~]$ sudo yum install -y pmm2-client


Loaded plugins: langpacks
percona-release-noarch                                                                                   | 2.9 kB  00:00:00
percona-release-x86_64                                                                                   | 2.9 kB  00:00:00
pmm2-client-release-x86_64                                                                               | 2.9 kB  00:00:00
prel-release-noarch                                                                                      | 2.9 kB  00:00:00
(1/4): percona-release-noarch/7Server/primary_db                                                         |  24 kB  00:00:00
(2/4): pmm2-client-release-x86_64/7Server/primary_db                                                     | 3.5 kB  00:00:00
(3/4): prel-release-noarch/7Server/primary_db                                                            | 2.5 kB  00:00:00
(4/4): percona-release-x86_64/7Server/primary_db                                                         | 1.2 MB  00:00:00
Resolving Dependencies
--> Running transaction check
---> Package pmm2-client.x86_64 0:2.11.1-6.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================================================================
 Package                     Arch                   Version                        Repository                              Size
================================================================================================================================
Installing:
 pmm2-client                 x86_64                 2.11.1-6.el7                   percona-release-x86_64                  42 M

Transaction Summary
================================================================================================================================
Install  1 Package

Total download size: 42 M
Installed size: 42 M
Downloading packages:
warning: /var/cache/yum/x86_64/7Server/percona-release-x86_64/packages/pmm2-client-2.11.1-6.el7.x86_64.rpm: Header V4 RSA/SHA256
 Signature, key ID 8507efa5: NOKEY
Public key for pmm2-client-2.11.1-6.el7.x86_64.rpm is not installed
pmm2-client-2.11.1-6.el7.x86_64.rpm                                                                      |  42 MB  00:00:07
Retrieving key from file:///etc/pki/rpm-gpg/PERCONA-PACKAGING-KEY
Importing GPG key 0x8507EFA5:
 Userid     : "Percona MySQL Development Team (Packaging key) "
 Fingerprint: 4d1b b29d 63d9 8e42 2b21 13b1 9334 a25f 8507 efa5
 Package    : percona-release-1.0-25.noarch (@/percona-release-latest.noarch)
 From       : /etc/pki/rpm-gpg/PERCONA-PACKAGING-KEY
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : pmm2-client-2.11.1-6.el7.x86_64                                                                              1/1
  Verifying  : pmm2-client-2.11.1-6.el7.x86_64                                                                              1/1

Installed:
  pmm2-client.x86_64 0:2.11.1-6.el7

Complete!

That’s all for software installation. I just need to configure the agent to connect to the PMM Server:


[opc@al ~]$ sudo pmm-admin config --server-url https://admin:secretpassword@18.194.119.174 --server-insecure-tls $(curl ident.me) generic OPC-$(hostname)


Checking local pmm-agent status...
pmm-agent is running.
Registering pmm-agent on PMM Server...
Registered.
Configuration file /usr/local/percona/pmm2/config/pmm-agent.yaml updated.
Reloading pmm-agent configuration...
Configuration reloaded.
Checking local pmm-agent status...
pmm-agent is running.

As you can see, I use the “ident.me” web service to identify my IP address, but you probably know your public IP.

This configuration goes to a file, which you should protect because it contains the password in clear text:


[opc@al ~]$ ls -l /usr/local/percona/pmm2/config/pmm-agent.yaml
-rw-r-----. 1 pmm-agent pmm-agent 805 Nov 28 20:22 /usr/local/percona/pmm2/config/pmm-agent.yaml

# Updated by `pmm-agent setup`.
---
id: /agent_id/853027e6-563e-42b8-a417-f144541358ff
listen-port: 7777
server:
  address: 18.194.119.174:443
  username: admin
  password: secretpassword
  insecure-tls: true
paths:
  exporters_base: /usr/local/percona/pmm2/exporters
  node_exporter: /usr/local/percona/pmm2/exporters/node_exporter
  mysqld_exporter: /usr/local/percona/pmm2/exporters/mysqld_exporter
  mongodb_exporter: /usr/local/percona/pmm2/exporters/mongodb_exporter
  postgres_exporter: /usr/local/percona/pmm2/exporters/postgres_exporter
  proxysql_exporter: /usr/local/percona/pmm2/exporters/proxysql_exporter
  rds_exporter: /usr/local/percona/pmm2/exporters/rds_exporter
  tempdir: /tmp
  pt_summary: /usr/local/percona/pmm2/tools/pt-summary
ports:
  min: 42000
  max: 51999
debug: false
trace: false

What is interesting here is the port that is used for the server to connect to pull metrics from the client: 42000

I’ll need to open this port. I can see the error from the PMM server: https://18.194.119.174/prometheus/targets


I open this port on the host:


[opc@al ~]$ sudo iptables -I INPUT 5 -i ens3 -p tcp --dport 42000 -m state --state NEW,ESTABLISHED -j ACCEPT

and on the ingress rules as well:

Testing

I’m running two processes here to test whether I get the right metrics:


[opc@al ~]$ while true ; do sudo dd bs=100M count=1                if=$(df -Th | sort -rhk3 | awk '/^[/]dev/{print $1;exit}') of=/dev/null ; done &
[opc@al ~]$ while true ; do sudo dd bs=100M count=10G iflag=direct if=$(df -Th | sort -rhk3 | awk '/^[/]dev/{print $1;exit}') of=/dev/null ; done &

The latter will do mostly I/O, as it reads with O_DIRECT, and the former mainly system CPU, as it reads from the filesystem cache.

Here is the Grafana dashboard from PMM:
I see my two processes, and 80% of CPU stolen by the hypervisor as I’m running on the Free Tier here which provides 1/8th of OCPU.

If you have MySQL or PostgreSQL databases there, they can easily be monitored (“pmm-admin add mysql” or “pmm-admin add postgresql”; you can see all that in Elisa Usai’s demo: https://youtu.be/VgOR_GCUpVw?t=1558).

Last test, let’s see what happens if the monitored host reboots:


[opc@al ~]$ date
Sat Nov 28 22:54:36 CET 2020
[opc@al ~]$ uptrack-uname -a
Linux al 4.14.35-2025.402.2.1.el7uek.x86_64 #2 SMP Fri Oct 23 22:27:16 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux
[opc@al ~]$ uname -a
Linux al 4.14.35-1902.301.1.el7uek.x86_64 #2 SMP Tue Mar 31 16:50:32 PDT 2020 x86_64 x86_64 x86_64 GNU/Linux
[opc@al ~]$ sudo systemctl reboot
Connection to 130.61.159.88 closed by remote host.
Connection to 130.61.159.88 closed.

Yes… I do not reboot it frequently because it is Autonomous Linux and the Effective kernel is up to date (latest patches from October) even if the last restart was in March. But this deserves a test.

The first interesting thing is that PMM seems to keep the last read metrics for a while:
The host was shut down at 22:55 and it shows the last metrics for 5 minutes before stopping.

I had to wait for a while because my Availability Domain was out of capacity for the free tier:
https://twitter.com/FranckPachot/status/1332817607167250433?s=20


[opc@al ~]$ systemctl status pmm-agent.service
● pmm-agent.service - pmm-agent
   Loaded: loaded (/usr/lib/systemd/system/pmm-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-11-28 23:40:10 UTC; 1min 15s ago
 Main PID: 46446 (pmm-agent)
   CGroup: /system.slice/pmm-agent.service
           ├─46446 /usr/sbin/pmm-agent --config-file=/usr/local/percona/pmm2/config/pmm-agent.yaml
           └─46453 /usr/local/percona/pmm2/exporters/node_exporter --collector.bonding --collector.buddyinfo --collector.cpu ...

No problem, the installation of PMM client has defined the agent to restart on reboot.

In summary, PMM pulls the metrics from the exporter, so you need to open inbound ports on the host where the PMM client agent runs, and HTTPS on the PMM server. Then everything is straightforward.

This article, Cross-cloud PMM: which TCP ports to open, first appeared on Blog dbi services.

Measuring Java JVM thread activity in Linux with task state sampling

There are plenty of JVM profilers out there, but before attaching with a profiler, you can get a high-level overview by just sampling the Linux /proc/PID/comm and /proc/PID/stat fields to see which JVM threads are most active and whether they’re burning CPU or are blocked by some I/O.
When creating threads in Java, you can name them using the following syntax: Thread t = new Thread("MyThread") or thread.setName("Worker-123"). This name will show up in the comm field in Linux /proc.


Retrofitting Partitioning into Existing Applications: Example 1. General Ledger

This post is part of a series about the partitioning of database objects.

If you were designing an application to use partitioning, you would write the code to reference the column by which the data was partitioned so that the database does partition elimination.  However, with a pre-existing or 3rd party application you have to look at how the application queries the data and match the partitioning to that.
I am going to look at a number of cases from real-life, and discuss the thought process behind partitioning decisions.  These examples happen to come from PeopleSoft ERP systems, but that does not make them unusual.  PeopleSoft is just another packaged application.  In every case, it is necessary to have some application knowledge when deciding whether and how to introduce partitioning.

General Ledger

GL is an example of where OLTP and DW activities clash on the same table.  GL is a data warehouse of transactional information about a business.  The rationale for partitioning ledger data is a very typical example of partitioning for SQL query performance.
  • Dimensions: BUSINESS_UNIT, LEDGER, ACCOUNT, DEPTID, OPERATING_UNIT, PRODUCT, AFFILIATE, CHARTFIELD1/2/3, PROJECT_ID, BOOK_CODE, FISCAL_YEAR/ACCOUNTING_PERIOD, CURRENCY_CD/BASE_CURRENCY, and others.
  • Attributes: POSTED_TOTAL_AMT, POSTED_BASE_AMT, POSTED_TRAN_AMT.

You can think of it as a star-schema.  The ledger table is the fact table.  Dimensions are generated from standing data in the application. The reports typically slice and dice that data by time, and various dimensions.  The exact dimensions vary from business to business, and from time to time. 

In PeopleSoft, you can optionally configure summary ledger tables that are pre-aggregations of ledger data by a limited set of dimensions.  These are generated by batch processes.  However, it is not a commonly used feature, as it introduces latency: a change cannot be reported from the summary ledgers until the refresh process has run.
Business transactions post continuously to the ledger.  Meanwhile, the accountants also want to query ledger data.  Especially at month-end, they want to post adjustments and see the consequences immediately.
Here is a typical query from the PeopleSoft GL Reporting tool (nVision).  The queries vary widely, but some elements (in bold) are always present.
SELECT L.TREE_NODE_NUM,L2.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT)
FROM PS_LEDGER A
, PSTREESELECT05 L1
, PSTREESELECT10 L
, PSTREESELECT10 L2
WHERE A.LEDGER='ACTUALS'
AND A.FISCAL_YEAR=2020
AND A.ACCOUNTING_PERIOD BETWEEN 1 AND 11
AND L1.SELECTOR_NUM=30982 AND A.BUSINESS_UNIT=L1.RANGE_FROM_05
AND L.SELECTOR_NUM=30985 AND A.CHARTFIELD1=L.RANGE_FROM_10
AND L2.SELECTOR_NUM=30984 AND A.ACCOUNT=L2.RANGE_FROM_10
AND A.CURRENCY_CD='GBP'
GROUP BY L.TREE_NODE_NUM,L2.TREE_NODE_NUM
  • Queries are always on a particular ledger or group of ledgers.
    • You can have different ledgers for different accounting standards or reporting requirements.
    • Sometimes you can have adjustment ledgers – that are usually much smaller than the actuals ledgers – and they are aggregated with the main ledger.
    • In the latest version of the application, the budget ledger can be stored in the same table rather than a separate table.  Budget data has a different shape to actuals data and is created up to a year earlier.  It is generally much smaller and has a different usage profile.
    • So, there is always an equality or IN-list criterion on LEDGER.
  • Queries are always for a particular fiscal year.  This year, last year, sometimes the year before.  Therefore, there is always an equality criterion on FISCAL_YEAR.
  • Queries may be for a particular period, in which case there is a single-period equality criterion.  Alternatively, they are for the year-to-date, in which case there is a BETWEEN 1 AND current period criterion.  Sometimes for a particular quarter.  It is common to see queries on the same year-to-date period in the previous fiscal year.
  • Queries always specify the reporting currency.  Therefore, there is always a criterion on CURRENCY_CD, although many multi-national customers only have single currency ledgers, so the criterion may not be selective.
  • There will be varying criteria on the other dimension columns of the ledger table, applied by joining to the PSTREESELECT dimension tables.

What should I partition by?

We have seen the shape of the SQL, and we know which columns are candidate partitioning keys because we have seen which columns have criteria.  LEDGER is a candidate.  I also profile the data volumes.  The following is a typical example.
                                Cum.
LEDGER          NUM_ROWS      %      %
---------- ------------- ------ ------
XXXXCORE     759,496,900   43.9   43.9
CORE         533,320,425   30.8   74.7
XXXXGAAP     152,563,325    8.8   83.5
GAAP_ADJ      74,371,775    4.3   87.8
ZZZZ_CORE     34,251,514    2.0   89.8
C_XXCORE      29,569,381    1.7   91.5
…
           -------------
sum        1,731,153,467
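Figures like these can be produced with a simple profiling query.  The following is only a sketch (here grouping by LEDGER; the same pattern works for FISCAL_YEAR and ACCOUNTING_PERIOD), and on a table of this size you might prefer to rely on optimizer statistics or a sample rather than a full count:
SELECT ledger
,      COUNT(*) num_rows
,      ROUND(100*RATIO_TO_REPORT(COUNT(*)) OVER (),1) pct
,      ROUND(100*SUM(COUNT(*)) OVER (ORDER BY COUNT(*) DESC)
            /SUM(COUNT(*)) OVER (),1) cum_pct
FROM   ps_ledger
GROUP BY ledger
ORDER BY num_rows DESC;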
FISCAL_YEAR is an obvious choice.  
    Fiscal
      Year      NUM_ROWS      %
---------- ------------- ------
      2016           121
      2017            32
      2018   510,168,673   29.5
      2019   574,615,980   33.2
      2020   646,336,579   37.3
      2021        32,082
           -------------
sum        1,731,153,467
Most companies have monthly accounting periods (although some use other frequencies).  Then we have 12 accounting periods, plus brought forward (0), carry forward (998), and adjustments (999).
    Fiscal Accounting                   Cum.
      Year     Period   NUM_ROWS      %      %
---------- ---------- ---------- ------ ------
      2020          0   66237947    3.8   37.3
                    1   42865339    2.5   33.5
                    2   47042492    2.7   31.0
                    3   53680915    3.1   28.3
                    4   50113011    2.9   25.2
                    5   44700409    2.6   22.3
                    6   54983221    3.2   19.7
                    7   51982401    3.0   16.6
                    8   44851506    2.6   13.6
                    9   56528783    3.3   11.0
                   10   52266343    3.0    7.7
                   11   70541810    4.1    4.7
                   12   10542380     .6     .6
                  999         22     .0     .0
           ********** ----------
           sum         646336579
CURRENCY_CD is usually not a candidate because most companies report in a single currency, so all the rows have the same value.  Even where there are multiple currencies, each ledger is held in a particular currency, so it is usually more effective to partition by LEDGER.
It is very tempting to interval partition on FISCAL_YEAR and then range or list sub-partition on ACCOUNTING_PERIOD into 14 sub-partitions for each year.  Oracle will then automatically add a new range partition for each FISCAL_YEAR.
CREATE TABLE ps_ledger (...)
PARTITION BY RANGE (fiscal_year) INTERVAL (1)
SUBPARTITION BY RANGE (accounting_period)
SUBPARTITION TEMPLATE
(SUBPARTITION p00 VALUES LESS THAN (1)
,SUBPARTITION p01 VALUES LESS THAN (2)
...
,SUBPARTITION p12 VALUES LESS THAN (13)
,SUBPARTITION pxx VALUES LESS THAN (MAXVALUE))
(PARTITION VALUES LESS THAN (2019));
However, I would usually counsel against this.  You can only partition in two dimensions, and LEDGER is a very attractive option for one of them.  Instead, I would consider partitioning in one dimension on the combination of two columns: a range partition on FISCAL_YEAR and ACCOUNTING_PERIOD together.  However, if you also wish to introduce materialized views, then bear in mind that Partition Change Tracking does not work with multi-column partitioning (see also Partition Change Tracking During Materialized View Refresh and Query Rewrite).
CREATE TABLE ps_ledger (...)
PARTITION BY RANGE (fiscal_year,accounting_period)
(PARTITION ledger_2017 VALUES LESS THAN (2018,0)
,PARTITION ledger_2018_bf VALUES LESS THAN (2018,1)
,PARTITION ledger_2018_p01 VALUES LESS THAN (2018,2)
...
,PARTITION ledger_2021_cf VALUES LESS THAN (2022,0)
);
  • The application never uses ACCOUNTING_PERIOD without also using FISCAL_YEAR.  Sometimes it uses FISCAL_YEAR without ACCOUNTING_PERIOD.
  • Partition elimination does work with multi-column partitions.
    • If you specify a criterion on FISCAL_YEAR only, you will still get partition elimination (see the sketch after this list).
    • If you specify a criterion on ACCOUNTING_PERIOD only, you will not get partition elimination.
  • You cannot interval partition on multiple columns.  Therefore, you have to manage the annual addition of new partitions yourself.
  • Also, you cannot get partition change tracking for materialized view refresh to work with multi-column partitioning.
  • This leaves sub-partitioning to be used on a different column.
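As a quick check (only a sketch, using the table and columns above), you can confirm that the multi-column scheme prunes as expected by looking at the Pstart/Pstop columns of the execution plan for a typical year-to-date query:
EXPLAIN PLAN FOR
SELECT SUM(posted_total_amt)
FROM   ps_ledger
WHERE  ledger = 'ACTUALS'
AND    fiscal_year = 2020
AND    accounting_period BETWEEN 1 AND 11;

SELECT * FROM TABLE(dbms_xplan.display);
The Pstart/Pstop columns should show a contiguous range of (FISCAL_YEAR, ACCOUNTING_PERIOD) partitions rather than a scan of all partitions.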

Should I create a MAXVALUE partition?

  • I deliberately haven't specified a MAXVALUE partition.  There are arguments for and against this.
    • The argument against a MAXVALUE partition is that you might forget to add the new partitions for the new year, in which case all the data for the next fiscal year goes into the same partition, and over time the performance of the reports gradually decays.  By the time the performance issue is diagnosed, several months of data may have piled up.  Then you have to split that partition into several partitions (or exchange it out, add the new partitions, and reinsert the data).  So not having a MAXVALUE partition forces the annual maintenance activity to be put in the diary (it is sketched after this list); otherwise, the application will error when it tries to insert data for a FISCAL_YEAR for which there is currently no partition.
      • Now that budget data is kept in the LEDGER table, you have to do this before the budget ledger data is created, which is up to a year ahead of the actuals data, so the risk of business interruption is minimal.
    • The argument in favour of a MAXVALUE partition is that it prevents that error from occurring, but you risk forgetting or deferring the maintenance for operational reasons.
    • Of course, a MAXVALUE partition can be added at any time!
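The annual maintenance itself is straightforward.  The statements below are only a sketch, with illustrative partition names following the convention above; adjust boundaries and names to suit:
-- Without a MAXVALUE partition: add next year's partitions before any data arrives
ALTER TABLE ps_ledger ADD PARTITION ledger_2022_bf  VALUES LESS THAN (2022,1);
ALTER TABLE ps_ledger ADD PARTITION ledger_2022_p01 VALUES LESS THAN (2022,2);
-- ...and so on for the remaining periods and the carry-forward partition

-- With a MAXVALUE partition that has already accumulated data: split it instead
ALTER TABLE ps_ledger SPLIT PARTITION ledger_maxvalue AT (2022,1)
INTO (PARTITION ledger_2022_bf, PARTITION ledger_maxvalue)
UPDATE INDEXES;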

Should I Sub-partition?

It depends on the data.
  • The ledger table is a big table, and the LEDGER column is usually a selective, low-cardinality column, so it is a good candidate for sub-partitioning: a single-value list sub-partition for each of the largest actuals and budget ledgers, and a default sub-partition for all other values.
  • This is not the case for summary ledger tables, which are usually built on a single ledger.  They are usually range partitioned on FISCAL_YEAR and ACCOUNTING_PERIOD, and can then be sub-partitioned on a different dimension column.
You can use a template if you want the same sub-partitions for every accounting period.
If you use interval partitioning, you have to use a subpartition template if you want to composite partition.
CREATE TABLE ps_ledger (…)
PARTITION BY RANGE (fiscal_year) INTERVAL (1)
SUBPARTITION BY LIST (ledger)
SUBPARTITION TEMPLATE
(SUBPARTITION l_xxx VALUES ('XXX')
,SUBPARTITION l_yyy VALUES ('YYY')
...
,SUBPARTITION l_z_others VALUES (DEFAULT))
(PARTITION VALUES LESS THAN (2019));
Sometimes companies change their use of ledgers, in which case the sub-partitioning needs to reflect that.  You can still use the template to specify whatever sub-partitioning is currently required.  However, if you ever recreate the table, you end up explicitly specifying sub-partitions for every partition, and the DDL becomes very verbose.  Although, with deferred segment creation, it wouldn't really matter if you had empty sub-partitions that were never physically created for accounting periods where a ledger was not used.
However, if I want to specify different tablespaces, no free space allowance (PCTFREE 0), compression, etc. on certain partitions, then I need to use explicit partition and sub-partition clauses, or come along afterwards and alter and rebuild them.
I think explicit partition and subpartition names are administratively helpful when it comes to reporting on partition space usage, and when you archive/purge data by exchanging or dropping a partition.
CREATE TABLE ps_ledger (…)
PARTITION BY RANGE (fiscal_year,accounting_period)
SUBPARTITION BY LIST (ledger)
(PARTITION ledger_2018 VALUES LESS THAN (2019,0) PCTFREE 0 COMPRESS
(SUBPARTITION ledger_2018_xxx VALUES ('XXX')
,SUBPARTITION ledger_2018_yyy VALUES ('YYY')
,SUBPARTITION ledger_2018_z_others VALUES (DEFAULT)
)
,PARTITION ledger_2019_bf VALUES LESS THAN (2019,1) PCTFREE 0 COMPRESS
(SUBPARTITION ledger_2019_bf_xxx VALUES ('XXX')
,SUBPARTITION ledger_2019_bf_yyy VALUES ('YYY')
,SUBPARTITION ledger_2019_bf_z_others VALUES (DEFAULT)
)
…
);

Indexing

Indexes can be partitioned or not independently of the table.  
  • Local indexes are partitioned in the same way as the table they are built on.  Therefore, there is a 1:1 relationship of table partition/sub-partition to index partition/sub-partition.
  • Global indexes are not partitioned the same way as the table.  You can have
    • Global partitioned indexes
    • Global non-partitioned indexes
Local indexes are easier to build and maintain.  When you do a partition operation on a table partition (add, drop, merge, split or truncate) the same operation is applied to local indexes.  However, if you do an operation on a table partition, any global index will become unusable, unless the DDL is done with the UPDATE INDEXES clause.  Using this option, when you drop a partition, all the corresponding rows are deleted from the index.  The benefit is that the indexes do not become unusable (in which case they would have to be rebuilt), but dropping the table partition takes longer because the rows have to be deleted from the index (effectively a DML operation).
As a general rule, indexes that contain the partitioning key, with at least the first partitioning key column near the front of the index (I usually reckon within the first three key columns), should be locally partitioned unless there is a reason not to.
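A simple data dictionary query (a sketch; run as the schema owner) shows where the partitioning key columns sit in each index, which helps decide which indexes to partition locally:
SELECT index_name, column_name, column_position
FROM   user_ind_columns
WHERE  table_name = 'PS_LEDGER'
AND    column_name IN ('FISCAL_YEAR','ACCOUNTING_PERIOD')
ORDER BY index_name, column_position;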
With the general ledger, I tend to create pairs of local indexes that match the reporting analysis criteria.  
  • One of each of the pair of indexes leads on LEDGER, FISCAL_YEAR, ACCOUNTING_PERIOD and then the other dimension columns.  This supports single period queries.
  • The other index leads on LEDGER, FISCAL_YEAR, then the other dimension columns and finally ACCOUNTING_PERIOD is last because we are interested in a range of periods.
To support single period queries:
CREATE INDEX psgledger ON ps_ledger
(ledger
,fiscal_year
,accounting_period
,business_unit
,account
,project_id
,book_code
)
LOCAL;
To support year-to-date queries:
CREATE INDEX pshledger ON ps_ledger
(ledger
,fiscal_year
,business_unit
,account
,project_id
,book_code
,accounting_period
)
LOCAL;
The unique index on the ledger table does include the partitioning keys, but FISCAL_YEAR and ACCOUNTING_PERIOD are the last 2 of 25 columns.  This index is really there to support queries from the on-line application and the batch processes that post to the ledger, so a query on BUSINESS_UNIT would have to probe every partition.  Therefore, I generally don't partition this index.  It would be reasonable to globally partition it on LEDGER only.
CREATE UNIQUE INDEX ps_ledger ON ps_ledger
(business_unit,ledger,account,altacct,deptid
,operating_unit,product,fund_code,class_fld,program_code
,budget_ref,affiliate,affiliate_intra1,affiliate_intra2,chartfield1
,chartfield2,chartfield3,project_id,book_code,gl_adjust_type
,date_code,currency_cd,statistics_code,fiscal_year,accounting_period
)…

Archiving

Taken together, FISCAL_YEAR and ACCOUNTING_PERIOD are effectively a proxy for the date of the accounting period.  So we add partitions each year, and can compress and later drop them after a period of time.
Once an accounting period has been closed it will not be written to again (or at least not much and not often), so it can then be compressed.  It cannot be compressed before then because the application is still applying ordinary DML (unless the Advanced Compression option has been licenced).  This applies to both conventional dictionary compression and Hybrid Columnar Compression on Exadata.
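With the composite-partitioned ledger table, the compression is applied sub-partition by sub-partition once the period is closed.  This is only a sketch, with illustrative sub-partition names following the naming convention above:
-- Rebuild each sub-partition of the closed period as a compressed segment.
-- UPDATE INDEXES maintains the indexes so they do not become unusable.
ALTER TABLE ps_ledger MOVE SUBPARTITION ledger_2020_p06_xxx COMPRESS UPDATE INDEXES;
ALTER TABLE ps_ledger MOVE SUBPARTITION ledger_2020_p06_yyy COMPRESS UPDATE INDEXES;
ALTER TABLE ps_ledger MOVE SUBPARTITION ledger_2020_p06_z_others COMPRESS UPDATE INDEXES;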
Most reports are on the current and previous fiscal years.  Earlier years are candidates to be purged or archived by dropping or exchanging partitions.  Because there are global indexes, partitions should be dropped with the UPDATE INDEXES clause:
ALTER TABLE ps_ledger DROP PARTITION ledger_2017 UPDATE INDEXES;