There’s an interesting question on the OTN database forum at present – why does an update of 300,000 rows take a billion buffer visits. (There are 25 indexes on the table – so you might point a finger at then initially, but only one of the indexes is going to be changed by the update so that should only account for around an extra 10 gets per row in a clean environment.)
The answer to the question hadn’t been reached by the time I wrote this note – and this note isn’t intended as a suggested cause of the issue, it’s just an example of the type of thing that could cause an apparent excess of buffer visits. Here’s a little bit of code I’ve just tested on 22.214.171.124 using an 8KB block size
drop sequence t1_seq; create sequence t1_seq; drop table t1; create table t1 as select rownum id, rpad('x',10) small_vc from all_objects where rownum <= 11 ; execute dbms_stats.gather_table_stats(user,'t1') select * from t1; pause execute snap_my_stats.start_snap update t1 set small_vc = upper('small_vc') where id = 11; execute snap_my_stats.end_snap
(The calls to the “snap_my_stats” package simply record the current contents of v$mystat joined to v$statname before and after the update and print the changes.)
The code simply creates a sequence and a table with 11 rows and no indexes, then updates one specific row in the table. However, where the “pause” appears, I start up 10 separate sessions to do the following:
column seqval new_value m_seq select t1_seq.nextval seqval from dual; update t1 set small_vc = upper(small_vc) where id = &m_seq; pause exit
So when I hit return on the pause for the first session, there are 10 separate active transactions on the single block in my table, one for each row except row 11. (And now you know what the sequence was for.)
Here’s a subset of the statistics from v$mystat after my update statement – remember, all I’ve done is update one row in one block using a tablescan:
Name Value ---- ----- session logical reads 45 db block gets 3 db block gets from cache 3 consistent gets 42 consistent gets from cache 42 consistent gets from cache (fastpath) 3 consistent gets - examination 39 db block changes 7 consistent changes 13 calls to kcmgrs 26 calls to kcmgas 1 calls to get snapshot scn: kcmgss 7 redo entries 3 redo size 764 undo change vector size 236 data blocks consistent reads - undo records applied 13 active txn count during cleanout 20 table scan blocks gotten 1
Note the last statistics – just one block accessed by tablescan – compared to the session logical reads at 45 buffer visits.
That 45 buffer visits comes from 3 current (db) block gets and 42 consistent gets.
Of the 42 consistent gets 39 are examinations, which – in the absence of indexes and hash clusters are visits to undo blocks
The 39 undo visits are to find 13 undo records to apply, and 26 visits (to undo segment headers) to find 13 transaction SCNs.
What you’re seeing is one session doing (relatively speaking) a lot of work to hide the effects of other sessions which have not yet committed their transactions. (This was only a quick test, so I haven’t examined why the larger figures appear in multiples of 13 rather than multiples of 10 – the number of other transactions – and since this note is just trying to demonstrate a concept I won’t be looking into it any further.)
If you have a number of “non-interfering” transactions – i.e. transactions that don’t actually lock each other out – on a single table then you could find that they spend more time hiding each other’s work than they do doing their own work.
The numbers change significantly if I commit the 10 transactions (but wait until they’ve all executed, so they are all active at the same time) before I do the update to the 11th row.
The numbers changed even more surprisingly when I forgot to collect stats on the table in my initial example of the test.
For those looking to the next upgrade – here’s an early warning from Oracle:
ANNOUNCEMENT: Deprecating the cursor_sharing = ‘SIMILAR’ setting (Doc ID 1169017.1)
“We recommend that customers discontinue setting cursor_sharing = SIMILAR due to the many problematic situations customers have experienced using it. The ability to set this will be removed in version 12 of the Oracle Database (the settings of EXACT and FORCE will remain available). Instead, we recommend the use of Adaptive Cursor Sharing in 11g.”
The rest of the note contains some interesting information about the behaviour and side effects of this option – which may also help you debug some library cache issues if you’re currently running with this value set in 11g.
Conclusion: the code should probably do a “distinct” in an inline view before calling the function, reducing the number of calls to the function from 71,288 to 3,429.
Footnote: There may be other efficiency steps to consider – I’m always a little suspicious of a query that uses “distinct”: possibly it’s hiding an error in logic, possibly it should be rewritten with an existence subquery somewhere, but sometimes it really is the best strategy. There’s are some unusual statistics names coming from autotrace in the OP’s system – I wonder if he’s installed one of Tanel Poder’s special library hacks.
Following on from yesterday’s post on consistent reads, I thought I’d make the point that the way you work can make an enormous difference to the amount of work you do. Here’s a silly little demo (in 10.2.0.3):
drop table t1 purge; create table t1 (id number, n1 number); insert into t1 values (1,0); insert into t1 values (2,0); commit; execute dbms_stats.gather_table_stats(user,'t1') execute snap_my_stats.start_snap begin for i in 1..1000 loop update t1 set n1 = i where id = 1; end loop; end; / execute snap_my_stats.end_snap set doc off doc Output - 10.2.0.3 (some rows deleted) Name Value ---- ----- opened cursors cumulative 11 user calls 6 recursive calls 1,068 recursive cpu usage 7 session logical reads 4,041 CPU used when call started 7 CPU used by this session 7 DB time 6 db block gets 1,030 db block gets from cache 1,030 consistent gets 3,011 consistent gets from cache 3,011 consistent gets - examination 4 db block changes 2,015 change write time 4 free buffer requested 1,014 switch current to new buffer 1,000 calls to kcmgas 1,014 calls to get snapshot scn: kcmgss 3,009 redo entries 960 redo size 295,160 undo change vector size 111,584 no work - consistent read gets 1,004 table scans (short tables) 1,001 table scan rows gotten 2,002 table scan blocks gotten 1,001 buffer is pinned count 1,000 execute count 1,009 #
I’ve created two rows in a table, then updated one of them 1,000 times – using a table scan to do the update. I haven’t yet committed my transaction. At this point I’m going to use a second session to run the same update loop on the second row in the table:
begin for i in 1..1000 loop update t1 set n1 = i where id = 2; end loop; end; / Name Value ---- ----- opened cursors cumulative 8 user calls 6 recursive calls 1,009 recursive cpu usage 170 session logical reads 965,999 CPU used when call started 172 CPU used by this session 172 DB time 173 db block gets 1,030 db block gets from cache 1,030 consistent gets 964,969 consistent gets from cache 964,969 consistent gets - examination 961,965 db block changes 3,016 consistent changes 1,001,000 free buffer requested 1,015 CR blocks created 1,001 calls to kcmgas 1,015 calls to get snapshot scn: kcmgss 3,008 redo entries 1,936 redo size 358,652 undo change vector size 111,608 data blocks consistent reads - undo records applied 1,001,000 cleanouts and rollbacks - consistent read gets 1,001 immediate (CR) block cleanout applications 1,001 active txn count during cleanout 2,000 cleanout - number of ktugct calls 1,001 IMU CR rollbacks 41,041 table scans (short tables) 1,001 table scan rows gotten 2,002 table scan blocks gotten 1,001 execute count 1,006
Many of the statistics are (virtually) identical (e.g. “execute count”, “db block gets”, “free buffer requested”); some show an increase by 1,000 (often from zero) – largely because we have to worry 1,000 times about cleaning out the current block and creating a read-consistent version so that we can check if it can be updated.
But the most noticeable changes are in the “CPU time” and “consistent gets” because of the 1,000 times we have to apply 1,000 undo records to the block as we create the read-consistent version of the block. The CPU time has gone from 7 (hundredths of a second) to 172 because of (roughly) 1,000,000 “consistent gets – examination”. As I mentioned yesterday, this matches closely to “data blocks consistent reads – undo records applied” so we know why they are happening. Watch out in your batch jobs – if you have a lot of concurrent update activity going on a significant fraction of the workload may be the constant re-creation of CR clones.
However, there is another interesting detail to watch out for – what happens if I change the update execution path from a tablescan to an indexed access path:
create table t1 (id number, n1 number); insert into t1 values (1,0); insert into t1 values (2,0); commit; execute dbms_stats.gather_table_stats(user,'t1') create index t1_i1 on t1(id); -- Make indexed access path available.
Then with an index hint in my update code, I get the following effects (having done the same update loop on row 1 in the first session, of course):
begin for i in 1..1000 loop update /*+ index(t1) */ t1 set n1 = i where id = 2; -- indexed access path hinted end loop; end; / Name Value ---- ----- opened cursors cumulative 7 user calls 6 recursive calls 1,006 recursive cpu usage 11 session logical reads 2,036 CPU used when call started 11 CPU used by this session 11 DB time 11 db block gets 1,030 db block gets from cache 1,030 consistent gets 1,006 consistent gets from cache 1,006 consistent gets - examination 6 db block changes 2,015 free buffer requested 14 shared hash latch upgrades - no wait 1,000 calls to kcmgas 14 calls to get snapshot scn: kcmgss 1,004 redo entries 960 redo size 295,144 undo change vector size 111,608 index crx upgrade (positioned) 1,000 index scans kdiixs1 1,000 execute count 1,005
The difference is astonishing – where did all the ‘create CR copy’ activity go ?
I’ve pointed out before now that choosing a different execution plan for an update can have a big impact on performance – this is just another example demonstrating the point.
Here’s a quick demo to make a point about consistent reads (prompted by a question on the Oracle-L mailing list):
SQL> drop table t1; Table dropped. SQL> create table t1 (n1 number); Table created. SQL> insert into t1 values(0); 1 row created. SQL> begin 2 for i in 1..1000 loop 3 update t1 set n1 = i; 4 end loop; 5 end; 6 / PL/SQL procedure successfully completed.
Note that I haven’t issued a commit in this session, and all I’ve got is a single row in the table (and because it’s my usual demo setup of locally managed tablespaces with uniform extents of 1MB using freelist management I know that that one row is in the first available block of the table).
How much work is a second session going to do to scan that table ?
SQL> alter system flush buffer_cache; SQL> execute snap_my_stats.start_snap SQL> select * from t1; SQL> set serveroutput on size 1000000 format wrapped SQL> execute snap_my_stats.end_snap --------------------------------- Session stats - 18-Apr 11:33:01 Interval:- 2 seconds --------------------------------- Name Value ---- ----- session logical reads 967 consistent gets 967 consistent gets from cache 967 consistent gets - examination 964 consistent changes 1,001 CR blocks created 1 data blocks consistent reads - undo records applied 1,001 IMU CR rollbacks 41
The snap_my_stats package is similar in concept to Tom Kyte’s “runstats” or Tanel Poder’s “snapper” program to capture changes in values in the dynamic performance views over short time periods. In this case I’ve deleted all but a few of the larger changes, and a couple of small changes.
The figure that stands out (probably) is the “session logical reads” – we’ve done 967 logical I/Os to scan a tables of just one block. The reason for this is that we’ve created a read-consistent copy of that one block (“CR blocks created” = 1), and it has taken a lot of work to create that copy. We’ve had to apply 1,001 undo records (“data blocks consistent reads – undo records applied” = 1001).
Most of those undo records come from individual accesses (which are of the cheaper “consistent gets – examination” type that only need a single get on the “cache buffers chains” latch) to undo blocks, following the “UBA (undo block address)” pointer in the relevant ITL entry of the table block, but since this is a 10g database the last few undo records come out of the “In-memory Undo” of the other session. Basically the cloning operation is something like this:
It is an interesting point that as the first session created undo records it would pin and fill undo blocks – so would only do a few current gets (one for each block) on the undo blocks it was using. As another process reverses out the changes in a CR clone it has to get and release each undo block every time it wants a single undo record … applying undo records introduces far more latch and buffer activity that the original generation of the undo.
It’s worth knowing that there are three statistics relating to applying undo records:
transaction tables consistent reads - undo records applied Estimating "old" commit SCNs during delayed block cleanout data blocks consistent reads - undo records applied Creating CR clones rollback changes - undo records applied The result of a real "rollback;"
The second step in the list of actions is: “Notice uncommitted transaction”. It’s probably worth pointing out that another part of the ITL entry holds the transaction id (“xid”) which implicitly identifies the undo segment and transaction table slot in that segment that has been used to hold the transaction state. The current contents of that slot allow Oracle to determine whether or not (and when, if necessary) the transaction was committed.
I have some broken links in my old blog entries right now, so if you’re looking for something, then download the whole zip file from here:
I have uploaded a .zip file (for Windows) and a .tar.gz file (for Unix/Mac). The scripts are all the same with differences in the CR/LF bytes in the files and the init.sql and i.sql which have some OS specific commands in them.
I also uploaded the latest PerfSheet there where I fixed an annoying bug which complained about some missing reference files when opening the file.
I plan to fix the broken links some time between now and my retirement.
Here’s an example of a slightly less common data deadlock graph (dumped from 11gR2, in this case):
[Transaction Deadlock] The following deadlock is not an ORACLE error. It is a deadlock due to user error in the design of an application or from issuing incorrect ad-hoc SQL. The following information may aid in determining the deadlock: Deadlock graph: ---------Blocker(s)-------- ---------Waiter(s)--------- Resource Name process session holds waits process session holds waits TX-00010006-00002ade 16 34 X 12 50 S TX-00050029-000037ab 12 50 X 16 34 S session 34: DID 0001-0010-00000021 session 50: DID 0001-000C-00000024 session 50: DID 0001-000C-00000024 session 34: DID 0001-0010-00000021 Rows waited on: Session 50: no row Session 34: no row Information on the OTHER waiting sessions: Session 50: pid=12 serial=71 audsid=1560855 user: 52/TEST_USER O/S info: user: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX program: sqlplus.exe application name: SQL*Plus, hash value=3669949024 Current SQL Statement: update t1 set n3 = 99 where id = 100 End of information on OTHER waiting sessions. Current SQL statement for this session: update t1 set n3 = 99 where id = 200
The anomaly is that the waiters are both waiting on S (share) mode locks for a TX enqueue.
It’s fairly well known that Share (and Share sub exclusive, SSX) lock waits for TM locks are almost a guarantee of a missing “foreign key index”; and it’s also fairly well known that Share lock waits for TX locks can be due to bitmap collisions, issues with ITL (interested transaction list) waits, various RI (referential integrity) collisions including simultaneous inserts of the same primary key.
A cause for TX/4 waits that is less well known or overlooked (because a featrure is less-well used) is simple data collisions on IOTs (index organized tables). In the example above t1 is an IOT with a primary key of id. Session 34 and 50 have both tried to update the rows with ids 100 and 200 – in the opposite order. If this were a normal heap table the deadlock graph would be showing waits for eXclusive TX locks, because it’s an IOT (and therefore similar in some respects to a primary key wait) we see waits for Share TX locks.
First, a reminder – my Advanced Oracle Troubleshooting v2.0 online seminar starts next week already. Last chance to sign up, I can accept registrations until Sunday :-)
I won’t do another AOT seminar before Oct (or Nov) this year. More details and sign-up here:
I have rescheduled my Advanced SQL Tuning and Partitioning & Parallel Execution for Performance seminars too. I will do them in September/October. Unfortunately I’m too busy right now to do them before the summer.
And that’s all the travel I will do this year…
I’ll soon announce the 2nd EsSN virtual conference too ;-)
Free online stuff:
Perhaps in a month or so I will do another hacking session (I’ll plan 2 hours this time, 1 hour isn’t nearly enough for going deep). The topic will probably be about low-level details of SQL plan execution internals… stay tuned!
A recent question on the OTN database forum:
Can any one please point to me a document or a way to calculate the average number of rows per block in oralce 10.2.0.3
One answer would be to collect stats and then approximate as block / avg_row_len – although you have to worry about things like row overheads, the row directory, and block overheads before you can be sure you’ve got it right. On top of this, the average might not be too helpful anyway. So here’s another (not necessarily fast) option that gives you more information about the blocks that have any rows in them (I picked the source$ table from a 10g system because source$ is often good for some extreme behaviour).
break on report compute sum of tot_blocks on report compute sum of tot_rows on report column avg_rows format 999.99 select twentieth, min(rows_per_block) min_rows, max(rows_per_block) max_rows, sum(block_ct) tot_blocks, sum(row_total) tot_rows, round(sum(row_total)/sum(block_ct),2) avg_rows from ( select ntile(20) over (order by rows_per_block) twentieth, rows_per_block, count(*) block_ct, rows_per_block * count(*) row_total from ( select fno, bno, count(*) rows_per_block from ( select dbms_rowid.rowid_relative_fno(rowid) as fno, dbms_rOwId.rowid_block_number(rowid) as bno from source$ ) group by fno, bno ) group by rows_per_block order by rows_per_block ) group by twentieth order by twentieth ;
I’ve used the ntile() function to split the results into 20 lines, obviously you might want to change this according to the expected variation in rowcounts for your target table. In my case the results looked like this:
TWENTIETH MIN_ROWS MAX_ROWS TOT_BLOCKS TOT_ROWS AVG_ROWS ---------- ---------- ---------- ---------- ---------- -------- 1 1 11 2706 3470 1.28 2 12 22 31 492 15.87 3 23 34 30 868 28.93 4 35 45 20 813 40.65 5 46 57 13 664 51.08 6 59 70 18 1144 63.56 7 71 81 23 1751 76.13 8 82 91 47 4095 87.13 9 92 101 79 7737 97.94 10 102 111 140 14976 106.97 11 112 121 281 32799 116.72 12 122 131 326 41184 126.33 13 132 141 384 52370 136.38 14 142 151 325 47479 146.09 15 152 161 225 35125 156.11 16 162 171 110 18260 166.00 17 172 181 58 10207 175.98 18 182 191 18 3352 186.22 19 193 205 22 4377 198.95 20 206 222 16 3375 210.94 ---------- ---------- sum 4872 284538
Of course, the moment you see a result like this it prompts you to ask more questions.
Is the “bell curve” effect that you can see centred around the 13th ntile indicative of a normal distribution of row lengths – if so why is the first ntile such an extreme outlier – is that indicative of a number of peculiarly long rows, did time of arrival have a special effect, is it the result of a particular pattern of delete activity … and so on.
Averages are generally very bad indicators if you’re worried about the behaviour of an Oracle system.
Steve Bamber has written up a case study of library cache latch contention troubleshooting of an Apex application with LatchProf. I’m happy that others also see the value and have had success with my new LatchProf based latch contention troubleshooting approach which takes into account both sides of the contention story (latch waiters and latch holders/blockers) as opposed to the guesswork used previously (hey if it’s shared pool latch contention – is must be about bad SQL not using bind variables …. NOT always…)
Anyway, I’m happy. If you have success stories with LatchProf, please let me know!
As a second topic of interest, Laimutis Nedzinskas has written some good notes about the effect and overhead of Flashback Database option when you are using and modifying (nocache) LOBs. We’ve exchanged some mails on this topic and yeah, my clients have sure seen some problems with this combination as well. You basically want to keep your LOBs cached when using FB database…