It’s hard to understand all the ramifications of Oracle’s undo handling, and it’s not hard to find cases where the resulting effects are very confusing. In a recent post on the OTN database forum resulted in one response insisting that the OP was obviously updating a table with frequent commits from one session while querying it from another thereby generating a large number of undo reads in the querying session.
I planned to write on this for quite some time, but failed to do so. Sorry about that. Today I finally got time and desire to describe a situation from the title. It was observed on an Oracle 126.96.36.199 running Solaris SPARC; manifested itself as a severe CPU burning at 100% utilization with ‘latch free’ on the first place of the Top 5 wait events in Statspack report.
Here is part of a 1 hour report:
There’s an interesting question on the OTN database forum at present – why does an update of 300,000 rows take a billion buffer visits. (There are 25 indexes on the table – so you might point a finger at then initially, but only one of the indexes is going to be changed by the update so that should only account for around an extra 10 gets per row in a clean environment.)
The answer to the question hadn’t been reached by the time I wrote this note – and this note isn’t intended as a suggested cause of the issue, it’s just an example of the type of thing that could cause an apparent excess of buffer visits. Here’s a little bit of code I’ve just tested on 188.8.131.52 using an 8KB block size
drop sequence t1_seq; create sequence t1_seq; drop table t1; create table t1 as select rownum id, rpad('x',10) small_vc from all_objects where rownum <= 11 ; execute dbms_stats.gather_table_stats(user,'t1') select * from t1; pause execute snap_my_stats.start_snap update t1 set small_vc = upper('small_vc') where id = 11; execute snap_my_stats.end_snap
(The calls to the “snap_my_stats” package simply record the current contents of v$mystat joined to v$statname before and after the update and print the changes.)
The code simply creates a sequence and a table with 11 rows and no indexes, then updates one specific row in the table. However, where the “pause” appears, I start up 10 separate sessions to do the following:
column seqval new_value m_seq select t1_seq.nextval seqval from dual; update t1 set small_vc = upper(small_vc) where id = &m_seq; pause exit
So when I hit return on the pause for the first session, there are 10 separate active transactions on the single block in my table, one for each row except row 11. (And now you know what the sequence was for.)
Here’s a subset of the statistics from v$mystat after my update statement – remember, all I’ve done is update one row in one block using a tablescan:
Name Value ---- ----- session logical reads 45 db block gets 3 db block gets from cache 3 consistent gets 42 consistent gets from cache 42 consistent gets from cache (fastpath) 3 consistent gets - examination 39 db block changes 7 consistent changes 13 calls to kcmgrs 26 calls to kcmgas 1 calls to get snapshot scn: kcmgss 7 redo entries 3 redo size 764 undo change vector size 236 data blocks consistent reads - undo records applied 13 active txn count during cleanout 20 table scan blocks gotten 1
Note the last statistics – just one block accessed by tablescan – compared to the session logical reads at 45 buffer visits.
That 45 buffer visits comes from 3 current (db) block gets and 42 consistent gets.
Of the 42 consistent gets 39 are examinations, which – in the absence of indexes and hash clusters are visits to undo blocks
The 39 undo visits are to find 13 undo records to apply, and 26 visits (to undo segment headers) to find 13 transaction SCNs.
What you’re seeing is one session doing (relatively speaking) a lot of work to hide the effects of other sessions which have not yet committed their transactions. (This was only a quick test, so I haven’t examined why the larger figures appear in multiples of 13 rather than multiples of 10 – the number of other transactions – and since this note is just trying to demonstrate a concept I won’t be looking into it any further.)
If you have a number of “non-interfering” transactions – i.e. transactions that don’t actually lock each other out – on a single table then you could find that they spend more time hiding each other’s work than they do doing their own work.
The numbers change significantly if I commit the 10 transactions (but wait until they’ve all executed, so they are all active at the same time) before I do the update to the 11th row.
The numbers changed even more surprisingly when I forgot to collect stats on the table in my initial example of the test.
OK, let’s look at the next portion of the index block dump. Following the hex dump of the block (as we ended Part I of the series) is the second part of the block header (see below): Block header dump: 0x0201490a Object id on Block? Y seg/obj: 0x1c205 csc: 0x00.2d11214 itc: 2 flg: - [...]
Oracle uses for its read consistency model a true multi-versioning approach which allows readers to not block writers and vice-versa, writers to not block readers. Obviously this great feature allowing highly concurrent processing doesn't come for free, since somewhere the information to build multiple versions of the same data needs to be stored.
Oracle uses the so called undo information not only to rollback on-going transactions but also to re-construct old versions of blocks if required. Very simplified when reading data Oracle knows the point in time (which corresponds to an internal counter called SCN, System Change Number) that data needs to be consistent with. In the default READ COMMITTED isolation mode this point in time is defined when a statement starts to execute. You could also say at the moment a statement starts to run its result is pre-ordained. When Oracle processes a block it checks if the block is "old" enough and if it discovers that the block content is too new (has been changed by other sessions but the current access is not supposed to see this updated content according to the point-in-time assigned to the statement execution) it will start to create a copy of the block and use the information available from the corresponding undo segment to re-construct an older version of the block. Note that this process can be iterative: If after re-constructing the older version of the block it's still not sufficiently old more undo information will be used to go further back in time.