A couple of weeks ago I started a competition to win 2 copies of Oracle E-Business Suite R12 Integration and OA Framework Development and Extension Cookbook by Andy Penver. Thanks to Packt for donating the prizes. The competition closed yesterday and the lucky winners are:
I’ve sent your email addresses to my contact at Packt, who will contact you to deliver your e-book.
Cheers
Tim…
The above is so cool.
The graphic shows the latency heatmap of “log file sync”. I was running a swingbench load and at the same time throttling I/O such that latencies started off good then got worse and then back to normal.
All I did was type
sqlplus / as sysdba @OraLatencyMap_event 3 "log file sync"
This was created by Luca Canali , see http://externaltable.blogspot.com/2013/05/latency-heat-map-in-sqlplus-with.htm
Now if we combine this monitoring, with the I/O throttling documeted by Frits Hoogland here https://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/ , we can really have some fun and even draw latency words:
Run an auto refresh color coded heatmap on “log file sync” in sqlplus by typing
sqlplus / as sysdba @OraLatencyMap_event 3 "log file sync"
where OraLatencyMap_event.sql and OraLatencyMap_internal.sql are your current directory or sqlpath
Now to play with LGWR latency with cgroup throttles see
https://fritshoogland.wordpress.com/2012/12/15/throttling-io-with-linux/
# install cgroups on 2.6.24 LINUX or higher yum intall cgroup # setup /cgroup/blkio grep blkio /proc/mounts || mkdir -p /cgroup/blkio ; mount -t cgroup -o blkio none /cgroup/blkio cgcreate -g blkio:/iothrottle # find the device you want df -k # my Oracle log file divice was ls -l /dev/mapper/vg_source-lv_home lrwxrwxrwx. 1 root root 7 May 1 21:42 /dev/mapper/vg_source-lv_home -> ../dm-2 # my device points to /dev/dm-2 ls -l /dev/dm-2 brw-rw----. 1 root disk 253, 2 May 1 21:42 /dev/dm-2 # my device major and minor numbers are "253, 2" # create a write throtte on this device (for read just replace "write" with "read" # this limits it to 10 writers per second cgset -r blkio.throttle.write_iops_device="253:2 10" iothrottle # look for lgwr ps -ef | grep lgwr oracle 23165 1 0 13:35 ? 00:00:19 ora_lgwr_o1123 # put lgwr pid into throttle group echo 23165 > /cgroup/blkio/iothrottle/tasks # now play with different throttles cgset -r blkio.throttle.write_iops_device="253:2 1" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 10" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 100" iothrottle cgset -r blkio.throttle.write_iops_device="253:2 1000" iothrottle # if you are finished then delete the throttle control group cgdelete blkio:/iothrottle
Here’s a quick and dirty script to create a procedure (in the SYS schema – so be careful) to check the Hakan Factor for an object. If you’re not familiar with the Hakan Factor, it’s the value that gets set when you use the command “alter table minimize records_per_block;”.
I was prompted to publish this note by an item on the OTN SQL forum describing a problem with partition exchange with a table when there were bitmap indexes in place and the table had been changed to have some extra columns added. (Problem as yet unresolved as I publish).
If you start playing with the Hakan Factor, you’ll find that there are some odd little bugs in what gets stored and how it gets used. (SQL updated to use bitand() to reflect comments below and Karsten Spang’s blog note; also edited following a comment on OTN to show the rest of the spare1 flag bits)
create or replace procedure show_hakan(
i_table in varchar2,
i_owner in varchar2 default user
) as
m_obj number(8,0);
m_flags varchar2(12);
m_hakan number(8,0);
begin
/* created by show_hakan.sql */
select
obj#,
/*
case
when (spare1 > 5 * power(2,15))
then (spare1 - 5 * power(2,15))
when (spare1 > power(2,17))
then (spare1 - power(2,17))
when (spare1 > power(2,15))
then (spare1 - power(2,15))
else spare1
end hakan
*/
to_char(
bitand(
spare1, to_number('ffff8000','xxxxxxxx')
),
'xxxxxxxx'
) flags,
bitand(spare1, 32767) hakan -- 0x7fff
into
m_obj,
m_flags,
m_hakan
from
tab$
where obj# in (
select object_id
from dba_objects
where object_name = upper(i_table)
and object_type = 'TABLE'
and owner = upper(i_owner)
)
;
dbms_output.put_line(
'Hakan factor for object ' ||
m_obj || ' (' ||
i_owner || '.' ||
i_table || ') is ' ||
m_hakan || ' with flags ' ||
m_flags
);
end;
/
drop public synonym show_hakan;
create public synonym show_hakan for show_hakan;
grant execute on show_hakan to public;
You’ll notice that I’ve done an “upper()” on the table and owner – that means you’re in trouble if you have created an schemas or tables with mixed-case names (but you wouldn’t do that in a production system, would you?)
One of the odd details of the Hakan factor is that the value it shows is one less than the number of rows that will be stored in a block; and since it looks as if the factor is not allowed to drop to zero, you can’t hack the Hakan factor to force one row per block.
So here’s a (trivial and sub-optimal) piece of code to check current number of rows per block in a simple heap table (assuming the tablespace consists of a single file):
select ct, count(*) from ( select dbms_rowid.rowid_block_number(rowid), count(*) ct from t1 group by dbms_rowid.rowid_block_number(rowid) ) group by ct order by ct ;
Here’s the output of a session, running under 9.2.0.8, cut and pasted from the screen:
SQL> @afiedt.buf CT COUNT(*) ---------- ---------- 9 1 16 1 SQL> alter table t1 nominimize records_per_block; SQL> alter table t1 minimize records_per_block; SQL> execute show_hakan('t1') Hakan factor for object 48865 (TEST_USER.t1) is: 15 SQL> alter table t1 move; SQL> @afiedt.buf CT COUNT(*) ---------- ---------- 10 1 15 1 SQL> alter table t1 nominimize records_per_block; SQL> alter table t1 minimize records_per_block; SQL> execute show_hakan('t1') Hakan factor for object 48865 (TEST_USER.t1) is: 14
Every time you moved the table, 9.2.0.8 (and earlier) used the actual stored value of the Hakan Factor to rebuild the table; but if you regenerated the Hakan Factor the stored value was one less than the actual row count. So if you kept repeating the process the number of rows per block would decrease by one each time and the table would get bigger and bigger.
It’s a silly example – but the real-world relevance was that a direct path insert behaved differently from a normal insert and this could result in a significant amount of wasted space if you were doing bulk loads in your overnight batch; so the code changed in 10g to make the normal and direct path inserts consistent with each other, but the change went the wrong way and, as a side effect, you get one more row per block than suggested by the Hakan Factor – and you can’t trick the Hakan factor into enforcing one row per block any more.
I’ve had my 13″ MacBook Pro since the mid 2009 refresh and it’s been really reliable. Apart from one brief visit to Apple to replace a noisy fan, I’ve had no worries. A few years ago I upgraded from 4G to 8G RAM, so I’m not stranger to taking the back off it.
Even though it’s quite old by computer geek standards, I really don’t have any performance problems. I do demos with a couple of Linux VMs running Oracle and it works OK. Despite this, I was bored the other night and decided to buy an SSD to replace the internal hard drive. It arrived yesterday, so during last nights insomnia, I decided to fit the hard drive, rather than stare at the ceiling.
The actual hard drive replacement is pretty simple. You can see an example of it here. It takes about 5 minutes.
The transfer of the data proved a little more tricky than I expected though…
Attempt 1:
I use Time Machine for backups, so I slapped in the new hard drive, booted from the CD and expected to just restore from Time Machine. It turns out my Time Machine backups weren’t as complete as I thought.
Attempt 2:
No worries. I connected my old hard drive using a USB cable, booted from the CD and used the Disk Utility to restore the old hard drive to the new SSD. That would have been fine, except the new hard drive was fractionally smaller than the old one. That would have been fine for a Time Machine backup, since the old drive was not completely full, but for an image restore it’s a big no-no. Now I was starting to get worried. I could always replace the old drive, but I was starting to think I might have wasted my money.
Attempt 3:
So finally I bit the bullet and re-installed Snow Leopard (the most recent media I had), upgraded to Lion, then Mountain Lion through the App Store. Once that was done I dragged my apps and data from the old drive across to the new drive. Job’s a good’un!
So it got solved in the end, but it wasn’t quite the blissful experience I expected.
Cheers
Tim…
Update: Thanks to Luis Marques for reminding me about TRIM, with this Twitter comment, “Tim, don’t forget to enable TRIM on SSD (if it supports it) using this http://chameleon.alessandroboschini.it/index.php or http://www.groths.org/trim-enabler/“
This is yet another blogpost on Oracle’s direct path read feature which was introduced for non-parallel query processes in Oracle version 11.
For full table scans, a direct path read is done (according to my tests and current knowledge) when:
- The segment is bigger than 5 * _small_table_threshold.
- Less than 50% of the blocks of the table is already in the buffercache.
- Less than 25% of the blocks in the buffercache are directy.
Also (thanks to Freek d’Hooge who pointed me to an article from Tanel Poder) you can change the optimizer statistics to change the segment size for the direct path decision. Please mind that whilst this uses the statistics the optimizer uses, this is NOT an optimizer decision, but a decision made in the “code path”, so during execution.
So let’s take a look at my lab environment (Oracle Linux 6.3, 64 bit, Oracle 11.2.0.3 and ASM)
Small table threshold:
NAME VALUE -------------------------------------------------- ------- _small_table_threshold 1011
Table information:
TS@v11203 > select blocks from user_segments where segment_name = 'T2';
BLOCKS
----------
21504
So if we take small table threshold times and multiply it by five, we get 5055. This means that the size of table T2 is more than enough so should be scanned via direct path:
TS@v11203 > select s.name, m.value from v$statname s, v$mystat m where m.statistic#=s.statistic# and s.name = 'table scans (direct read)'; NAME VALUE ---------------------------------------------------------------- ---------- table scans (direct read) 0 TS@v11203 > select count(*) from t2; COUNT(*) ---------- 1000000 TS@v11203 > select s.name, m.value from v$statname s, v$mystat m where m.statistic#=s.statistic# and s.name = 'table scans (direct read)'; NAME VALUE ---------------------------------------------------------------- ---------- table scans (direct read) 1
Well, that’s that, this seems quite simple.
I’ve created a relatively big table and created a (normal) index on it in the same database. The index is created on a single column, called ‘id’. If I issue a count(id), the whole index needs to be scanned, and Oracle will choose a fast full index scan. A fast full index scan is a scan which just needs to read all the blocks, not necessarily in leaf order. This means it can use multiblock reads (which reads in the order of allocated adjacent blocks).
Let’s check just to be sure:
TS@v11203 > select count(id) from bigtable; Execution Plan ---------------------------------------------------------- Plan hash value: 106863591 ------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 13 | 19662 (2)| 00:03:56 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FAST FULL SCAN| I_BIGTABLE | 34M| 425M| 19662 (2)| 00:03:56 | ------------------------------------------------------------------------------------ Note ----- - dynamic sampling used for this statement (level=2)
If we look at the index size, the size of the index makes this segment a candidate for direct path reads:
TS@v11203 > select blocks from user_segments where segment_name = 'I_BIGTABLE';
BLOCKS
----------
72704
If we look at number of small table threshold times five (5055), this index is much bigger than that. Also, this is bigger than table T2. Let’s execute select count(id) from bigtable, and look at the statistic ‘index fast full scans (direct read)’:
TS@v11203 > select s.name, m.value from v$statname s, v$mystat m where m.statistic#=s.statistic# and s.name = 'index fast full scans (direct read)'; NAME VALUE ---------------------------------------------------------------- ---------- index fast full scans (direct read) 0 TS@v11203 > select count(id) from bigtable; COUNT(ID) ---------- 32000000 TS@v11203 > select s.name, m.value from v$statname s, v$mystat m where m.statistic#=s.statistic# and s.name = 'index fast full scans (direct read)'; NAME VALUE ---------------------------------------------------------------- ---------- index fast full scans (direct read) 0
Huh? This statistic tells me there hasn’t been a direct path read! This means that this read has been done in the “traditional way”. This is a bit…counter intuitive. I’ve traced the session, and indeed it’s doing the traditional multiblock reads via the scattered read waits.
I did a fair bit of fiddling around with the parameters which are reported to be involved, and found out I can get the database to do direct path reads by changing the parameter “_very_large_object_threshold”. The information found on the internet reports this value is in megabytes. A quick stroll through a number of different database (all on 11.2.0.3) shows this parameter is quite probably statically set at “500″.
If I calculate the size in megabytes of the index I_BIGTABLE, the size is 568M. This is clearly higher than the value of “_very_large_object_threshold”. I can get the same index scanned via direct path reads by changing the value of “_very_large_object_threshold” to 100.
This interesting, because it looks like this parameter does the same for full scans on index segments as “_small_table_threshold” does for full scans on table segments: the size of the segment to be scanned needs to be bigger than five times.
There are also differences: small table threshold is set in blocks, (apparently) very large object threshold is set in megabytes. Also, small table threshold is set by default at 2% of the size of the buffercache (so it scales up with bigger caches), very large object threshold seems to be fixed at 500. If my finding is correct, then it means an index segment needs to be bigger than 500*5=2500M to be considered for direct path reads. It’s unknown to me if the 50% limit for blocks in the cache and the 25% limit for dirty blocks is subject to this too.
Tagged: oracle tune direct path IO
![]()
Thanks to Norman Dunbar for pointing out that MobaXterm 6.3 has been released. You can find the download and changelog in the usual place.
I’ll be interested to see how the performance improvements to SFTP work out. I’ve seen some issues with this during transfers of large files before. The built in NFS and VNC servers sound interesting too. I can think of one situation where the NFS server would come in really handy.
Great stuff!
Cheers
Tim…
UltraEdit 4.0 has been released for Mac and Linux. The downloads are in the usual place. You can see the latest changelogs here (Mac, Linux).
Fun, fun, fun…
Cheers
Tim…
Cost Based Oracle – Fundamentals (November 2005)
But the most interesting function for our purposes is sys_op_countchg(). Judging from its name, this function is probably counting changes, and the first input parameter is the block ID portion (object_id, relative file number, and block number) of the table’s rowid, so the function is clearly matching our notional description of how the clustering_factor is calculated. But what is that 1 we see as the second parameter?
When I first understood how the clustering_factor was defined, I soon realized that its biggest flaw was that Oracle wasn’t remembering recent history as it walked the index; it only remembered the previous table block so that it could check whether the latest row was in the same table block as last time or in a new table block. So when I saw this function, my first guess (or hope) was that the second parameter was a method of telling Oracle to remember a list of previous block visits as it walked the index.
And finally, Oracle Corp. had implemented an official interface to the second parameter of sys_op_countchg() – provided you install the right patch – through a new table (or schema, or database) preference type available to the dbms_stats.set_table_prefs() procedure.
I’ve been meaning to write this post for two or three months, ever since Sean Molloy sent me an email about short blog note from Martin Decker describing Bug 13262857 Enh: provide some control over DBMS_STATS index clustering factor computation. Unfortunately I’ve not yet had time to investigate the patch, but I don’t think I need to any more because Richard Foote has written it up in his latest blog post.
Read Richard’s post – it’s important.
Richard’s post has, unsurprisingly, produced a buzz of excitement in his reader – and started up the discussion of how best to use this capability; so here’s another quote from the book (p.111 – available in the download of chapter 5):
So using Oracle’s own function for calculating the clustering_factor, but substituting the freelists value for the table, may be a valid method for correcting some errors in the clustering_factor for indexes on strongly sequenced data. (The same strategy applies if you use multiple freelist groups—but multiply freelists by freelist groups to set the second parameter.)
Can a similar strategy be used to find a modified clustering_factor in other circumstances? I think the answer is a cautious “yes” for tables that are in ASSM tablespaces. Remember that Oracle currently allocates and formats 16 new blocks at a time when using automatic segment space management (even when the extent sizes are very large, apparently). This means that new data will be roughly scattered across groups of 16 blocks, rather than being tightly packed.
Calling Oracle’s sys_op_countchg() function with a parameter of 16 could be enough to produce a reasonable clustering_factor where Oracle currently produces a meaningless one. The value 16 should, however, be used as an upper bound. If your real degree of concurrency is typically less than 16, then your actual degree of concurrency would probably be more appropriate.
Whatever you do when experimenting with this function—don’t simply apply it across the board to all indexes, or even all indexes on a particular table. There will probably be just a handful of critical indexes where it is a good way of telling Oracle a little more of the truth about your system—in other cases you will simply be confusing the issue.
Note particularly the comments about how the best value depends on the data in the indexed columns, the table configuration, and the degree of concurrency - you don’t necessarily want to use the same value for every index on a given table. That’s a shame, since Oracle has defined the interface as a TABLE preference, so if you set it then you get the same for every index. Despite this, if you’re prepared to put in a little control work, it does mean that you can use an official Oracle mechanism to play the game I was suggesting in the book – for each “special” index, set the preference, collect the stats, then clear the preference.
A new major release (version 3.0) of my XPLAN_ASH tool is available for download.
You can download the latest version here.
In addition to many changes to the way the information is presented and many other smaller changes to functionality there is one major new feature: XPLAN_ASH now also supports S-ASH, the free ASH implementation.
If you run XPLAN_ASH in a S-ASH repository owner schema, it will automatically detect that and adjust accordingly.
XPLAN_ASH was tested against the latest stable version of S-ASH (2.3). There are some minor changes required to that S-ASH release in order to function properly with XPLAN_ASH. Most of them will be included in the next S-ASH release as they really are only minor and don't influence the general S-ASH functionality at all.
If you're interested in using XPLAN_ASH with an existing S-ASH installation get in touch with me so I can provide the necessary scripts that apply the necessary changes.
Rather than writing another lengthy blog post about the changes and new features introduced I thought I start a multi-part video tutorial where I explain the purpose of the tool and how to use it based on the new version - some parts of the tutorial will focus on specific functionality of the tool and are therefore probably also quite useful as some kind of general tutorial on that Oracle feature and SQL execution troubleshooting guide in general.
The tutorial will consist of six parts initially, the first two are already available - the next ones to follow over time.
Part 1: Introduction, Overview
Part 2: Usage, Parameters, Invocation
Part 3: Rowsource Statistics: TBD
Part 4: Active Session History: TBD
Part 5: Systematic Parallel Execution Skew Analysis & Troubleshooting: Coming Soon
Part 6: Experimental Stuff, Script Configuration And Internals: TBD
Feel free to post questions/requests for clarification that are not covered in the tutorials in the comments section - if there are topics of general interest I might publish a seventh part addressing those questions.
In future I might use that video style more often since it's a nicer way of conveying certain kind of information.
Believe me, this article is worth reading I’m currently not allowed to discuss Oracle 12c Database goodies but I am allowed to discuss things perhaps initially intended for 12c that are currently available and already back-ported to 11g. This includes a wonderful improvement in the manageability of how the Clustering Factor (CF) of an index can now […]![]()
Recent comments
17 weeks 4 days ago
27 weeks 3 days ago
29 weeks 23 hours ago
32 weeks 2 days ago
34 weeks 4 days ago
44 weeks 1 day ago
45 weeks 4 days ago
46 weeks 4 days ago
46 weeks 5 days ago
49 weeks 4 days ago