Highly Recommended

I've built up quite a blogging back-log that I hope to clear a little over the next week, but I saw something today that's more important than any of the posts to come so I wanted to blog about that quickly first.

John Scott, a pillar of the Oracle Apex community, came up with a great idea. There'll be no wild emotion round here because I didn't know any of the people involved well, but I can wholeheartedly recommend people buy that book. Not only can I be fairly certain that it'll be a high quality book written by people who are knowledgeable and passionate about Apex, but it's going to help out the families of a couple of good friends of the Apex community. Even my miserable and cloudy personality is struggling to see a downside there ;-)

I tip my hat to the authors and hope as many people as possible place an order.

Nice idea John!

6th Planboard DBA Symposium – Registration now open

I am pleased to announce that we, the program committee members, just finalized the program for the 6th Planboard DBA Symposium, to be held on May 17, 2011 in Amstelveen. There are 10 presentations scheduled in 2 parallel tracks by Frits Hoogland, Johan van Veen, Iloon Ellen-Wolff, Paul der Kinderen, Wouter Wethmar, Alex Nuijten, Gert […]

Virgin Media Customer Service…

I have two big customer service rants coming. This one is related to Virgin Media and the next one will be about SCAN. I’ll hold back on the second one while I wait to see if they do the right thing.

I had Virgin Media installed on 9th March and I am completely happy with the cable broadband service, but they were unable to take on my phone line because of capacity issues. That resulted in a complete mess over the billing of my service because of the way Virgin Media process discounts on bundles. I contacted customer services and after a lot of messing about the billing situation was resolved to my satisfaction.

A few weeks later I was notified of my first bill and it was completely wrong. I got in touch with customer services and after a very long time the issue seemed to be resolved AGAIN, but I was told a manager would call me by midday the same day to confirm. It is now 6 days later and I’ve not received a call. In that time I’ve been phoning, trying to get it resolved.

I’ve just got off the phone and I’ve been assured it is sorted now, but I know it will all be a steaming pile of poo next month when the next bill is produced.

Each person I’ve talked to has been fine, but the whole process and the systems involved are terrible. It stinks. In no way can this be called good customer services. It is a complete nightmare.

Sometimes bad customer service is because the people answering the phones are morons. Sometimes it is because the procedures and systems they are working with are idiotic. It comes to something when the person on the other end of the phone is unable to tell you how much you will be charged because the systems they are using are so convoluted with a variety of possible discounts that may or may not be applied…

So in conclusion, Virgin Media customer services is a festering pile of donkey crap…

Cheers

Tim…

 




Oracle11g: Analyze Table Validate Structure Cascade “FAST” (Slow Burn)

I always take notice when Oracle introduces a new “FAST” option, so it was with some excitement that I first noticed a new FAST option in Oracle 11g Release 1 for the ANALYZE TABLE VALIDATE STRUCTURE CASCADE command. This was described in the manuals as introducing a hashing scheme that was significantly [...]
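For reference, the two variants look something like this (a sketch against a hypothetical table T; the trailing FAST keyword is the 11g addition the post is talking about):

analyze table t validate structure cascade;		-- traditional row-by-row cross-check of table and indexes
analyze table t validate structure cascade fast;	-- 11g Release 1 option, described in the manuals as using a hashing scheme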

Congratulations to Yury Velikanov, now an Oracle ACE!

I’m very excited to announce this somewhat old news: we have a brand new Oracle ACE at Pythian, Yury Velikanov. He is a fantastic addition to Pythian’s team of ACEs and ACE Directors: Fahd Mirza, Gwen Shapira, Jared Still, Christo Kutrovsky and myself. If you want to know more about the Oracle ACE Program, the latest issue of Oracle Magazine has an article written by fellow Oracle ACE Mike Riley: “Everything Is Coming Up ACEs!”

I’ve known Yury for a while, as I met him online on Oracle-L many years ago. The world is small, and after living on different continents (North America and Europe), providence put us together on a third continent, Australia. As I was building our Australian business in Sydney, it took me a year to convince Yury to join Pythian, but I tell you it was worth the effort!

Yury is one of our top Senior Oracle Apps DBAs, and he is also extremely capable when it comes to Oracle core database technology and middleware, which you would expect from a very good Apps DBA. A decade ago, Yury was already fusing Oracle E-Business Suite with Oracle RAC when most people couldn’t even afford to think about it. In 2003, Yury became one of the first five Oracle Certified Masters in Europe. In other words, Yury is a very talented individual with top-notch Oracle EBS and Oracle database skills.

As you probably know, excellent technical skills are not the only quality required of an Oracle ACE. Oracle ACEs are distinguished for their contributions to the community, and Yury has always been an enthusiastic Oracle expert willing to both share his knowledge and learn from others. As a result, he is an active contributor to the Oracle community, regionally and globally.

Yury founded an Oracle E-Business Suite mailing list several years ago to help Oracle Apps DBAs around the world exchange their experiences and help others. In Australia, Yury has become an organizer of the Sydney Oracle Meetup – an informal group of Oracle professionals meeting regularly face to face.

Yury also has a long list of conferences he has presented at, including the Hotsos Symposium, UKOUG, AUSOUG and others. Recently, Yury has also started blogging on the Pythian blog and is planning to do more and more.

I would like to note that Yury is a person who never refuses to help; he will lose sleep and work 42 hours a day if needed, but he won’t step away if asked for help. I’ve been there myself, and I know that he is a very approachable individual who will do whatever it takes.

After all this, how could I not nominate him as an Oracle ACE? Right. There was no way! Now, a few months later, Yury has happily joined our Oracle ACE team at Pythian. It actually happened a few weeks ago; it just took me this long to write about it!

Oracle Troubleshooting TV Show: Season 1, Episode 01 ;-)

Ok, it’s official – the first and only Oracle Troubleshooting TV show is live now!

The first show is almost 2 hours long, covering ORA-4031 errors and shared pool hacking. It’s a recording of the US/EMEA timezone online hacking session I did a few days ago.

There are a couple of things to note:

  1. The text still isn’t as sharp as in the original recording, but it’s much better than in my previous upload attempts and is decently readable. I’ll try some more variations with my next shows, so I hope the text quality will get better! Or maybe I should just switch to GUI tools or PowerPoint slides? ;-)
  2. You probably should view this video in full screen (otherwise the text will be tiny and unreadable)
  3. There’s advertising at the beginning (and maybe the end) of this show! I’ll see how much money I make out of this – maybe these shows will start contributing towards the awesome beer selection I’ll have in my fridge some day (right now I have none). Viewing a 30-second advert is a small price to pay for 2 hours of kick-ass shared pool hacking content!!!
  4. You can download the scripts and tools used in the demos from http://tech.e2sn.com/oracle-scripts-and-tools/
  5. Make sure you check out my online Oracle troubleshooting seminars too (this April and May already)

View the embedded video below or go to my official Oracle Troubleshooting TV show channel:

http://tanelpoder.blip.tv

Enjoy!

MOATS - mother of all tuning scripts


Basically (for now) MOATS is a TOP-style utility for Oracle which runs inside SQL*Plus, with no need to install any external tools, binaries or scripts.

ASSM wreck

Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.

The model starts with a simple data set – which in this case is created in a tablespace using ASSM (automatic segment space management), an 8KB block size and 1MB uniform extents (in a locally managed tablespace).
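For reference, a tablespace with those characteristics could be created along these lines (a sketch only – the datafile name and size are assumptions, and the database default block size is assumed to be 8KB):

create tablespace test_8k_assm
	datafile '/u01/oradata/TEST/test_8k_assm_01.dbf' size 200m
	extent management local uniform size 1m
	segment space management auto
;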


create table t1
tablespace test_8k_assm
as
select
	trunc((rownum-1)/100)	n1,
	lpad('x',40)		v1,
	rpad('x',100)		padding
from
	dual
connect by
	rownum <= 20000
;

create index t1_i1 on t1(n1, v1)
tablespace test_8k_assm
;

validate index t1_i1;
execute print_table('select * from index_stats');

You can see that the n1 column is defined to have 100 rows for each of 200 different values, and that each set of one hundred rows is stored (at least initially) in a very small cluster of blocks.
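If you want to check that claim, a quick query like the following (not part of the original test) reports the row count and the number of distinct table blocks used for each value:

select
	n1,
	count(*)						row_count,
	count(distinct dbms_rowid.rowid_block_number(rowid))	block_count
from
	t1
group by
	n1
order by
	n1
;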

With the data set in place I am now going to pick a key value at random, delete its set of one hundred rows, re-insert them, and commit; and I’m going to repeat that process 1,000 times.

declare
	rand	number(3);
begin
	for i in 1..1000 loop

		rand := trunc(dbms_random.value(0,200));

		delete from t1
		where n1 = rand
		;

		insert into t1
		select
			rand,
			lpad('x',40),
			rpad('x',100)
		from
			dual
		connect by
			rownum <= 100
		;

		commit;

	end loop;
end;
/

validate index t1_i1;
execute print_table('select * from index_stats');

You might think that this piece of code is a little strange – but it is a model of some processing that I’ve recently seen on a client site, and it has crossed my mind that it might appear in a number of systems hidden underneath the covers of dbms_job. So what does it do to the index ?

Given the delay that usually appears between the time an index entry is marked as deleted and the time that the space can be reused, and given the way I’ve engineered my data so that the space needed for the 100 rows for each key value comes close to filling a leaf block (an important feature of this case), I wouldn’t be too surprised if the index had stabilised at nearly twice its original size. But that’s not what happened to my example running under ASSM. Here are the “before” and “after” results from my test:


                       Before         After
                    ---------     ---------
LF_ROWS                20,000        70,327
LF_BLKS                   156           811
LF_ROWS_LEN         1,109,800     3,877,785
BR_ROWS                   155           810
BR_BLKS                     3            10
BR_ROWS_LEN             8,903        45,732
DEL_LF_ROWS                 0        50,327
DEL_LF_ROWS_LEN             0     2,767,985
DISTINCT_KEYS             200           190
MOST_REPEATED_KEY         100         1,685
BTREE_SPACE         1,272,096     6,568,320
USED_SPACE          1,118,703     3,923,517
PCT_USED                   88            60
ROWS_PER_KEY              100           370
BLKS_GETS_PER_ACCESS       54           189

It’s a small disaster – our index has grown in size by a factor of about five, and we have more deleted rows than “real” rows. (Note, by the way, that the awfulness of the index is NOT really indicated by the PCT_USED figure – one which is often suggested as an indicator of the state of an index).
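A more telling check is to compare the deleted entries with the total entries – something along these lines (a sketch, run in the same session immediately after the validate, since index_stats holds only the results of the last validation) makes the scale of the problem obvious:

select
	lf_rows,
	del_lf_rows,
	round(100 * del_lf_rows / lf_rows, 1)		pct_deleted,
	round(100 * used_space / btree_space, 1)	pct_space_used
from
	index_stats
;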

Unfortunately this is the type of problem that doesn’t surprise me when using ASSM; it’s supposed to help with highly concurrent OLTP activity (typified by a large number of very small transactions) but runs into problems updating free space bitmaps whenever you get into “batch-like” activity.

However, there is a special consideration in play here – I’ve run the entire operation as a single pl/sql loop. Would the same problem appear if I ran each delete/insert cycle as a completely independent SQL script using the “start_1000.sql” script from my previous note ?

To test the effect of running 1,000 separate tasks, rather than executing a single pl/sql loop, I wrote the following code into the start_1.sql script that I described in the article before running start_1000.sql:


declare
	rand	number(3);
begin

	rand := trunc(dbms_random.value(0,200));

	delete from t1
	where n1 = rand
	;

	insert into t1
	select
		rand,
		lpad('x',40),
		rpad('x',100)
	from
		dual
	connect by
		rownum <= 100
	;

	commit;

end;
/

The impact was dramatically different. (Still very wasteful, but quite a lot closer to the scale of the results that you might expect from freelist management).


                       Before         After
                    ---------     ---------
LF_ROWS                20,000        39,571
LF_BLKS                   156           479
LF_ROWS_LEN         1,109,800     2,196,047
BR_ROWS                   155           478
BR_BLKS                     3             6
BR_ROWS_LEN             8,903        26,654
DEL_LF_ROWS                 0        19,571
DEL_LF_ROWS_LEN             0     1,086,247
DISTINCT_KEYS             200           199
MOST_REPEATED_KEY         100           422
BTREE_SPACE         1,272,096     3,880,192
USED_SPACE          1,118,703     2,222,701
PCT_USED                   88            58
ROWS_PER_KEY              100           199
BLKS_GETS_PER_ACCESS       54           102

I haven’t yet investigated why the pl/sql loop should have produced such a damaging effect – although I suspect that it might be a side effect of the pinning of bitmap blocks (amongst others, of course) that takes place within a single database call. It’s possible that the repeated database calls from SQL*Plus keep “rediscovering” bitmap blocks that show free space while the pinning effects stop the pl/sql from “going back” to bitmap blocks that have recently acquired free space.

Interestingly the impact of using ASSM was dramatically reduced if one object used freelists and the other used ASSM – and with my specific example the combination of a freelist table with an ASSM index even did better than the expected 50% usage from the “traditional” option of using freelists for both the table and index.

Note – the purpose of this note is NOT to suggest that you should avoid using ASSM in general; but if you can identify code in your system that is doing something similar to the model then it’s worth checking the related indexes (see my index efficiency note) to see if any of them are displaying the same problem as this test case. If they are you may want to do one of two things: think about a schedule for coalescing or even rebuilding problem indexes on a regular basis, or see if you can move the table, index, or both, into a tablespace using freelist management.
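In SQL terms those options come down to something like the following (a sketch only – test_8k_freelist is an assumed name for a tablespace created with segment space management manual, and note that moving the table marks its indexes unusable, so the index has to be rebuilt anyway):

-- coalesce the index on a regular schedule
alter index t1_i1 coalesce;

-- or rebuild it completely
alter index t1_i1 rebuild;

-- or move table and index to freelist (manual segment space) management
alter table t1 move tablespace test_8k_freelist;
alter index t1_i1 rebuild tablespace test_8k_freelist;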

I am Neo off the Matrix (apparently)

I know I have mentioned it before, but I am a big fan of the OEM performance screens that are derived from the ASH/AWR information. One of the things I really like about them is the immediate indication they give you, at a glance, that things are “not normal”. Once you notice that things are not normal you can then, within a few seconds, get a feel for what is still probably OK and where you have something that has changed.

As an example of the immediate information, I recently came back to my desk and glanced at my OEM performance screen. It was showing the following:


[Screenshot: my OEM performance page at that moment. It will not be an interesting picture to you, but it tells me a lot about my system.]

“The data load has just run,” I said to my comrade-in-arms. “Which one?” he asked. “The Delta – it ran the quick plan, but it started a bit late, 12:15. Oh, and it looks like the transaction view code has swapped back to the full table scan plan, and the summary code is not playing up at the moment.”

“you’re turning into Neo you are – can you see a lady in a red dress???” he asked.

That was of course a reference to the “Matrix” films, where at times you see the virtual world displayed as a stream of characters running down a screen – but once you get used to it you can apparently “see” what is going on.

The screen shot above is not even actually a very good example of what the performance screens can show you. One of my minor complaints about the performance screens is that the graph scales to the larger of the highest recent peak and the number of sessions matching the number of CPUs (real or fake) that are available to you. So if you have more CPU available than you need, you can’t see much detail in the graph. And if you have had a nasty peak of activity, again, all detail is squeezed out. In my case, the box is sized to cope with the load expected in 12 months’ time and the system is new, so activity is scuttling along the bottom of the graph.

However, “poor” though the example is, it told me what was going on across my system at a glance, something about the major tasks we are running, that one problem is currently occurring and that several of the other issues I need to keep an eye out for are not occurring.

That is why I love these screens – I recognise “my” activity patterns from the graph, and I now recognise the SQL IDs for my key statements. If I see a pattern in the graph I don’t recognise, I need to check things out immediately. Three or four times over the last 2 weeks I have spotted an issue, started investigating and found the cause before the Operations desk had even noticed a problem.

Oh, and what is SQL type 189? It is a merge statement. Our implementation of OEM is a little old and does not correctly interpret that SQL command type. It might be a little old, but it is still a lot of use.
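If you want to check the mapping yourself, recent versions (11g onwards, if I remember correctly) expose it through v$sqlcommand – a query like this should come back with MERGE:

select command_type, command_name
from   v$sqlcommand
where  command_type = 189;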

Investigating Enqueues Burns CPU Cycles

March 30, 2011 If you take a look at the Oracle Database Performance Tuning Guide, you will find the following SQL statement to help identify the session waiting in an enqueue and the session that currently prevents that session from continuing: SELECT DECODE(request,0,'Holder: ','Waiter: ') || sid sess, id1, id2, lmode, request, type [...]
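The excerpt above is truncated; for reference, the query in that part of the documentation runs along these lines (reconstructed from memory, so treat it as a sketch rather than a verbatim quote of the guide):

SELECT DECODE(request,0,'Holder: ','Waiter: ') || sid sess,
       id1, id2, lmode, request, type
  FROM v$lock
 WHERE (id1, id2, type) IN
       (SELECT id1, id2, type FROM v$lock WHERE request > 0)
 ORDER BY id1, request;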