Search

Top 60 Oracle Blogs

Recent comments

Compression Units – 3

For those who have read the previous posting of how I engineered an Exadata disaster and want to reproduce it, here’s the script I used to generate the data.

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		level <= 1e4
)
select
	rownum					id,
	lpad(rownum,10,'0')			v1,
	mod(rownum,10000)			n1,
	trunc(rownum/100)			n2,
	trunc(dbms_random.value(0,262144))	n_256K,
	trunc(dbms_random.value(0,131072))	n_128K,
	trunc(dbms_random.value(0,8192))	n_8k,
	rpad(dbms_random.string('U',3),60)	padding
from
	generator	v1,
	generator	v2
where
	rownum <= 1048576
;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

create table t1_qh
compress for query high
as
select * from t1
;

create table t1_ah
compress for archive high
as
select * from t1
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		method_opt	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1_QH',
		method_opt	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1_AH',
		method_opt	 => 'for all columns size 1'
	);
end;
/

create bitmap index t1_bi on t1(n_128K);
create bitmap index t1_qh_bi on t1_qh(n_128K);
create bitmap index t1_ah_bi on t1_ah(n_128K);

select
        max(padding)
from
        t1_ah
where
        n_128k between 1000 and 1999
;

You’ll need about 5 GB of space to create all these objects – but that shouldn’t be too difficult on the typical Exadata system. As you can see, I’ve created the data to cover a variety of tests and experiments, including comparing the side effects of different levels of compression (or not).

When you run the query, you may find that the optimizer arithmetic picks the tablescan option automatically – the choice will depend basically on your setting for the db_file_multiblock_read_count and/or the settings for the various system stats (sreadtim, mreadtim, MBRC and CPUspeed, or the equivalent no-workload settings); so, to see the performance impact, you may have to hint the query. (It’s slightly worrying that when you set up a (datawarehouse) system on Exadata it might be a good idea – at least in the short term – to set the db_file_multiblock_read_count to 128 because you really do want Oracle to think that tablescans are, in general, a pretty good fast option – we’ve only just got ver persuading people that you shouldn’t set this parameter in any version of 10g or above !

After running the query twice on each table (to get all the data cached somewhere, if possible) the most recent timings I got were as follows for the indexed access path (the 10% difference between this test and the time I reported in my last test could possibly be explained by the fact that this machine had seen some serious volume testing while the other machine was so new that I had probably been using the outer edge of every disc):

  • No compression – 0.7 seconds.
  • Query high – 77.24 seconds
  • Archive high -  3022.83 seconds

The number of different ways you can test against just this set of data is quite surprising – so don’t feel you have to stop with just the one demonstration.

Footnote – prompted by Kerry Osborne’s note, I’ve realised that my comment about disks and outer edges etc. is completely irrelevant. The entire time spent was CPU time, and the data was all cached when I ran these tests. I should have checked the machine types more closely – the slower machine was a V2, the faster was an X2.