Webinar: The right way to secure Oracle by Pete Finnigan - Wednesday 29 September 2010

I am going to present my paper "The right Way to Secure Oracle" on Wednesday 29th September, 12:00pm - 1pm EDT, which should be 17:00 UK time (we are on BST at the moment and New York is....[Read More]

Posted by Pete On 27/09/10 At 04:14 PM

A million kilometers in two years…

I’ve been tracking my business travel with Tripit.com‘s awesome service for about 2 years now.

After getting back from my Tallinn->Helsinki->New York->Detroit->New York->San Francisco->New York->Helsinki->Tallinn trip yesterday, Tripit reported that I have flown 1 007 509 km during my business trips (American readers, that’s about 42 miles ;)

Check yourself below :)

Tripit says I’ve visited 71 different cities in 27 countries within the last two years.

Here’s the map of places where I’ve visited my clients, done training or spoken at conferences:

Actually there are probably a couple more cities where I’ve been in the last two years; for some reason Tripit doesn’t recognize my trip to Melbourne (but it does show the visit to Sydney, which I did during the same trip).

Anyway, the conclusion here is that I think I’ve done enough flying for now. Now I plan to stay at home for a loooong time (I mean at least 3-4, maybe even 5 weeks in a row!!! ;)

But seriously, what I’ve decided is that:

  1. I won’t do any more public on-site seminars (with only few exceptions).
  2. I will move all my public seminar offering to web-based online seminars (using Citrix gotomeeting.com’s service), which I’ll deliver in person.
  3. I will still do private corporate training on-site occasionally, which offers flexibility of customizing the content and focus areas of the seminar to match the customer’s needs
  4. I will also offer private corporate online training, which gives much greater flexibility for choosing the seminar duration and times etc (it’s possible to spread a seminar to 1-day sections, each day delivered on a separate week, to reduce the impact of people being away from their work)
  5. I will still do consulting & advanced troubleshooting where I usually solve even the most complex problems in matter of couple of days (like explained here for example)

Ok, enough of self-promotion and advertising, back to work ;-)

P.S. I will publish my online seminar schedule “very soon now”!!!

P.P.S. I’m not affiliated with Tripit.com in any way business-wise, but if you travel frequently I recommend you check out their awesome service (and iPhone app). The basic version is free, but I just decided to upgrade to Pro after a couple of years of using it!

#OOW10 Redux

A few days after my first visit to Oracle OpenWorld in over 15 years, I’ve had time to digest the experience and wanted to summarize my thoughts and “analysis”.  Attending with my wife, who still works for Oracle and is a director associated with Exadata POCs, was also a fun time – between her contacts within Oracle and my “contacts” in the outside “Oracle underground” (as my wife refers to things like the OakTable network) I think we were able to get a nice, full picture of the conference.

I thought there were many interesting “themes” at the conference, from the official ones like Exalogic and the Fusion Apps to unofficial ones like how Exadata has been changing the existing dynamic around databases in the enterprise.

Exalogic was an enigma at the conference – as the opening keynote announcement, repeated at the closing keynote, one would have thought there would be a lot of buzz about the box.  Instead, I saw a fair amount of confusion as people debated the meaning of “cloud” vs. “box” vs. “appliance” vs. “platform”.  Information about the configuration was scarce, and many people at the conference seemed to ignore it completely.  From initial descriptions (Ellison’s claim, for example, that 2 of them could handle all of Facebook’s web traffic) it appears that Oracle has built a high-performance, high-density Java platform that, coupled with several Exadata database platforms, could easily handle most, if not all, of an enterprise’s application and data hosting needs.  It remains to be seen whether organizations could actually run their entire application portfolio on a set of Exa-platforms – and non-Oracle software “support” seemed to be “lacking” on the Exalogic.  (I visited the Informatica booth to ask them if PowerCenter could run on the Exalogic – their response, “What’s an Exalogic?”, was telling…)

I opined at the conference that Ellison was trying to replace the generic hosted “LAMP” stack with an “LJE” stack: Linux, Java, Exa.  And in the main, I think it’s a good idea – the ability to provide a private shared hosting environment within an enterprise is attractive.  (Full disclosure – I used to work for Network Solutions, so I have a fair appreciation of hosting environments.)  It was interesting to see the response to my “tweet” on the subject, as several people shot back that an LJE stack is so much more expensive than a LAMP stack.  And for small applications, they’re right.  However, contrast the size of the Facebook LAMP stack with how Ellison described a hypothetical Facebook LJE stack (2 Exalogic platforms, several Exadata platforms) and I’d guess that the price difference would be less than most people expect – not to mention a lot more energy- and space-efficient.  As an example of the hosting paradigm, I enjoyed a presentation by Nicholas Tan (Commonwealth Bank of Australia) and Roland Slee (Oracle) in which they combined 300 databases into 1 database and started providing internal database hosting with dramatic cost and time savings.

In any event, the rest of the conference seemed to be more focused on Exadata, as more and more customers are coming to grips with it.  Unfortunately, I noticed several “Exadata” presentations that had little Exadata-specific content – as Cary Millsap tweeted, it appeared to be a magic word that got presentations onto the agenda.  The 2 new Exadata boxes were nice technology refreshes, and Alex Gorbachev has a good side-by-side comparison of them.  On another note, I wondered if Oracle was introducing Exadata models “too quickly” – I heard rumblings from attendees with v1 HP-based Exadata boxes about how they’re being left behind a bit.

As my first conference trying to fully utilize the power of Twitter (I know, I’m late to the party), I was happy with how it semi-allowed me to “attend” several sessions at once vicariously through other people.  Greg Rahn and Gwen Shapira’s tweets kept me in the loop on other sessions all day.  I was particularly happy to be following the analysts during their “private” sessions with Oracle – Merv Adrian and James Kobelius were very insightful and “non-snarky”.  James’ tweets around the Exadata Intelligent Data Warehouse (whereby SAS PMML models could be run directly on Exadata platforms) were especially interesting to me.

In the end, I started to see somewhat of a sea change occurring: more of an emphasis on applying the power of Exadata to problems rather than advice on how to technically configure them.  And I think Ellison agrees with that – one of his statements was that customers shouldn’t be defining themselves on the basis of how unique their configurations are, but rather on what they are doing with them.  Of course, that kind of statement starts to imply a reduction in configuration choices – run whatever you want as long as it’s on an Exadata or is running Oracle Linux or Oracle Solaris (x86).  (I went to a session on Database Flash Cache in which Oracle all but admitted that the feature was limited to operating systems under Oracle’s “control”.)  And Ellison’s pointed comments about Red Hat Linux didn’t go unnoticed by the crowd either.  In any event, choice in the hardware space has been narrowing for some time, as non-Intel and non-IBM architectures continue to decline in the market (with the exception of the ARM-based “systems”).  (BTW, speculation around Oracle and ARM has been entertaining.)  Ellison’s rather clear desire to provide pre-built solution platforms is also a subtle comparison to IBM and its large services arm – it will be fascinating to watch how this plays out in the market.

This narrowing of choice is continuing into the DBA space, as I believe the range of ways a production DBA can affect the performance of applications continues to diminish – and not in a bad way, but rather in the sense that the database defaults are getting better and better and reduce the configuration options that DBAs need to manage.  From Greg Rahn’s comments about how Exadata defaults seem to handle most reasonable workloads without difficulty, to Alex Gorbachev’s recommendation to use OMF, to Tom Kyte’s presentation on how smart the optimizer is getting (how it eliminates tables from queries when it determines that they are not necessary), it’s becoming clear to me that the real ability to affect performance is shifting from the production DBA to what I term database engineering and/or development.

Watch Cary Millsap’s interview with Justin Kestelyn and you’ll see what I mean – I think it’s imperative that DBAs who are interested in and talented at performance work become more involved in software development, either by becoming dedicated development DBAs or by forging ahead into database engineering.  I had a good, quick talk with Gwen Shapira about this at the blogger’s meetup.  And I was also struck by information from the Real World Performance team sessions in which they showed how application architectural choices affected design – Greg Rahn spoke about how necessary it is to start thinking in terms of set processing in order to harness the power of multiple compute and storage nodes; and in an understated demonstration, Andrew Holdsworth showed how fewer connections to the database can result in more scalability.  These all speak to development and application choices in my mind.

Finally, I had a good time trying to attend sessions at the Hilton for the Oracle Develop conference.  Cary and I talked about the wisdom of developing plug-ins for SQL Developer vs. Eclipse, and I was pleasantly surprised to see Quest announce their Toad Plug-In for Eclipse at the conference.  With the demise of SQL*Plus for Windows and the need to integrate more database development tools into the developer experience, these discussions (not to mention the reduction in price of SQL Developer’s Data Modeling tool to free!) really hit home for me – now if only we could get better SCM integration – perhaps Eclipse with the Quest Plug-in, Method-R MR Trace Plug-in and Mercurial Plug-in will do it for me…

(I like Mercurial because it emphasizes changes and change-sets – which I think is a better match to database refactoring than Subversion – but that’s a topic I’m still researching).

iPad. Thoughts so far…

In one of my previous posts I mentioned I bought an iPad whilst at OpenWorld. Well it’s a few days old so I thought I’d write a little something about my thoughts so far.

I was originally very skeptical about the whole iPad thing. I switched from a 17 inch laptop to a 13 inch MacBook Pro to reduce the weight of the bag I was lugging round at conferences. At OpenWorld I got so sick of the weight of the MacBook I decided to buy the iPad and I must say, from a weight perspective at least, it is a massive improvement.

Having not been part of the iPhone crowd I was initially very confused by the interface. Reading content was a breeze but editing and typing was truly arduous. Over the last few days I’ve got used to it and for the most part it is OK. I’m typing this blog post now using it. Having said that, without a separate keyboard it is not a good device for content generation. Simple text is fine, but cut & paste is a complete pain. Don’t even get me started on pasting hyperlinks into blog and forum posts. Too much effort.

Where it really comes into its own is for reading content. I’ve got Kindle for iPad and the books are really neat and easy to read. I’ve also moved some PDFs of the Oracle docs onto iBooks and they are much easier to read on the iPad than on the Kindle. Very impressive. The screen does suffer in bad light compared to the Kindle, so if you are only going to read novels, then save a load of cash and use a Kindle. If on the other hand you want to browse the net, check emails and do some minor content generation, like blogging and twitter, then the iPad is great.

Two other factors that weigh heavily in its favour compared to a laptop are the instant on/off and excellent battery life. No more hassle at conferences and on planes. Just switch it on and go.

It’s early days, but I think this will now replace my laptop for everything except presentations with demos.

Cheers

Tim…

Saturday Night Peter…

“Saturday Night Peter” is the second part of the autobiography of the comedian Peter Kay. The first book, “The Sound of Laughter”, finishes just as Peter gets his first break into comedy. The second book picks up from that point, documenting his Saturday gigs while he progresses up the ladder into the big time.

The first book was just plain funny. It had me laughing out loud a lot. The second book was not full of belly laughs. It was more amusing than full on funny. Still worth a look if you like the guy.

Cheers

Tim…

OpenWorld 2010: Day 4-5

Day 4 & 5 followed pretty much the same pattern for me. Lots of time in The Zone, a few presentations and lots of chatting.

On Wednesday I met up with a former colleague from the UK, now based in Denmark. He’s an Aston Villa fan, but I try not to hold that against him. Last time we met up was in Copenhagen and I blame him for me being very ill the next day. :)

Wednesday evening was the Bloggers meet up, which was very busy. It’s good to be able to put a face to blogs you read.

I missed the wrap party on Thursday as I had to fly home. The trip home was not the best because I didn’t have an aisle seat. When I don’t have an aisle seat I get really fidgety because I am nervous about having to repeatedly ask the person next to me to let me out. As a result I usually end up standing for the whole trip. I was forced to sit through three bouts of turbulence, but for the rest of the flight I stood at the back of the plane. I guess I was standing for about 8-9 hours. Never mind.

Thanks to everyone at OTN and the ACE program for another great OOW.

Cheers

Tim…

My Actual OTN Interview

And now, the actual OTN interview (9:11) is online. Thank you, Justin; it was a lot of fun. And thank you to Oracle Corporation for another great show. It's an ever-growing world we work in, and I'm thrilled to be a part of it.

Cleanup of partitioned objects residing in offlined tablespaces

If a large database containing historical data - typically a data warehouse with partitioned tables whose partitions reside in different tablespaces depending on their age - is supposed to be duplicated, but the target environment (typically a lower environment, such as duplicating Production to UAT) doesn't have sufficient space to hold the complete database, and in particular all the historic partitions, then you somehow need to deal with that partial clone.

Of course, in an ideal world the lower environment is supposed to have sufficient space to hold the complete database, but we don't live in an ideal world. Sometimes, depending on the storage technology, you might be able to do some kind of temporary "split mirror" operation that allows you to start up a complete clone of the database and drop all the affected objects / tablespaces before putting the cleaned up / partial database on the final target system, but this is not always applicable, for instance when using ASM-based storage.

The issue in that particular case is that you can, for instance, do a partial RMAN clone that omits the historical tablespaces, but the resulting database then obviously has some tablespaces that are unavailable and can't be onlined since the datafiles are simply missing.

In case of objects that reside completely in the offlined tablespaces the solution is simple and straightforward: the objects can be dropped, or the affected tablespaces can simply be dropped with the "including contents" clause.

Things get more complicated however in case of partitioned objects if some of the partitions reside in the offlined tablespaces.

In that case you can't simply drop the offlined tablespaces - you end up, for example, with "ORA-14404: partitioned table contains partitions in a different tablespace", which tells you that objects exist whose partitions are not all contained in the offlined tablespace, and Oracle obviously won't do the cleanup job for you in that case.

Oracle allows you to drop partitions of a table that reside in offlined tablespaces, but there is an important limitation: you can only do so if no index has been defined on the table. Otherwise you get the error message "ORA-14117/ORA-14630: [sub]partition resides in offlined tablespace" when attempting to drop the partition. It's interesting to note that the mere existence of indexes is sufficient - it doesn't matter whether the index resides in an offlined or onlined tablespace, or whether it is unusable or not - as soon as at least one index is defined on the table, the drop partition operation errors out. It looks like a hard-coded "if index exists then error" code path. I'm not sure why exactly this restriction is in place, but it has serious consequences if you need to do this sort of cleanup for a large database with hundreds of thousands of partitions.
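To illustrate the restriction, here is a minimal sketch using hypothetical names (a partitioned table SALES with old partitions in the offlined tablespace TS_HIST whose datafiles are missing, and a local index SALES_IDX):

-- Dropping the whole tablespace fails because SALES also has partitions elsewhere
drop tablespace ts_hist including contents;
-- ORA-14404: partitioned table contains partitions in a different tablespace

-- With any index defined on the table - usable or not, in any tablespace -
-- dropping a partition in the offlined tablespace errors out
alter table sales drop partition p_2007;
-- ORA-14117: partition resides in offlined tablespace

-- Only once all indexes on SALES are gone does the drop succeed
drop index sales_idx;
alter table sales drop partition p_2007;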

By the way, it is interesting to know that Oracle allows you to drop segments from offlined tablespaces that are locally managed - think about the consequence: the local bitmaps that represent the allocated and free space in the tablespace cannot be updated at the time of the drop operation. This is one of the few disadvantages compared to the old-fashioned dictionary-managed tablespaces, where an operation like that could potentially have been limited to a pure dictionary operation.

The approach Oracle chooses is straightforward: the dropped segments are converted into "temporary" objects (a kind of special "recycle bin", if you will). If the offlined tablespace gets onlined again, the local bitmaps are updated according to these temporary objects and the temporary objects are finally removed from the dictionary. If instead the offlined tablespace gets dropped, the corresponding temporary objects can consequently also be removed.

Tanel Poder recently also blogged about this in the context of read-only tablespaces.

So you could now drop all indexes from the affected tables and recreate them afterwards, but this is not a good idea for several reasons:
* If the remaining partitions still represent multiple terabytes, rebuilding all indexes will take quite a long time
* Things might go wrong and you could end up with missing indexes

Since these are partitioned objects, another Oracle best practice comes to the rescue: exchange the affected partitions with an exchange table, excluding indexes - this allows you to drop the exchanged partition residing in an offlined tablespace without any error message. Note that this means that after you've exchanged the partition you need to drop the unpartitioned object (which doesn't have indexes) in order to get rid of the affected segment.
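As a rough sketch of that idea, again with hypothetical names (the full implementation used in practice is shown further below):

-- Stand-alone, index-free copy of SALES
create table sales_exchange as
select * from sales where 1 = 0;

-- Swap the defect partition into the copy; no error despite the offlined tablespace
alter table sales
exchange partition p_2007 with table sales_exchange
excluding indexes without validation;

-- SALES_EXCHANGE now owns the defect segment and has no indexes,
-- so it can simply be dropped to get rid of that segment
drop table sales_exchange purge;

-- The SALES partition now points at the (empty) segment that used to belong to
-- SALES_EXCHANGE and therefore resides in an available tablespace, so it can be
-- dropped even though indexes exist on SALES
alter table sales drop partition p_2007;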

Now if you only have a few affected objects, or only a single partition to clean up per table, then you're basically done. But if you need to clean up multiple partitions and the number of partitions is high, this approach is not feasible, because dropping and creating a table for each partition that needs to be cleaned up is quite an overhead - in particular when talking about hundreds of thousands of partitions.

The approach of creating a table to exchange partitions with adds another complexity to the problem: many large databases make use of Oracle's option to set columns unused instead of actually dropping them, since the former is a simple metadata operation in the dictionary (it renames the column and marks it as hidden) that is independent of the data volume and therefore almost instant, whereas the latter requires Oracle to process the whole segment, which can take quite some time in case of huge segments (and is not a very efficient operation by the way, but that is something for a different post).

Now, a table that can be exchanged with such an evolved table containing unused columns cannot be created by a handy CTAS operation - the unused columns will be missing from the copy and therefore the exchange operation will fail with "column mismatch" errors.
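A hypothetical illustration of the problem (names made up; the exact error may vary by version, but it is typically a column mismatch such as ORA-14097):

-- Column set unused on the source table: a metadata-only operation
alter table sales set unused column legacy_col;

-- A plain CTAS copy knows nothing about the unused column ...
create table sales_exchange as
select * from sales where 1 = 0;

-- ... so the internal column layouts differ and the exchange fails
alter table sales
exchange partition p_2007 with table sales_exchange
excluding indexes without validation;
-- ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION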

So you need to deal with these unused columns somehow the moment you have to create such a table dynamically, and you do have to do that if you need to drop more than a single partition per table.

The only other viable option that I'm aware of is to permanently maintain this exchange table as a partitioned table itself - which means creating two exact copies (in terms of dictionary metadata) of every potentially affected table and making sure to apply the same DDL in the same order, which ensures that the exchange operation with those tables will succeed.

It takes two copies, one unpartitioned and one partitioned, if you want to avoid the drop and dynamic create table operation, since Oracle allows partitions to be exchanged only with unpartitioned tables. So it takes two exchange operations per affected partition - the first with the unpartitioned table, and the second with the partitioned table. Both exchange operations have to be done excluding indexes, and at least the final partitioned table has to be created excluding indexes - note that Oracle allows the exchange operation even if one table is missing a potentially created primary key constraint, provided you use the "EXCLUDING INDEXES" clause.

Then you can drop the partition that has been exchanged into the second table since the table doesn't have any indexes defined on it.

This allows you to accomplish two objectives:

1. It avoids a costly (and, in case of unused columns, potentially complex) drop / create table operation per partition
2. The drop / create cycle is managed by simply dropping and adding partitions in the second table, which is far less overhead than a drop / create table operation - important when performance matters

However, maintaining two copies of every table that potentially needs to be cleaned up sounds like a huge overhead, and even in Oracle 11.2 with deferred segment creation at least the partitioned table copy will allocate space, since the deferred segment creation option is not (yet) applicable to partitioned objects.

So here are a few ideas on how to accomplish such a cleanup operation:

1. For each table to process, create the required table copies on the fly dynamically, once. If you don't have to deal with unused columns, then a simple CTAS operation can be used to create those tables. If you need to handle unused columns, a more complex approach is required: you will have to use custom queries against the data dictionary to extract the required information, since the official DBMS_METADATA interface as far as I know doesn't expose information about unused columns.

The following query could be used as a starting point to extract the column definition of a table:

-- Get the column list including hidden columns, but ignoring virtual columns
-- added by function-based indexes
-- TODO: What about virtual columns (11g or later)?
select
       decode(t.hidden_column,
              'YES', 'SYS$' || internal_column_id || '$UNUSED',
              t.column_name
       ) ||
       ' ' ||
       lower(data_type) ||
       decode(substr(data_type, 1, 9),
              'TIMESTAMP', null,
              'INTERVAL ', null,
              decode(t.data_type,
                     'NUMBER', decode(t.data_precision || t.data_scale,
                                      null, null,
                                      '(' ||
                                      decode(t.data_precision,
                                             null, '*',
                                             t.data_precision
                                      ) ||
                                      decode(t.data_scale,
                                             null, null,
                                             ',' || t.data_scale
                                      ) ||
                                      ')'
                     ),
                     'FLOAT', '(' || t.data_precision || ')',
                     'LONG', null,
                     'LONG RAW', null,
                     'BLOB', null,
                     'CLOB', null,
                     'NCLOB', null,
                     'BFILE', null,
                     'CFILE', null,
                     'BINARY_FLOAT', null,
                     'BINARY_DOUBLE', null,
                     'MLSLABEL', null,
                     'ROWID', null,
                     'UROWID', null,
                     'DATE', null,
                     '(' ||
                     nvl(
                         nvl(
                             decode(t.char_length,
                                    0, to_number(null),
                                    t.char_length
                             ),
                             t.char_col_decl_length
                         ),
                         t.data_length
                     ) ||
                     decode(t.character_set_name,
                            'CHAR_CS', decode(t.char_used,
                                              'C', ' char',
                                              'B', ' byte'
                            )
                     ) ||
                     ')'
              )
       ) ||
       ' ' ||
       decode(nullable,
              'N', 'not null',
              null
       ) ||
       ',' as column_def
from
       all_tab_cols t
where
       t.owner = :owner              -- placeholder: owner of the source table
and    t.table_name = :table_name    -- placeholder: name of the source table
and    t.segment_column_id is not null
order by
       t.internal_column_id;

This will generate SYS$n$UNUSED column names (where n is the internal column ID) for all columns set to unused in the source table; these columns can then subsequently be set to unused in the newly created table to give it the same layout as the source table.

Note that it hasn't been tested with virtual columns yet, but it should deal with most of the available data types and also handle columns using char semantics correctly (important for NCHAR / NVARCHAR based columns and databases running with multi-byte character sets, in particular AL32UTF8). It also handles virtual columns added via function-based indexes (they are skipped, since they don't have a segment).

Of course the extraction process potentially needs to handle much more than shown here, for instance check the table properties like index organized tables, partition keys etc.
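As a hypothetical sketch of the follow-up step just described for point 1: after creating the copy from the generated column list, the placeholder columns are set unused so that the copy's internal column layout matches the source table and the exchange can succeed.

-- SYS$3$UNUSED is the placeholder generated by the query above for the column
-- with internal column id 3 (hypothetical name and table)
alter table sales_exchange set unused column sys$3$unused;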

2. Process each affected table partition using the following approach:

* Add a new "clean" partition to the partitioned copy table, in a tablespace that is online. The partition key is a "dummy" key that can be the same for each iteration of this process. This is the segment that is going to end up in the affected partitioned table and will finally be dropped, since by then it resides in an available tablespace. (Remember, we want to get rid of the partitions residing in unavailable tablespaces, since we don't have sufficient space in the clone of the database to make them available.)

* Exchange the affected partition with the unpartitioned copy table EXCLUDING INDEXES (The unpartitioned table copy has also been initially created in an available tablespace)

* Exchange the unpartitioned copy table (that now holds the defect partition) with the newly created "clean" partition of the partitioned copy table EXCLUDING INDEXES

* Now the defect partition resides in the partitioned copy table and can simply be dropped since we don't have any indexes defined on this table

* Finally we can now drop the partition from the affected table (its segment is the one that was added to the partitioned copy table in the previous iteration of the process), since that partition now resides in an available tablespace

The same process can be applied to a subpartitioned table. There are of course some more details to consider - for example, you can't drop the last subpartition of a partition or the last partition of a table - but since we assume that our affected table in general still has some partitions left that reside in current and available tablespaces, it should be possible to handle these scenarios.

Here is a code snippet / template that applies this process:

/**
 * Drop a (sub-)partition residing in an offline tablespace
 * @param p_object                   The original object to clean up
 * @param p_temp_exc_table_name      The name of the working table
 * @param p_clean_tablespace_name    The tablespace where to create the cleanup partition
 * @param p_partition_name           The name of the (sub-)partition to drop
 * @param p_segment_type             The type of the segment (TABLE[ ][SUB][PARTITION])
 * @param p_last_partition_indicator For subpartitions indicate that this is
 *                                   the last subpartition - this will drop the whole partition
 **/
procedure drop_partition(
    p_object                   in     r_oracle_object
  , p_temp_exc_table_name      in out oracle_object
  , p_clean_tablespace_name    in     oracle_object
  , p_partition_name           in     oracle_object
  , p_segment_type             in     oracle_object
  , p_last_partition_indicator in     varchar2
)
as
  s_sql                     large_sql;
  s_partition_type          varchar2(255);
  s_exchange_table          oracle_object;
  e_partition_in_offline_ts exception;
  pragma exception_init(e_partition_in_offline_ts, -14117);
  -- We do this in an autonomous transaction to keep it separate
  -- from any outside transactions like the "queue" transaction
  -- used by the parallel clean up threads
  pragma autonomous_transaction;
begin
  -- Determine if this is a partition or subpartition
  s_partition_type := substr(p_segment_type, 7, 255);

  -- Step 1: Add a clean partition to the temporary cleanup table
  s_sql := 'alter table ' || p_temp_exc_table_name ||
           ' add partition p_cleanup values (42) tablespace ' || p_clean_tablespace_name;

  execute(p_object.owner, s_sql);

  -- Get the name of the unpartitioned table copy
  s_exchange_table := get_exchange_table(p_object);

  -- Step 2: Exchange the defect (sub-)partition with the unpartitioned copy table EXCLUDING indexes
  s_sql := '
    alter table ' || p_object.segment_name || '
    exchange ' || s_partition_type || ' ' || p_partition_name || '
    with table ' || s_exchange_table || '
    excluding indexes without validation';

  execute(p_object.owner, s_sql);

  -- Step 3: Exchange the unpartitioned copy table with the partitioned copy table partition again EXCLUDING indexes
  s_sql := '
    alter table ' || p_temp_exc_table_name || '
    exchange partition p_cleanup
    with table ' || s_exchange_table || '
    excluding indexes without validation';

  execute(p_object.owner, s_sql);

  -- Step 4: Drop the defect partition now residing in the partitioned "waste-bin" table
  -- Since we don't have any indexes this is possible now
  s_sql := 'alter table ' || p_temp_exc_table_name || ' drop partition p_cleanup';

  begin
    execute(p_object.owner, s_sql);
  exception
    -- Any tables with LOBs will cause an exception since they will implicitly have a LOB INDEX added
    -- Therefore the DROP PARTITION will fail with ORA-14117
    -- The simplest solution is to drop and recreate the entire working table - note the serious side effects
    -- of the underlying ORA-01110 error in 11.1.0.7. See below comments about the potential impact
    -- of the health check monitoring
    -- A smoother way of handling this to avoid these potential side-effects is to check if the
    -- table has any indexes defined on it and then do not attempt to drop the partition but
    -- immediately drop the entire table
    when e_partition_in_offline_ts then
      drop_temporary_exc_table(p_object, p_temp_exc_table_name);
      p_temp_exc_table_name := create_temporary_exc_table(p_object, p_clean_tablespace_name, p_segment_type);
  end;

  -- Step 5: Drop the partition / subpartition which completes the cleanup
  -- TODO: Are really all subpartitions of a partition affected?
  -- If not, this logic needs to be revised
  if (s_partition_type = 'SUBPARTITION' and p_last_partition_indicator = 'Y') then
    s_sql := 'alter table ' || p_object.segment_name ||
             ' drop partition ' || get_parent_partition(p_object, p_partition_name);

    execute(p_object.owner, s_sql);
  elsif s_partition_type = 'PARTITION' then
    s_sql := 'alter table ' || p_object.segment_name || ' drop partition ' || p_partition_name;

    execute(p_object.owner, s_sql);
  end if;
end drop_partition;

Note that tables with LOBs need special treatment since the logic used for tables without LOBs fails with an error that the partition resides in an offline tablespace when the "defect" partition gets dropped from the temporary cleanup table.

This is the error that is typically raised when attempting to drop a partition from a table that has indexes defined on it. Very likely this error is caused by the fact that a LOB always has a corresponding LOB INDEX (which is always exchanged when doing an EXCHANGE PARTITION, even if EXCLUDING INDEXES is specified).

Since this index cannot be dropped from the temporary cleanup table, we actually do not gain anything by performing the exchange exercise.
The simplest solution to the problem is to drop and recreate the cleanup table, which is possible without the ORA-14117 error message being raised.

Note that attempting to drop the partition and hitting the ORA-14117 error has a serious side effect in 11.1.0.7 and 11.2.0.1: if the "Health Monitoring" detects a "flooding" ORA-01110 error (something is wrong with a datafile), then any operation on these datafiles from then on seems to have tracing turned on - this leads to the odd situation that onlining such a tablespace again (which we can't do in our actual case since the datafiles are not there, but which I've done for testing purposes) takes minutes and generates trace files between 20 and 500 MB (!) in size.

I haven't investigated this much, but in my tests it couldn't be avoided by setting the undocumented parameter "_disable_health_checks=TRUE".

The MMON_SLAVE process will dump something similar to this:

========= Dump for error ORA 1110 (no incident) ========
----- DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----

Also for the sessions that cause the error something like this will be written to trace files:

DDE rules only execution for: ORA 1110
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
DDE Action 'DB_STRUCTURE_INTEGRITY_CHECK' was flood controlled
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (FLOOD CONTROLLED, 17 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----

A smoother way of handling this to avoid these potential side-effects is to check if the table has any indexes defined on it and then do not attempt to drop the partition but immediately drop the entire table.

Since each iteration of this process takes some time (in a database with hundreds of thousands of partitions it can take a few seconds per partition due to the underlying recursive data dictionary operations) and, according to some traces, is mostly CPU bound, you can speed up the processing by using multiple processes if you have multiple CPUs available.

Using an approach based on Advanced Queueing with multiple consumers (for instance as many consumers as CPUs), we were able to clean up more than 200,000 partitions (table plus index (sub)partitions) in less than one hour, which was within the available timeframe for that operation.
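For completeness, here is a rough sketch (hypothetical queue name and payload format; the drop_partition routine is the one shown above) of one way to set this up: a coordinator enqueues one message per affected partition, and several competing worker sessions dequeue and process them.

-- One-time setup of a simple work queue with a RAW payload
begin
  dbms_aqadm.create_queue_table(
    queue_table        => 'part_cleanup_qt',
    queue_payload_type => 'RAW'
  );
  dbms_aqadm.create_queue(
    queue_name  => 'part_cleanup_q',
    queue_table => 'part_cleanup_qt'
  );
  dbms_aqadm.start_queue(queue_name => 'part_cleanup_q');
end;
/

-- Coordinator: enqueue one message per partition residing in the offlined tablespace
declare
  l_enq_opt  dbms_aq.enqueue_options_t;
  l_msg_prop dbms_aq.message_properties_t;
  l_msgid    raw(16);
begin
  for r in (select table_owner, table_name, partition_name
            from   dba_tab_partitions
            where  tablespace_name = 'TS_HIST') loop
    dbms_aq.enqueue(
      queue_name         => 'part_cleanup_q',
      enqueue_options    => l_enq_opt,
      message_properties => l_msg_prop,
      payload            => utl_raw.cast_to_raw(
                              r.table_owner || '.' || r.table_name || ':' || r.partition_name),
      msgid              => l_msgid
    );
  end loop;
  commit;
end;
/

-- Worker loop, run concurrently in N sessions (N roughly the number of CPUs)
declare
  l_deq_opt  dbms_aq.dequeue_options_t;
  l_msg_prop dbms_aq.message_properties_t;
  l_payload  raw(2000);
  l_msgid    raw(16);
begin
  l_deq_opt.wait := 5;  -- stop once the queue has been idle for 5 seconds
  loop
    begin
      dbms_aq.dequeue(
        queue_name         => 'part_cleanup_q',
        dequeue_options    => l_deq_opt,
        message_properties => l_msg_prop,
        payload            => l_payload,
        msgid              => l_msgid
      );
    exception
      when others then exit;  -- e.g. ORA-25228: dequeue timed out, queue drained
    end;
    -- parse utl_raw.cast_to_varchar2(l_payload) into owner / table / partition
    -- and call the drop_partition routine shown above for that partition
    commit;
  end loop;
end;
/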

Thoughts on Storage Server Filtering

Thoughts on Exadata
Most SQL in a DWH environment runs full scans against all of the Storage Servers.
If all of that data were read straight into the Exadata database server, it would add unnecessary CPU load and drive up InfiniBand utilisation as well. And in my post on the TPC-H results I wrote that filtering is important:

I came to appreciate how important it is that the Exadata Storage Server intelligently filters on the WHERE conditions and returns only the results, and returns join conditions as hash values.
Previously, in "Oracle Closed World", I wrote:
"But the 'Secondary Oracle' inside the Storage Server is something closed, available only on Exadata, isn't it?"
→ "That sort of thing will keep changing rapidly. It is already getting down to the fine details."
My earlier view that "once SSDs spread and transfer speeds go up, filtering won't be that important any more" showed a lack of understanding on my part.

This time, I'll look at filtering from a slightly different angle, based on the behaviour when using a DBLINK.


select ... from table1@a , table2@b, table3@c
where a.key=b.key
and a.key=c.key
and a.data_key = 'DONE'
and b.contents like '%AAA%'
and c.create >= sysdate -1;

table1@a means the table is accessed via a DBLINK to server a

The data on each linked server is properly filtered by its WHERE condition before being returned to the server that issued the request:

  • Server_a : a.data_key = 'DONE'
  • Server_b : b.contents like '%AAA%'
  • Server_c : c.create >= sysdate -1

And if the requesting server is Oracle EE, the statement is executed in parallel, which is ideal.
Also, the temporary tablespace used in that case is the one on the requesting server,
so direct path read/write temp is not as complicated as it is on Exadata.
Digging deeper, this means an Oracle SE One class server is quite sufficient for the tables behind the DBLINKs:
just increase the I/O speed there, or use the KEEP buffer pool.

Exadata's filtering capability addresses a problem that exists precisely because it is Exadata,
that is, a problem that comes from its Shared Everything architecture.
With the DBLINK approach, which is Shared Nothing, filtering simply happens as a matter of course.

So... I wondered whether partitioned views might be an evolution of this pattern that makes full scans and joins more efficient even in an ordinary environment, and in "putting 10% of the whole DWH into cache" I wrote:

Partition views can be used too :-)
Given SQL like the following, the optimizer chooses between full scans and index scans for each underlying table.
select ... from partition_view
where entry_date > 2010/05/09
and entry_date < 2010/07/05
and .....
It does a full scan on the May and June tables and an index scan on the July table.

That's what I wrote. But S-kun dismissed it out of hand: "Nobody uses such an old feature any more", "even the bugs don't get fixed."
Partitioned views were a feature announced with great fanfare back in Oracle 7.2, you know...
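For readers who have never seen one: a partition view is just a UNION ALL view over per-period tables whose CHECK constraints let the optimizer skip branches that cannot match the predicate. A minimal sketch with made-up names:

create table sales_2010_06 (
  entry_date date not null
    check (entry_date >= date '2010-06-01' and entry_date < date '2010-07-01'),
  amount     number
);

create table sales_2010_07 (
  entry_date date not null
    check (entry_date >= date '2010-07-01' and entry_date < date '2010-08-01'),
  amount     number
);

create index sales_2010_07_idx on sales_2010_07 (entry_date);

create or replace view sales_pv as
select entry_date, amount from sales_2010_06
union all
select entry_date, amount from sales_2010_07;

-- For a query such as
--   select sum(amount) from sales_pv
--   where entry_date >= date '2010-07-02';
-- the optimizer can prune the 2010_06 branch via its CHECK constraint and is
-- free to pick a full scan or an index scan for each remaining branch.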

Frequency Histogram 3

Here’s the output I get from querying dba_tab_histograms for a column that has been given a frequency histogram by a call to dbms_stats.gather_table_stats().

ENDPOINT_NUMBER ENDPOINT_VALUE
--------------- --------------
              2             -1
           3190              0
           3813              1
           4310              4
           4480              5
           7507            999
          17210           2000
          17212           2002
          17213           2004
          17215           2006
          17729           2008
          17750           2009
          17752           2010
          17904           2011
          17906           2012
          17909           2017
          17994           5000
          18006           5001
          18009           5002
          18023           5003
          18062           6001
          39885          11000

In an earlier blog I wrote about a query that turned the figures from a frequency histogram into a list of column values and column frequencies, and if I had used that query against the histogram data I would have found that the value 11,000 appears 21,827 times – according to the histogram.
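For reference, a query of that kind can be sketched with an analytic lag() over the cumulative endpoint numbers (owner, table and column names here are hypothetical):

select
        endpoint_value                                  column_value,
        endpoint_number
        - lag(endpoint_number, 1, 0)
              over (order by endpoint_number)           frequency
from
        dba_tab_histograms
where
        owner       = 'TEST_USER'
and     table_name  = 'T1'
and     column_name = 'N1'
order by
        endpoint_number
;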

But there’s a problem with this histogram: the value 11,000 actually appeared roughly 2 million times in the real data set – and when I enabled autotrace on the query that I had used to count the number of times that 11,000 appeared, the optimizer’s prediction (the cardinality in the execution plan) matched the actual value fairly closely. So how did the optimizer manage to get from the histogram to the correct cardinality?