This year again I'm going to present at the DOAG 2012 conference in November. Since it is a popular topic I decided to talk about "Cost-Based Optimizer Basics" there, too. According to the official schedule the presentation will take place on Thursday, the 22nd of November, "Raum 1" at 13:00.
Like last year, I'll also try to do some Unconference sessions in addition - if it is going to take place, which I don't know yet.
I'll post some updates as soon as I know more details.
In my current "big" project, I am building a network of nested materialized views to transform rows of one schema into rows of another (very different) schema. The former is used by the old (but still live) version of an application of ours, the latter by the new version; our idea is to incrementally (aka "fast") refresh the network daily in order to have the new schema ready when the new version goes live. We need this network because we have only a few hours of allowed downtime, and the transformations are very complex: the MV network is going to be composed of more than 200 MVs, each containing tens of millions of rows.
We have carefully designed all MVs as fast refreshable, and built a simple PL/SQL engine to refresh them in parallel using Oracle jobs; we are hence sure that we can meet our time constraints. Now I'm looking at optimizing the COMPLETE refresh, both for damage limitation (should some MVs require a complete refresh, due to corruption or to recreation mandated by last-minute functional changes) and to speed up the initial network build.
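To give an idea, the engine does something conceptually similar to the following sketch (all names here are hypothetical, and the real engine also honors the dependency order between MVs, which this sketch ignores): submit one scheduler job per MV, so that the refreshes run in parallel.

```sql
-- Minimal sketch, not our actual engine: one dbms_scheduler job per MV.
-- A real engine must also refresh each MV only after its masters.
begin
  for r in (select mview_name
              from user_mviews
             where mview_name like 'DELLERA%') loop
    dbms_scheduler.create_job(
      job_name   => 'REFRESH_' || r.mview_name,
      job_type   => 'PLSQL_BLOCK',
      job_action => 'begin dbms_mview.refresh(''' || r.mview_name ||
                    ''', method => ''f'', atomic_refresh => true); end;',
      enabled    => true,
      auto_drop  => true);
  end loop;
end;
/
```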
One of the optimizations I was thinking about was to refresh as complete any MV whose masters have been completely refreshed before, since it is common knowledge that in this case, "complete" is much faster than "fast" (pardon the pun) - mostly because using the MV log to identify modified rows is far from cheap, and it's a huge, useless overhead when ALL rows have been modified. But actually I don't need to worry, since I have discovered that in this case, Oracle silently turns the fast refresh into a complete one, at least in 184.108.40.206.
The test case (script mv_fast_becomes_complete.sql) builds a chain of three MVs:
create table dellera_t ..
create materialized view dellera_mv1 .. as select .. from dellera_t;
create materialized view dellera_mv2 .. as select .. from dellera_mv1;
create materialized view dellera_mv3 .. as select .. from dellera_mv2;
Then the test case modifies some rows of the master table and asks for fast refresh:
exec dbms_mview.refresh('dellera_mv1', method=>'f', atomic_refresh => true);
exec dbms_mview.refresh('dellera_mv2', method=>'f', atomic_refresh => true);
exec dbms_mview.refresh('dellera_mv3', method=>'f', atomic_refresh => true);
It's no surprise that we get:
SQL> select mview_name, last_refresh_type
       from user_mviews
      where mview_name like 'DELLERA%'
      order by mview_name;

MVIEW_NAME           LAST_REFRESH_TYPE
-------------------- ------------------------
DELLERA_MV1          FAST
DELLERA_MV2          FAST
DELLERA_MV3          FAST
But, when the test case refreshes the first alone as complete, keeping the next ones as fast:
exec dbms_mview.refresh('dellera_mv1', method=>'c', atomic_refresh => true);
exec dbms_mview.refresh('dellera_mv2', method=>'f', atomic_refresh => true);
exec dbms_mview.refresh('dellera_mv3', method=>'f', atomic_refresh => true);
MVIEW_NAME           LAST_REFRESH_TYPE
-------------------- ------------------------
DELLERA_MV1          COMPLETE
DELLERA_MV2          COMPLETE
DELLERA_MV3          COMPLETE
Hence, our request for fast has been silently turned by Oracle into a complete refresh, for all MVs down the chain. By the way, I have verified that this is true also for join-only and aggregates-only MVs (check the test case if interested): all it takes to trigger this silent optimization is that at least one master has been completely refreshed before.
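If you want to predict whether this will kick in for a given MV, one way (just a sketch, using the usual dictionary views) is to list its masters together with how each was last refreshed:

```sql
-- For each MV in the chain, show its masters and their last refresh type.
select r.mview_name,
       r.detailobj_name    as master,
       m.last_refresh_type as master_last_refresh
  from user_mview_detail_relations r
  left join user_mviews m
         on m.mview_name = r.detailobj_name
 where r.mview_name like 'DELLERA%'
 order by r.mview_name;
```

If a master shows COMPLETE (and that refresh happened after the dependent MV's own last refresh), expect the dependent's next "fast" request to be executed as complete.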
Moreover, the trace file shows the following steps, for the fast-turned-complete refresh of dellera_mv2 (edited for readability):
a) update mlog$_dellera_mv1 set snaptime$$ = :1 where ...;
b) delete from dellera_mv2;
c) delete from mlog$_dellera_mv2;
d) insert into dellera_mv2 (m_row$$, x, y, rid#) select rowid, x, y, rowid from dellera_mv1;
e) delete from mlog$_dellera_mv1 where snaptime$$ <= :1;
This is indeed the signature of a complete refresh: a complete delete (b) of the refreshing MV (dellera_mv2) followed by a straight insert (d) from the master (dellera_mv1), with no visit of the MV log of the master besides the usual initial marking (a) and the final purge (e).
What is also interesting is that the MV log is completely deleted (no reference to snaptime$$) in (c) BEFORE the insert; the insert itself does not populate the MV log, since the kernel trigger is "disabled" during a complete refresh (otherwise we would find a lot of rows in the log, given that the delete happens before the insert and not after; the script verifies that no row is found). No MV log management overhead, as expected.
It's also interesting to check what changes when the refresh is done with atomic => false (NB I have skipped table-locking and index-maintenance operations, and non pertinent hints, for brevity):
a) same as above
b) same as above
c) truncate table dellera_mv2 purge snapshot log;
d) insert /*+ append */ into dellera_mv2 (m_row$$, x, y, rid#) select rowid, x, y, rowid from dellera_mv1;
e) same as above
That is, the delete has turned into a "truncate purge snapshot log" and the insert now runs in append mode; the rest is the same. In passing, (b) looks a tad redundant.
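For reference, the non-atomic variant of the same refresh request is simply:

```sql
-- Non-atomic refresh: allows truncate + insert /*+ append */ instead of
-- delete + conventional insert, at the price of the MV appearing empty
-- (or partially loaded) to sessions that read it mid-refresh.
exec dbms_mview.refresh('dellera_mv2', method => 'f', atomic_refresh => false);
```

The truncate/append combination avoids both undo generation for the delete and most redo for the insert, which is why non-atomic complete refreshes are usually preferred when concurrent readers can tolerate them.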
As always, you can find all scripts in the test case .zip above - including of course spool files and traces, the latter also processed with xtrace exec flow to quickly mine the SQL statements of interest.
... well it is for me, anyway.
I know this isn't a universal feeling, but I always look forward to going over to San Francisco for OpenWorld. There are so many people I'll see that I don't see often enough, lots of cool presentations (if you know where to look), and I expect this to be a year full of announcements in the areas I care about (and probably in others, too!). I'm gutted that I'll miss the ACE Director briefing, because I think I'd be particularly interested this year, but the reality is that it's difficult getting any days away from my current project, never mind a week and a half.
On top of my main agenda presentation (Tuesday, 5pm, Moscone West 3014), I've decided to do a quick 10-minute talk on Tuesday lunchtime at Oracle Closed World, an alternative event organised by Kyle Hailey with assistance from various Oak Table types and sponsored by a number of the best-known Oracle consulting companies. As well as an agenda featuring some of the best speakers around, check out Marco Gralike's cool video on the home page.
This year my main task seems to be as a Party Planner, though. That's something I never thought would happen, and it has given me a new-found respect for all of those poor people who organise the myriad parties that will be going on that week! Still, the great thing about it is that there is a Party at the end.
Can't wait ...
Past Closed World Events, 2009/2010 – Collage
OakTable World 2012 – Teaser
Past Closed World Events, 2009/2010 – Snapbook
I did an EM Cloud Control 12cR2 installation at work yesterday. The database repository was 220.127.116.11 on HP-UX and the middle tier was installed on RHEL 5.8. The installation was pretty much the same as the 12cR1 version. Over the next few days I’ll be testing out some of the features to decide if we can move across to it permanently.
Today I did two run throughs of single server installations on Oracle Linux 5.8 and 6.3. There are a couple of minor differences, but nothing to worry about. You can see what I did here:
The installations are a little small, so they are not too fast, but they're good enough to test things out.
Update: It’s been a while since I used the 12c version, so I’ve had to relearn a few simple things. I thought I might as well write them down in case it helps others.
I’m now rockin’ UltraEdit 3.2 on Mac and Linux…
This is the version that is meant to bring the Mac/Linux version in line with the Windows version as far as functionality is concerned. I’m not sure that is the case, but it’s getting ever closer. It certainly does everything I need it to do now.
It’s a major release with a bunch of interesting new features, so I’ve got an interesting few days ahead while I get to grips with it. You can get an idea about what’s in it here.
I posted, as (my) tradition dictates, the Oracle agenda overview of XMLDB sessions during Oracle Open World 2012. I already signed up for the HOL10055 session, but yesterday I also got a quick peek at its contents this year. And yes, I can promise you, it is a must-go, if you are dealing with
This really could be published in the Errata and Addenda of “Cost Based Oracle – Fundamentals”, but it’s more convenient to publish the whole thing here and just add a link to the Errata pages.
In chapter 7, on page 156, I gave an example of the type of SQL that Oracle runs (in the dbms_stats package) to generate a histogram. A sample of the code, and the plan from the 18.104.22.168 tkprof output, is listed below:
select min(minbkt), maxbkt,
       substrb(dump(min(val),16,0,32),1,120) minval,
       substrb(dump(max(val),16,0,32),1,120) maxval,
       sum(rep) sumrep, sum(repsq) sumrepsq,
       max(rep) maxrep, count(*) bktndv,
       sum(case when rep=1 then 1 else 0 end) unqrep
from (
       select val, min(bkt) minbkt, max(bkt) maxbkt,
              count(val) rep, count(val) * count(val) repsq
       from (
              select /*+ noparallel(t) noparallel_index(t)
                         cursor_sharing_exact dynamic_sampling(0)
                         no_monitoring */
                     "LN100" val,
                     ntile(200) over (order by "LN100") bkt
              from   sys.ora_temp_1_ds_616 t
              where  "LN100" is not null
            )
       group by val
     )
group by maxbkt
order by maxbkt
;

Rows     Row Source Operation
-------  ---------------------------------------------------
    170  SORT GROUP BY
    825   VIEW
    825    SORT GROUP BY
 200028     VIEW
 200028      WINDOW SORT
 200028       TABLE ACCESS FULL ORA_TEMP_1_DS_616
The code in this case references a table called ora_temp_1_DS_616, which is a temporary table dynamically created by the package to hold a sample of just the columns that I was interested in. I had specified a histogram of 200 buckets, and you can see the ntile(200) analytic function in the code that Oracle has used to generate this – and “window sort” in the execution plan that supports it; it’s a window sort of the entire sampled data set which could be quite large and, as we’ve seen elsewhere, could be very resource-intensive.
In a demonstration of how important it is (and how nearly impossible it is) to keep revisiting the things you think you know, I’ve recently discovered – while constructing a demonstration of a problem in 22.214.171.124 – that the implementation had changed by 10.2.0.3 (and it may have changed even earlier, of course – but that’s the earliest version of 10g I have access to). Here’s the generated code for the same data set, with plan, from 10.2.0.3:
select substrb(dump(val,16,0,32),1,120) ep,
       cnt
from (
       select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
                  cursor_sharing_exact use_weak_name_resl
                  dynamic_sampling(0) no_monitoring */
              "LN100" val,
              count(*) cnt
       from   sys.ora_temp_1_ds_1470 t
       where  "LN100" is not null
       group by "LN100"
     )
order by val

Rows     Row Source Operation
-------  ---------------------------------------------------
    824  SORT GROUP BY (cr=499 pr=0 pw=0 time=42484 us)
 199451   TABLE ACCESS FULL ORA_TEMP_1_DS_1470 (cr=499 pr=0 pw=0 time=199490 us)
As you can see, it doesn’t have any functionality built into it that could possibly provide the 200 buckets that I finally want to see. You’ll also notice that the sample size is slightly different from the 9i sample size – that’s one of the threats of sampled histogram creation, of course: you could get a slightly (but dangerously) different histogram every time you call dbms_stats, even if the data hasn’t changed.
Based on the differences in the query, I think we can safely assume that Oracle is using some procedural code to count its way through the results from this query (ordered by column value) to construct the histogram details that we are used to seeing. I think it’s worth noting that this demonstrates the fact that using a simpler SQL statement with some surrounding PL/SQL can sometimes be more efficient than writing a devilishly clever piece of SQL that gets you the answer directly.
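Purely as a speculative sketch (this is not Oracle's actual code, and the sample table name is made up), such procedural logic could look like this: walk the (value, count) pairs in order, keep a running row count, and emit a bucket endpoint each time the running count crosses another 1/200th of the total. A popular value that crosses several boundaries at once becomes the endpoint of several consecutive buckets, just as in a real height-balanced histogram.

```sql
declare
  -- Illustration only: derive 200-bucket endpoints from the ordered
  -- (val, cnt) output of the simpler 10.2-style query.
  l_total    number;
  l_running  number := 0;
  l_buckets  number := 200;
  l_next_bkt number := 1;
begin
  select count(*) into l_total
    from histogram_sample                 -- hypothetical sample table
   where ln100 is not null;

  for r in (select ln100 val, count(*) cnt
              from histogram_sample
             where ln100 is not null
             group by ln100
             order by ln100) loop
    l_running := l_running + r.cnt;
    -- Emit an endpoint for every bucket boundary this value crosses.
    while l_running >= l_total * l_next_bkt / l_buckets loop
      dbms_output.put_line('endpoint ' || l_next_bkt || ' = ' || r.val);
      l_next_bkt := l_next_bkt + 1;
    end loop;
  end loop;
end;
/
```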
BLOG UPDATE 2012-09-21: The conference organizers had to dodge legal bullets from Oracle Corporation for using the word Oracle. So, the name of the conference changed to Oaktable World:
This is just a quick announcement about a great conference hosted by the good folks at Delphix, Miracle and Pythian.
Please visit the Oracle Real World website for session and venue information.
I’ll be offering a session called “SLOB Why, How and What’s It Got To Do With Exadata.” Attendees will learn how to put SLOB to use for their benefit. They might even learn something important about Exadata as an added bonus.
Please follow this link to the website to learn more.