
Oakies Blog Aggregator

Little Things Doth Crabby Make – Part XVIII. Automatic Storage Management Won’t Let Me Use My Disk For My Files! Yes, It Will!

It’s been a long time since my last installment in the Little Things Doth Crabby Make series and, to be completely honest, this particular topic isn’t really all that fitting for an LTDCM installment because it covers something that is possible but less than expedient. That said, there are new readers of this blog and maybe it’s time they google “Little Things Doth Crabby Make” to see where this series has been. This post might rustle up that curiosity!

So what is this blog post about? It’s about stuffing any file system file into Automatic Storage Management space. OK, so maybe this is just morbid curiosity or trivial pursuit. Maybe it’s just a parlor trick. I would agree with any of those descriptions. Nonetheless maybe there are 42 or so people out there who didn’t know this. If so, this post is for them.

ASMCMD cp Command

The cp sub-command of ASMCMD lets you copy certain database files into ASM. We all know this. However, just to make it all fresh in people’s minds, I’ll show a screen shot of me trying to push a compressed tar archive of $ORACLE_HOME/bin/oracle up into ASM:

2014.02.04-pic-0

Well, that’s not surprising. But what happens if I take heed of the error message and attempt to placate it? The block size is 8KB, so the following screen shot shows me rounding up the size of the compressed tar archive to an 8192B blocking factor:

2014.02.04-pic-0.1

ASMCMD still won’t gobble up the file. That’s still not all that surprising, because after ASMCMD checked the geometry of the file it then read the file looking for a header or any file magic it could understand. As you can see, ASMCMD doesn’t see a file type it understands. The following screen shot shows me prepending the tar archive with file magic I know ASMCMD must surely understand. I have a database with a tablespace called foo that I created with a non-Oracle Disk Manager naming convention (foo.dbf). The screen shot shows me:

  1. Extracting the foo.dbf file
  2. “Borrowing” 1MB from the head of the file
  3. Creating a compressed tar archive of the Oracle Database executable
  4. Rounding up the size of the compressed tar archive to an 8192B blocking factor

2014.02.04-pic1

 

So now I have a file that has the “shape” of a datafile and the necessary header information from a datafile. The next screen shot shows:

  1. ASMCMD cp command pushing my file into ASM
  2. Removal of all of my current working directory files
  3. ASMCMD cp command pulling the file from ASM and into my current working directory
  4. Extracting the contents of the “embedded” tar archive
  5. md5sum(1) proof the file contents survived the journey

2014.02.04-pic2

OK, so that’s either a) something nobody would ever do, or b) something that can be done with some elegant execution of some internal database package in a much less convoluted way, or c) a combination of both “a” and “b”, or d) a complete waste of my time to post, or, finally, e) a complete waste of your time reading the post. I’m sorry for “a”, “b”, “c” and certainly “e” if that should be the case.

Now you must wonder why I put this in the Little Things Doth Crabby Make series. That’s simple. I don’t like any “file system” imposing restrictions on file types :)

 

Filed under: oracle

Friday Philosophy – How Much does Social Media Impact your Career for Real?

Does what you tweet impact your chances of getting that next interview?
Do people check out your Facebook pictures before making you a job offer?
Does my Blog actually have any impact on my career?

We’ve all heard horror stories about people losing their job as a result of putting something “very unfortunate” on their Facebook page, like how they were on holiday/at a sports event when their employer was under the illusion they were off sick, or the more obvious {and pretty stupid} act of denigrating their boss or employer. But how much does general, day-to-day social media impact your career? {“Your” as in you people who come by this blog, mostly professionals in IT. I know it will be different for people trying to get a job in media or….social media :-) }.

Two things recently have made me wonder about this:

  • The first is that I’ve been in or watched a few discussions recently (via social media!) where people are suggesting that their social media output is part of who they are seen as professionally and they make efforts to ensure they give the right impression, or have even sought professional help to improve their social media standing in respect of employment.
  • The second is that I recently was involved in some hiring and I never even thought to look at their social media. Maybe that is just because I’m over {picks an age} 30 and social media is not a massive thing to me. Most of my hiring experience was before the likes of Facebook and though I would check out a blog if it was mentioned on a CV, I would not have thought to check them out.

When I initially thought about that second point I assumed that most people hiring in the world of IT are similarly a bit ancient like myself and maybe not that attuned to social media. But perhaps I am wrong, as it’s people similar to me out there on Twitter who have been worrying about such things. Maybe social media is considered more by potential employers than I think? I’d like to know what anyone else thinks.

I should add that I don’t see all Social Media as the same when it comes to its impact on your career. I think there is Friends Social and Business Social. Something like LinkedIn is aimed fair and square at business and professional activity and is Business Social. You would really expect it to be looked at and, in fact, most people who use it would hope it is! {Mine isn’t, I get about 3 or 4 views a week and only once, 5 or 6 years ago, was I approached via it for a work opportunity}. If you blog about a work topic or tweet as an expert in your field (so your tweets are mostly about your day job, not just the odd reference), and especially if you are doing either under a company banner, then, yes, I’d expect that to be taken into account when prospective employment comes up.

Friends Social Media is most people’s twittering, personal Facebook, private blogs, Pinterest and all those dozens of things I know nothing about as I am too old and too antisocial. Do these really have much impact on your career?

I would suggest not, again for two reasons:

  • I don’t think most employers are going to look at your Friends Social Media until they have at least interviewed you, as when you are hiring you barely have enough time to check over the CVs, let alone research each candidate’s personal history. Once you have interviewed them, they have become a real person rather than a name, and if you do check out their Friends Social Media then you will look at it in light of them being a human being, which is point 2:
  • Unless you are saying things that would make anyone think you are a bit odd or unpleasant, I can’t see that discussions of football, insulting your friends, making double entendre comments or (one of my favorites) pointless drivel about your cat is going to make anyone who you would want to work for worry about you. Some people might put up things that could be offensive to others – but then, if you really do think immigrants are ruining the UK, we are not going to get on, so working together is a mistake for both of us. So maybe even stating your strongly held opinions is long-term beneficial as well. Some people take my strong dislike of children as a real reason to not like me very much. Best we don’t spend 8 hours a day, five days a week together. You’ll only bang on about your bloody kids.

What I think is a shame is that I suspect some people {many people?} self-censor themselves on all Social Media due to a concern to always be seen as professional. As good worker material. We all know that almost everyone we work with has unprofessional moments and, in fact, those few who are professional all the time tend to be… staggeringly dull.

So maybe being mindful of your professional standing is totally correct on Business Social Media but a bit of a shame if you let it impact your Friends Social Media.

But remember, on all social media there are limits. There are some things about you, Dave, that you should simply not share. Or at least, only at the pub when we are all too drunk to care.

Maintaining Tempfile in TEMP Tablespace of PDB$SEED in Oracle 12c (12.1.0.2)

During testing of recovery procedures for one of the ongoing projects I wanted to test the "complete disaster" recovery scenario. In this scenario I also had to recreate all ASM disks and restore everything from backup.
A full backup with RMAN and a subsequent restore of the pluggable 12c single-tenant database was the solution. I will not talk about that, as the main point of this post is quite different.

So the recovery was successful, but after restoring CDB$ROOT and the PDB I found the following messages in the alert log:

Errors in file /u01/app/oracle/diag/rdbms/mydb/mydb/trace/mydb_dbw0_28973.trc:
ORA-01157: cannot identify/lock data file 202 - see DBWR trace file
ORA-01110: data file 202: '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf'
ORA-17503: ksfdopn:2 Failed to open file +DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf
ORA-15173: entry 'FD9AC20F64D244D7E043B6A9E80A2F2F' does not exist in directory 'MYDB'
Thu Feb 05 09:12:52 2015
Errors in file /u01/app/oracle/diag/rdbms/mydb/mydb/trace/mydb_dbw0_28973.trc:
ORA-01186: file 202 failed verification tests
ORA-01157: cannot identify/lock data file 202 - see DBWR trace file
ORA-01110: data file 202: '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf'

The problem was the tempfile of the temporary tablespace TEMP in PDB$SEED, which is used for the creation of pluggable databases. Tempfiles are not part of an RMAN backup as they can always be recreated. Although I could live with these error messages, as this is a single-tenant database with only one pluggable database, I wanted to deliver a "clean solution" to the client. Besides, I wanted to know how to solve such a problem.

So here is the solution:

$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Thu Feb 5 09:35:11 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options

SQL> alter session set container=pdb$seed;

Session altered.

SQL> select open_mode from v$database;

OPEN_MODE
--------------------
READ ONLY

SQL> select name from v$tempfile;

NAME
---------------------------------------------------------------------------------------------------
+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf


SQL> alter pluggable database pdb$seed close;
alter pluggable database pdb$seed close
*
ERROR at line 1:
ORA-65017: seed pluggable database may not be dropped or altered

SQL> alter session set "_oracle_script"=TRUE;

Session altered.

SQL> alter pluggable database pdb$seed close;

Pluggable database altered.

SQL> alter pluggable database pdb$seed open read write;

Pluggable database altered.

SQL> alter tablespace temp add tempfile '+DATA';

Tablespace altered.
SQL> column name format a100
SQL> select name from v$tempfile;

NAME
-------------------------------------------------------------------------------------------------
+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf
+DATA/MYDB/0BBEAC43B4250445E0530A0C6D0AEC65/TEMPFILE/temp.299.870860375

SQL>  alter tablespace temp drop tempfile '+DATA/MYDB/FD9AC20F64D244D7E043B6A9E80A2F2F/DATAFILE/pdbseed_temp012015-01-03_12-22-30-pm.dbf';

Tablespace altered.

SQL>  select name from v$tempfile;

NAME
----------------------------------------------------------------------------
+DATA/MYDB/0BBEAC43B4250445E0530A0C6D0AEC65/TEMPFILE/temp.299.870860375

After performing the change there were no more error messages during database startup.

The secret sauce for changing the state of PDB$SEED was setting "alter session set "_oracle_script"=TRUE;". The idea came from this post about changing the open mode of the PDB$SEED database.
I hope this short post helps somebody resolve a problem with dropping and creating a new tempfile in PDB$SEED faster.

Parallel Execution 12c New Features Overview

Oracle 12c is the first release in a couple of years that adds significant new functionality in the area of Parallel Execution operators, plan shapes and runtime features. Although 11gR2 added the new Auto DOP feature along with In-Memory Parallel Execution and Statement Queueing, the 12c features are more significant because they introduce new operators that can change both execution plan shape and runtime behaviour.

Here is a list of new features that are worth noting (and not necessarily mentioned in the official documentation and white papers by Oracle):

- The new HYBRID HASH adaptive distribution method, that serves two purposes for parallel HASH and MERGE JOINs:

First, it allows the PX SEND / RECEIVE operation for the left (smaller estimated) row source of the hash join to decide dynamically at runtime, actually for each execution, whether it should use either a BROADCAST or HASH distribution; the other row source then correspondingly uses either a ROUND-ROBIN or a HASH distribution, too. This is described for example in the corresponding white paper by Maria Colgan here.

Second, to some degree it allows addressing data distribution skew in the case of HASH distributions (and only for parallel hash joins, not merge joins), which I've described in detail already in the past. A summary post that links to all other relevant articles regarding Parallel Execution Skew can be found here, an overview of the relevant feature can be found here and a detailed description can be found here.

I'll cover some aspects of this adaptive distribution that I haven't mentioned in the existing articles in a separate post.
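In the meantime, here is a minimal sketch of how to spot the new distribution in a 12c plan. The tables T_SMALL and T_BIG and the parallel degree are hypothetical placeholders, not taken from any of the articles mentioned above:

explain plan for
select /*+ parallel(4) */ *
from   t_small s, t_big b
where  b.id = s.id;

select * from table(dbms_xplan.display);

-- In the DBMS_XPLAN output look for the PX SEND HYBRID HASH operations
-- on both sides of the HASH JOIN and for the STATISTICS COLLECTOR row
-- source above the smaller row source - that collector drives the
-- runtime BROADCAST vs. HASH decision described above.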

- The new concurrent UNION ALL operator. This is officially documented here. It comes with a new operator, PX SELECTOR, that provides generic functionality to pick one of the available PX slaves to perform the child operations of this operator. Since the official documentation leaves a lot of details unclear about how this concurrent operation will actually behave at run time, I'll cover some examples with runtime profiles in a separate post.
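As a teaser, a minimal sketch of how to check whether a statement gets the concurrent form. T1 and T2 are hypothetical tables, and to the best of my knowledge the PQ_CONCURRENT_UNION / NO_PQ_CONCURRENT_UNION hint pair controls the feature in 12c:

explain plan for
select /*+ parallel(4) pq_concurrent_union */ col from t1
union all
select col from t2;

select * from table(dbms_xplan.display);

-- A concurrent UNION ALL should show the UNION ALL executed within the
-- PX slaves, with PX SELECTOR operations on top of any serial branches,
-- instead of the branches being processed one after the other.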

- The new PQ_REPLICATE feature that for simple parallel FULL TABLE SCAN row sources (I haven't tested yet if a parallel INDEX FAST FULL SCAN is eligible, too, but I assume so) can decide to run the scan entirely in each PX slave instead of running a distributed scan across the PX slaves in granules and distributing by BROADCAST afterwards. It's not entirely clear to me why this was implemented. Although it reduces the number of redistributions, and in some cases where no other parallel redistributions are required can reduce the number of parallel slave sets to one instead of two, BROADCAST distributions are typically used for smaller row sources, so eliminating this distribution doesn't sound like a huge improvement to justify the development effort. Jonathan Lewis describes the feature here along with some of his ideas, why this feature might be useful.

- The new parallel FILTER operator, an important and potentially huge improvement over the previous only available serial FILTER operator. In the past when a FILTER subquery was part of a parallel plan the data of the "driving" row source of the FILTER (the first child operation) had to be passed to the Query Coordinator and only then the second to nth children could be executed as many times as indicated by the first row source (and depending on the efficiency of filter/subquery caching). Now the FILTER operator can run in the PX slaves and there are a number of distribution variants possible with this new parallel operator. I'll cover that in a separate post.

- The new PX SELECTOR operator that I already mentioned above as part of the new concurrent UNION ALL operator. As described above, the generic functionality of this operator is to pick one of the available PX slaves to perform the child operations of this operator. It will be used in 12c for serial access operators that are part of a parallel plan (like a serial table or index scan). In the past these parts were performed by the Query Coordinator itself, but now one slave out of a slave set will be selected to perform such operations. This has a number of implications and I'll cover that in a separate post.

- The new 1 SLAVE distribution method, which is a bit similar to the PX SELECTOR operator in that it will use just one slave of the slave set, but gets used for serial parts of the execution plan when data is redistributed from a parallel part of the plan to a part that needs to run serially, because Oracle cannot parallelize the functionality, for example when evaluating ROWNUM or certain analytic function variants (for example LAG or LEAD with no partition clause). This new 1 SLAVE distribution seems to have two purposes: first, avoid activity of the Query Coordinator (like the PX SELECTOR above), and second, avoid the decomposition of the parallel plan into multiple DFO trees. I'll cover that in a separate post.

- 12c also changes the way some operations in the plan are marked as PARALLEL or not, which in my opinion can be pretty confusing (and is partly inconsistent with runtime behaviour) when just looking at the execution plan, since the runtime activity then might look different from what the execution plan suggests. I'll cover that in a separate post and it will also be picked up in the context of other new functionality mentioned above as appropriate.

There is probably more that I haven't come across yet, but as you can see from the number of times I've mentioned "separate post" in this overview this is already enough material for a whole series of posts to follow.

Webinar Followup

Thanks everyone who attended my recent webinar at AllThingsOracle.com.

The link to the webinar recording can be found here.

The presentation PDF can be downloaded here. Note that this site uses a non-default HTTP port, so if you're behind a firewall this might be blocked.

Thanks again to AllThingsOracle.com and Amy Burrows for hosting the event.

Scrutinizing Exadata X5 Datasheet IOPS Claims…and Correcting Mistakes

I want to make these two points right out of the gate:

  1. I do not question Oracle’s IOPS claims in Exadata datasheets
  2. Everyone makes mistakes

Everyone Makes Mistakes

Like me. On January 21, 2015, Oracle announced the X5 generation of Exadata. I spent some time studying the datasheets from this product family and also compared the information to prior generations of Exadata, namely the X3 and X4. Yesterday I graphed some of the datasheet numbers from these Exadata products and tweeted the graphs. I’m sorry to report that two of the graphs were faulty–the result of hasty cut and paste. This post will clear up the mistakes but I owe an apology to Oracle for incorrectly graphing their datasheet information. Everyone makes mistakes. I fess up when I do. I am posting the fixed slides but will link to the deprecated slides at the end of this post.

We’re Only Human

Wouldn’t IT be a more enjoyable industry if certain IT vendors stepped up and admitted when they’ve made little, tiny mistakes like the one I’m blogging about here? In fact, wouldn’t it be wonderful if some of the exceedingly gruesome mistakes certain IT vendors make would result in a little soul-searching and confession? Yes. It would be really nice! But it’ll never happen–well, not for certain IT companies anyway. Enough of that. I’ll move on to the meat of this post. The rest of this article covers:

  • Three Generations of Exadata IOPS Capability
  • Exadata IOPS Per Host CPU
  • Exadata IOPS Per Flash SSD
  • IOPS Per Exadata Storage Server License Cost

Three Generations of Exadata IOPS Capability

The following chart shows how Oracle has evolved Exadata from the X3 to the X5 EF model with regard to IOPS capability. As per Oracle’s datasheets on the matter these are, of course, SQL-driven IOPS. Oracle would likely show you this chart and nothing else. Why? Because it shows favorable, generational progress in IOPS capability. A quick glance shows that read IOPS improved just shy of 3x and write IOPS capability improved over 4x from the X3 to X5 product releases. These are good numbers. I should point out that the X3 and X4 numbers are the datasheet citations for 100% cached data in Exadata Smart Flash Cache. These models had 4 Exadata Smart Flash Cache PCIe cards in each storage server (aka, cell). The X5 numbers I’m focused on reflect the performance of the all-new Extreme Flash (EF) X5 model. It seems Oracle has started to investigate the value of all-flash technology and, indeed, the X5 EF is the top-dog in the Exadata line-up. For this reason I choose to graph X5 EF data as opposed to the more pedestrian High Capacity model which has 12 4TB SATA drives fronted with PCI Flash cards (4 per storage server).

exadata-evolution-iops-gold-1

The tweets I hastily posted yesterday with the faulty data points aimed to normalize these performance numbers to important factors such as host CPU, SSD count and Exadata Storage Server Software licensing costs. The following set of charts are the error-free versions of the tweeted charts.

Exadata IOPS Per Host CPU

Oracle’s IOPS performance citations are based on SQL-driven workloads. This can be seen in every Exadata datasheet. All Exadata datasheets for generations prior to X4 clearly stated that Exadata IOPS are limited by host CPU. Indeed, anyone who studies Oracle Database with SLOB knows how all of that works. SQL-driven IOPS requires host CPU. Sadly, however, Oracle ceased stating the fact that IOPS are host-CPU bound in Exadata as of the advent of the X4 product family. I presume Oracle stopped correctly stating the factual correlation between host CPU and SQL-driven IOPS for only the most honorable of reasons with the best of customers’ intentions in mind. In case anyone should doubt my assertion that Oracle historically associated Exadata IOPS limitations with host CPU I submit the following screen shot of the pertinent section of the X3 datasheet:

X3-datasheet-truth

Now that the established relationship between SQL-driven IOPS and host CPU has been demystified, I’ll offer the following chart which normalizes IOPS to host CPU core count:

exadata-evolution-iops-per-core-gold

I think the data speaks for itself but I’ll add some commentary. Where Exadata is concerned, Oracle gives no choice of host CPU to customers. If you adopt Exadata you will be forced to take the top-bin Xeon SKU with the most cores offered in the respective Intel CPU family. For example, the X3 product used 8-core Sandy Bridge Xeons. The X4 used 12-core Ivy Bridge Xeons and finally the X5 uses 18-core Haswell Xeons. In each of these CPU families there are other processors of varying core counts at the same TDP. For example, the Exadata X5 processor is the E5-2699v3 which is a 145w 18-core part. In the same line of Xeons there is also a 145w 14c part (E5-2697v3) but that is not an option to Exadata customers.

All of this is important since Oracle customers must license Oracle Database software by the host CPU core. The chart shows us that read IOPS per core from X3 to X4 improved 18%, but from X4 to X5 we see only a 3.6% increase. The chart also shows that write IOPS/core peaked at X4 and has actually dropped some 9% in the X5 product. These important trends suggest Oracle’s balance between storage plumbing and I/O bandwidth in the Storage Servers is not keeping up with the rate at which Intel is packing cores into the Xeon EP family of CPUs. The nugget of truth that is missing here is whether the 145w 14-core E5-2697v3 might in fact be able to improve this IOPS/core ratio. While such information would be quite beneficial to Exadata-minded customers, the 22% drop in expensive Oracle Database software licensing in such an 18-core versus 14-core scenario is not beneficial to Oracle–especially not while Oracle is struggling to subsidize its languishing hardware business with gains from traditional software.

Exadata IOPS Per Flash SSD

Oracle uses their own branded Flash cards in all of the X3 through X5 products. While it may seem like an implementation detail, some technicians consider it important to scrutinize how well Oracle leverages their own components in their Engineered Systems. In fact, some customers expect that adding significant amounts of important performance components, like Flash cards, should pay commensurate dividends. So, before you let your eyes drift to the following graph please be reminded that X3 and X4 products came with 4 Gen3 PCI Flash Cards per Exadata Storage Server whereas X5 is fit with 8 NVMe flash cards. And now, feel free to take a gander at how well Exadata architecture leverages a 100% increase in Flash componentry:

exadata-evolution-iops-per-SSD-gold

This chart helps us visualize the facts sort of hidden in the datasheet information. From Exadata X3 to Exadata X4 Oracle improved IOPS per Flash device by just shy of 100% for both read and write IOPS. On the other hand, Exadata X5 exhibits nearly flat (5%) write IOPS and a troubling drop in read IOPS per SSD device of 22%. Now, all I can do is share the facts. I cannot change people’s belief system–this I know. That said, I can’t imagine how anyone can spin a per-SSD drop of 22%–especially considering the NVMe SSD product is so significantly faster than the X4 PCIe Flash card. By significant I mean the NVMe SSD used in the X5 model is rated at 260,000 random 8KB IOPS whereas the X4 PCIe Flash card was only rated at 160,000 8KB read IOPS. So X5 has double the SSDs–each of which is rated at 63% more IOPS capacity–than the X4, yet IOPS per SSD dropped 22% from the X4 to the X5. That means an architectural imbalance–somewhere. However, since Exadata is a completely closed system you are on your own to find out why doubling resources doesn’t double your performance. All of that might sound like taking shots at implementation details. If that seems like the case then the next section of this article might be of interest.

IOPS Per Exadata Storage Server License Cost

As I wrote earlier in this article, both Exadata X3 and Exadata X4 used PCIe Flash cards for accelerating IOPS. Each X3 and X4 Exadata Storage Server came with 12 hard disk drives and 4 PCIe Flash cards. Oracle licenses Exadata Storage Server Software by the hard drive in X3/X4 and by the NVMe SSD in the X5 EF model. To that end the license “basis” is 12 units for X3/X4 and 8 for X5. Already readers are breathing a sigh of relief because less license basis must surely mean less total license cost. Surely not! Exadata X3 and X4 list price for Exadata Storage Server Software was $10,000 per disk drive for an extended price of $120,000 per storage server. The X5 EF model, on the other hand, prices Exadata Storage Server Software at $20,000 per NVMe SSD for an extended price of $160,000 per Exadata Storage Server. With these values in mind feel free to direct your attention to the following chart which graphs the IOPS per Exadata Storage Server Software list price (IOPS/license$$).

exadata-evolution-iops-per-license-cost-gold

The trend in the X3 to X4 timeframe was a doubling of write IOPS/license$$ and just short of a 100% improvement in read IOPS/license$$. In stark contrast, however, the X5 EF product delivers only a 57% increase in write IOPS/license$$ and a troubling, tiny, 17% increase in read IOPS/license$$. Remember, X5 has 100% more SSD componentry when compared to the X3 and X4 products.

Summary

No summary needed. At least I don’t think so.

About Those Faulty Tweeted Graphs

As promised, here are links to the faulty graphs I tweeted:

Faulty / Deleted Tweet Graph of Exadata IOPS/SSD: http://wp.me/a21zc-1ek
Faulty / Deleted Tweet Graph of Exadata IOPS/license$$: http://wp.me/a21zc-1ej

References

Exadata X3-2 datasheet: http://www.oracle.com/technetwork/server-storage/engineered-systems/exadata/exadata-dbmachine-x3-2-ds-1855384.pdf
Exadata X4-2 datasheet: http://www.oracle.com/technetwork/database/exadata/exadata-dbmachine-x4-2-ds-2076448.pdf
Exadata X5-2 datasheet: http://www.oracle.com/technetwork/database/exadata/exadata-x5-2-ds-2406241.pdf
X4 SSD info: http://www.oracle.com/us/products/servers-storage/storage/flash-storage/f80/overview/index.html
X5 SSD info: http://docs.oracle.com/cd/E54943_01/html/E54944/gokdw.html#scrolltoc
Engineered Systems Price List: http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf , http://www.ogs.state.ny.us/purchase/prices/7600020944pl_oracle.pdf

Filed under: oracle

A new blog post on using Snap Clone on EMC storage with ASM

Just a quick note here to say I’ve added a new blog post to the official Enterprise Manager blog site. You can find it here.

Working With AWR Reports From EM12c

I’ve had a recent request to write an update on working with AWR reports, so as promised, here it is!

The Automatic Workload Repository

The Automatic Workload Repository (AWR) was one of the best enhancements to Oracle back in release 10g. There was quite a goal put in front of the development group when they were asked to develop a product that:

1.  Provided significant performance recommendation and wait event data enhancements over its predecessor statspack.

2.  Was always on, meaning that the data would continually collect without manual intervention from the database administrator.

3.  Wouldn’t impact the current processing, having its own background processes and memory buffer, designated tablespace, (SYSAUX).

4.  The memory buffer would be written in the opposite direction from the direction users read it, eliminating concurrency issues.

Along with many other requirements, all of the above was offered with the Automatic Workload Repository and we end up with architecture that looks something like this:

pt5

 

Using AWR Data

The AWR data is identified by the DBID (database identifier) and a SNAP_ID (snapshot identifier, which has a BEGIN_INTERVAL_TIME and END_INTERVAL_TIME to isolate the date and time of the data collection). Information about what is currently retained in the database can be queried from DBA_HIST_SNAPSHOT. AWR data also contains ASH (Active Session History) samples along with the snapshot data; by default, about 1 out of every 10 samples.
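For example, a quick way to see which snapshots are currently retained, and the interval each one covers, is a simple query like this:

column begin_interval_time format a30
column end_interval_time   format a30

select snap_id,
       begin_interval_time,
       end_interval_time
from   dba_hist_snapshot
where  dbid = (select dbid from v$database)
order  by snap_id;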

The goal of using AWR data effectively really has to do with the following:

1.  Have you identified a true performance issue as part of a performance review?

2.  Has there been a user complaint or a request to investigate a performance degradation?

3.  Is there a business challenge or question that needs to be answered that AWR can offer an answer to?  (We’ll go into when to use AWR vs. other features…)

Performance Review

A performance review is where you have either identified a problem or have been assigned to investigate the environment for performance issues to solve. I have a couple of Enterprise Manager environments available to me, but I chose to go out to one in particular and cross my fingers, hoping I would have some heavy processing to fit the requirements of this post.

The quickest way to see workload in your database environment from EM12c is to click on Targets –> Databases. Choose to view by load map and you will then view databases by workload. Upon going to a specific Enterprise Manager environment, I found out it was my lucky day!

pt1

I really don’t know who Kurt is that has a database monitored on this EM12c cloud control environment, but boy, is he my favorite person today! :)

Hovering my cursor over the database name (kurt), you can view the workload he has running on his test database currently:

pt2

 

Boy, is Kurt my favorite person today!

EM12c Database Home Page

Logging into the database, I can see the significant IO and resource usage for the database and host from the database home page:

pt3

 

If we move to Top Activity (Performance menu, Top Activity), I begin to view more details about the processing and different wait events:

pt4

 

Kurt is doing all KINDS of inserts (seen by the different SQL_IDs, all of SQL type “INSERT”). I can drill down into the individual statements and investigate this, but really, there are a TON of statements and SQL_IDs here; wouldn’t it just be easier to view the workload with an AWR report?

Running the AWR Report

I choose to click on Performance, AWR, AWR Report.  Now I have a choice.  I could request a new snapshot to be performed immediately or I could wait till the top of the hour, as the interval is set hourly in this database.  I chose the latter for this demonstration, but if you wanted to create a snapshot immediately, you can do this easily from EM12c or request a snapshot by executing the following from SQL*Plus as a user with execute privileges on the DBMS_WORKLOAD_REPOSITORY package:

BEGIN
DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT ();
END;
/
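And if you prefer to pull the report itself outside of EM12c, the same package can generate it from SQL*Plus as well. A rough sketch follows; the DBID, instance number and snapshot IDs are placeholders you would take from V$DATABASE and DBA_HIST_SNAPSHOT:

set pagesize 0 linesize 200
select output
from   table(dbms_workload_repository.awr_report_text(
               1234567890,   -- DBID (placeholder)
               1,            -- instance number
               100,          -- begin SNAP_ID (placeholder)
               101));        -- end SNAP_ID (placeholder)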

For this example, I simply waited, as there was no hurry or concern here and requested the report for the previous hour and latest snapshot:

pt6

 

I always start at the Top Ten Foreground Events and commonly look at those with high wait percentages:

pt7

 

Direct Path Write, that’s it.  Nothing else to see here… :)

Direct path write involves the following: inserts/updates, the objects being written to, the tablespaces being written to and the datafiles that make up the tablespace(s).

It’s also IO, which we quickly verify down in the Foreground Wait Class:

pt8

 

Looking at the Top SQL by Elapsed Time confirms that we are dealing with a workload consisting of all inserts:

pt9

 

Clicking on the SQL ID takes me to the Complete List of SQL Text and shows me just what Bad Boy Kurt is doing to produce his testing workload:

pt10

Wow, that Kurt is quite the rebel, eh? :)

Insert in a loop into one table from the same table, rollback and then end the loop, thanks for playing. He’s kicking some tires and doing it with angst!  Don’t worry people, like I said, Kurt is doing his job, using a module called “Load Generator”.  I’d be a fool to see this as anything other than what it is: generating workload to test something.  I just get the added benefit of having a workload to do a blog post on using AWR data… :)

Now, if this were a real issue and I was trying to find out what type of performance impact this type of insert was creating on the environment, where would I go next in the AWR report?  The top SQL by elapsed time is important, as it should be where you focus your efforts.  Other sections broken down by SQL are nice to have, but always remember, “If you aren’t tuning for time, you’re wasting time.”  Nothing can come of an optimization exercise if no time savings is seen after you’ve completed the work.  So by taking first the Top SQL by Elapsed Time, then looking at the statement, we now can see what objects are part of the statement (large_block149, 191, 194, 145).

We also know that the problem is IO, so we should jump down from the SQL detailed information and go to the object level information.  These sections are identified by Segments by xxx.  

  • Segments by Logical Reads
  • Segments by Physical Reads
  • Segments by Read Requests
  • Segments by Table Scans

so on and so forth….

These all show a very similar pattern and percentage for the objects we see in our top SQL. Remember, Kurt was reading each of these tables, then inserting those same rows back into the table again, then rolling back. As this is a workload scenario, unlike most performance issues I see, there is no outstanding object showing an over 10% impact in any area.

pt11
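If you ever need the same segment-level picture outside of the formatted report, the underlying AWR views can be queried directly. Here is a rough sketch for physical writes; the snapshot range is a placeholder you would take from DBA_HIST_SNAPSHOT:

select * from (
  select o.owner,
         o.object_name,
         sum(s.physical_writes_delta) as physical_writes
  from   dba_hist_seg_stat     s,
         dba_hist_seg_stat_obj o
  where  s.dbid     = o.dbid
  and    s.obj#     = o.obj#
  and    s.dataobj# = o.dataobj#
  and    s.snap_id  between 100 and 101   -- placeholder snapshot range
  group  by o.owner, o.object_name
  order  by physical_writes desc
)
where rownum <= 10;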

As this is an Exadata, there is a ton of information to help you understand offloading (smart scans), flash cache, etc. that will assist in making sure you are achieving the performance you desire with an engineered system, but I’d like to save that for another post and just touch on a few of the IO reports. As we were performing table scans, we do want to make sure those were being offloaded to the cell nodes (smart scans) vs. being performed on a database node.

We can start by looking at Top Database IO Throughput:

pt14

And then view the Top Database Requests Per Cell Throughput (sans the cell node names) to see how they compare:

pt12

 

Then we add to it this last graph, Top IO Reasons by Request, which tells us whether we are offloading enough of the workload to each of the cell nodes:

pt15

 

This has been one scenario trip through an AWR Report.  Hopefully it’s been interesting and next time I’ll search for a different load type to take you through!

Happy Friday!

 

 

 

 

 



Copyright © DBA Kevlar [Working With AWR Reports From EM12c], All Right Reserved. 2015.

Data Center of the Future – now

9087977014_a5a2d4bc11_k

 

photo by youflavio

In a recent blog post Eric D. Brown defined an Agile Data Center as

An Agile Data Center is one that allows organizations to efficiently and effectively add, remove and change services at the speed of the business, not the speed of technology - Eric D. Brown

In a follow-up post he said that an Agile Data Center could be implemented with a Software Defined Data Center (SDDC), for example using machine virtualization to spin environments up and down.

With SDDC, it is possible for companies to replace their data center’s infrastructure with a virtualized environment and then deliver services and software as a service - Eric D. Brown

The question arises: what technologies constitute an agile SDDC? What software should be leveraged to succeed at having an agile data center, a SDDC? The most important software to look at is software that addresses the top constraints in the data center. As the theory of constraints says, any improvement not made at the constraint is an illusion. So what are the top constraints in the data center? The top constraint, as found after working with 100s of companies and surveying 1000s of companies, is provisioning environments for development and QA. Why is that? It’s because almost every industry now is finding itself to be more and more a software industry, from stock trading to booksellers to taxi companies to hotels. The competitive advantage is more and more about the software used to provide and sell the service. To build that software requires development and QA, and thus development and QA environments. The hard part of an environment to provision is no longer the machine, thanks to machine virtualization. The hardest part of the environment to provision is the data. Data that represents the production system is required to develop applications that use, display and manage that data. Data is typically kept in large complex databases such as Oracle, SQL Server, Sybase, Postgres and DB2.

Provisioning development and QA environments that rely on databases can be an expensive, slow endeavor. But like machine virtualization there is a new technology, data virtualization, that instead of making full physical copies makes one copy of each unique data block on the source, including a stream of changed blocks. With this “time flow” of unique blocks from the source database, data virtualization can provide copies in minutes, not by actually making copies but by providing pointers back to the existing blocks. These existing blocks are read/writeable thanks to redirect-on-write technology, which saves modified blocks in a different location than the original. It all sounds a bit complex, but that’s the beauty of data virtualization solutions. They take the complexity, wrap it up into an automated software stack and provide a simple interface and APIs to provision full developer environments, from the binaries, to the code files, to the most complex and difficult part of environment provisioning, which is provisioning full running copies of the data. Included in most data virtualization solutions is masking as well, since sensitive data is often required to be masked in development environments. Software defined data centers (SDDC) depend on machine virtualization and data virtualization.

What other technologies are also required?

Check out Software defined everything: Throwing more software into the mix.

Restrict the query optimizer through OPTIMIZER_FEATURES_ENABLE and Top-Frequency/Hybrid Histograms

To make upgrades easier, I regularly see people considering disabling query optimizer features by setting the OPTIMIZER_FEATURES_ENABLE initialization parameter to a non-default value. My general opinion about this “habit” is summarized in TOP with the following two sentences:

Changing the default value of the OPTIMIZER_FEATURES_ENABLE initialization parameter is only a short-term workaround. Sooner or later the application should be adapted (optimized) for the new database version.

The issue is that not all new features are disabled by the OPTIMIZER_FEATURES_ENABLE initialization parameter. For example, if you set it to 10.2.0.5 in version 12.1.0.2, you will not get exactly the 10.2.0.5 query optimizer. A particular case to be considered in 12.1 is related to the new types of histograms (top-frequency and hybrid). In fact, even though you set the OPTIMIZER_FEATURES_ENABLE initialization parameter to 10.2.0.5, the DBMS_STATS package can produce top-frequency and hybrid histograms. As a result, the estimations of the query optimizer can change. An example is provided by this script.

Note that setting the COMPATIBLE initialization parameter to, for example, 11.0.0 with the 12.1.0.2 binaries also does not prevent the DBMS_STATS package from gathering top-frequency and hybrid histograms.

So, if you are really planning to upgrade to 12.1 and to change the default value of the OPTIMIZER_FEATURES_ENABLE initialization parameter, in my opinion you should make sure that the DBMS_STATS package does not create top-frequency and hybrid histograms. Unfortunately, with the METHOD_OPT parameter, it is not possible to specify which types of histograms the DBMS_STATS package can create. The only input parameter of the GATHER_*_STATS procedures that affects this decision is ESTIMATE_PERCENT. When it is set to a numeric value greater than 0, the DBMS_STATS package does not create top-frequency and hybrid histograms. In other words, the DBMS_STATS package considers the creation of the new types of histograms only if the ESTIMATE_PERCENT parameter is set to the DBMS_STATS.AUTO_SAMPLE_SIZE constant (0).
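For example, a gather call along the following lines (the table name T is just a placeholder) bypasses the new histogram types, at the price of giving up the advantages of AUTO_SAMPLE_SIZE:

BEGIN
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'T',                         -- placeholder table name
    method_opt       => 'FOR ALL COLUMNS SIZE 254',
    estimate_percent => 100                          -- any value > 0: no top-frequency/hybrid histograms
  );
END;
/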

Another possibility to prevent the DBMS_STATS package from creating top-frequency and hybrid histograms is to set the undocumented preferences ENABLE_TOP_FREQ_HISTOGRAMS and ENABLE_HYBRID_HISTOGRAMS. For example, to set them globally, you can use the following PL/SQL block:

BEGIN
  dbms_stats.set_global_prefs(pname => 'ENABLE_HYBRID_HISTOGRAMS', pvalue => 0);
  dbms_stats.set_global_prefs(pname => 'ENABLE_TOP_FREQ_HISTOGRAMS', pvalue => 0);
END;
/

WARNING: Even though according to my tests the preferences do what I expect (i.e. disable top-frequency and hybrid histograms), I never used them in a real case and I did not ask Oracle about their usage. Hence, if you want to use them, I would not only test them carefully, but also ask Oracle Support whether their utilization can lead to known problems.

Before setting them, I also advise you to get their (default) values with a query like the following one:

SELECT dbms_stats.get_prefs('ENABLE_HYBRID_HISTOGRAMS') AS hybrid,
       dbms_stats.get_prefs('ENABLE_TOP_FREQ_HISTOGRAMS') AS top_freq
FROM dual;
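Whichever approach you choose, it is worth verifying afterwards which histogram types were actually created, for example with a query like the following (the table name T is a placeholder):

SELECT column_name, histogram
FROM   user_tab_col_statistics
WHERE  table_name = 'T'
ORDER  BY column_name;

The HISTOGRAM column shows NONE, FREQUENCY, HEIGHT BALANCED, TOP-FREQUENCY or HYBRID, so it is easy to spot whether one of the new 12c types slipped in.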