Oakies Blog Aggregator

Facebook Groups and Lists

For quite some time I’ve had a specific policy on how I use social networks.

  • Google+ : I have a regular G+ profile, which is public. I post whatever takes my fancy here, including Oracle and technology stuff. Anything posted on this profile is bounced across to Twitter using ManageFlitter.
  • Google+ (ORACLE-BASE.com) : I have a G+ page that is specific for Oracle and technology related links. I don’t post so much random stuff here.
  • Twitter (@oraclebase) : The usual junk you get on Twitter.
  • Facebook (ORACLE-BASE.com) : I have a Facebook page for those people who prefer to follow me on Facebook. All my tweets get forwarded to this Facebook page.

In addition to those I’ve had a regular Facebook profile for a long time, but I’ve been very specific about its use: I only accept first-life friends and family. With all the other ways of connecting to me, keeping one for myself didn’t seem selfish. Recently, I’ve been playing around with Facebook Groups and Facebook Lists in an attempt to allow connections to more people, but keep groups of people separated from each other. I don’t want to bore my friends with Oracle stuff and I don’t want to bore the Oracle community with tales of my crocodile wrestling.

I created some Facebook Groups and started accepting some Oracle-related people as friends, assigning them to a group called “Oracle”. I figured this would be like a Google+ Circle; it’s not. For a start, everyone in the group can see everyone else in the group and they can see what the group is called, so don’t call it “People I Hate!”. :) There are a variety of security options, but none of them really did what I was looking for. I pretty quickly removed the groups and wrote to everyone saying it was not a snub. I just didn’t want to be the leader of some new communities. :) If you are into building communities in Facebook, groups seem like a pretty good idea. You can be a dictator, or let other people in the group join in the administration.

The next thing I tried was Facebook Lists. These are a lot more like Google+ Circles. Hover over the “Friends” section on the left-hand side of the page and a “More” link appears. Click on the link and you can see all the lists you’ve already got, which include smart lists created automatically by Facebook. You can create new lists and manage existing lists from here. When you accept a friend request, you can select the relevant list for the contact. There are some standard lists that come in handy, like “Restricted” and “Limited Profile”. If I’ve not actually met someone before, they tend to get put on one of these lists. This is not so much to hide stuff I post as to provide some layer of protection to my other contacts. I don’t see why something one of my non-Oracle friends posts should be visible to someone I’ve never met. OK, that’s the price you pay for getting involved in social networks, but I don’t want it to be my fault someone else’s posts become public. When you write a status update, you can select which list it is visible to. Alternatively, you can click on the list of interest, then post the status update.

I’m still not sure if altering my policy on Facebook usage was the correct thing to do. I also reserve the right to unfriend everyone and revert to my previous policy at any time. :)

Cheers

Tim…

 


Facebook Groups and Lists was first posted on March 22, 2014 at 1:54 pm.
©2012 "The ORACLE-BASE Blog". Use of this feed is for personal non-commercial use only. If you are not reading this article in your feed reader, then the site is guilty of copyright infringement.

ASH Analytics- Activity Focused on Resource Consumption, Part II

This is the second post on ASH Analytics detail. You can find the first post here.

In this post we’re going to work through more Activity data within ASH Analytics, focusing on Resource Consumption, which you can find under the Activity button once you are in ASH Analytics. (You may want to stop refreshes, which can be done at the upper right in the refresh settings.)

ash_0321_resource_consum

We’ll be working with the last three options in this menu, marked off in red, as the Wait Class option is just a similar view of what we already get in the main Activity graph.

Wait Class

I’m sure you’ve already noticed, but ASH Analytics arranges its graphs from the heaviest usage at the bottom to the lightest at the top, which makes them easier to read. This is the case for pretty much all the activity graphs, whereas Load Maps arrange largest to smallest, left to right.

ash_0321_main

Nope, not much new here. We are still seeing the resource consumption, but as wait events; if we compare it to the standard wait events view, not much is different. Where we would really find value in this view is if we had IORM (IO Resource Manager) enabled. Since we don’t, we’re going to skip it for now; it will be part of a future blog post… :)

Wait Event

When we filter by Wait Event, we get a very different view of our data than we did by Wait Class. Note the actual categories the graph is broken up by, listed down the right-hand side.

ash_0321_wait_event

Scanning from the bottom up on the right-hand side, we can see the largest resource consumption is “LNS Wait on SENDREQ” and the second highest consumer is “LGWR- LNS Wait on Channel”. You are also seeing log file sync and log file parallel write waits. All of this adds up to a clear understanding of what we are dealing with here… Why? This is a Data Guard environment, and these LNS* type waits are quite common and occur when:

  • LGWR writes redo to the online redo log on this primary database (when LGWR SYNC is not used, user commits are acknowledged once this step completes, except when the parameter COMMIT NOWAIT is used).
  • The Data Guard LNS process on this primary database performs a network send to the Data Guard RFS process on the standby database. For redo write sizes larger than 1 MB, LNS will issue multiple network sends to the RFS process on the standby.
  • LNS posts LGWR that all the redo has been successfully received and written to disk by the standby, which is a heavy consumer as well.

On the primary database, extended resource consumption is represented by the “log file parallel write” wait event; we also see the repeated “LNS wait on SENDREQ” wait event. You can further divide the “LNS wait on SENDREQ” wait event into network time and RFS I/O time by subtracting the “RFS write” wait event obtained on the standby. If you need further proof, these wait events can be assessed on the standby by using the queries in the “Data Guard Specific Wait Events” section for a physical standby, or by using AWR for a logical standby.
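If you’d like to sanity-check what the graph is showing you outside of the console, a minimal sketch like the one below approximates the same breakdown straight from ASH (this assumes you are licensed for the Diagnostics Pack, and treats each ASH sample as roughly one second of DB time):

-- Approximate DB time per wait event over the last hour, from ASH.
select event,
       count(*) samples
from   v$active_session_history
where  session_state = 'WAITING'
and    sample_time > sysdate - 1/24
group  by event
order  by samples desc;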

Object

I’ve hidden the exact object names involved in this database, but you’ll note they weren’t involved in much of the resource consumption anyway. On an environment with heavy IO this would change significantly, and you would see a lot more of one or more objects being the focus of this type of graph. This is where having the Top Activity box checked is helpful, as it clearly shows you that object IO is not much of a focus for this environment. The black line showing the upper max of activity for any given time in the activity graph gives you a clear line to compare with.

ash_0321_object

Blocking Session

This is where many of you will become very interested again. We all want to know about blocking sessions and how they may be impacting the resource consumption in our environment. Although there was no *true* impact to production processing, there were some blocking sessions we could inspect in our current example.

Only sessions that are experiencing some type of blocking are displayed in this view. No matter if it’s transactional (TX - transaction lock contention, TM - DML enqueue contention, etc.) or UL (user defined), these sessions will be shown in this graph.

ash_0321_blocking_sess

I’ve included the bottom portion of the ASH Analytics section so you’re able to see the SQL_IDs involved in the sessions; to the right, you can also view the session information, displayed by Instance ID (for RAC), Session ID (SID) and Serial Number (serial#).
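If you want to pull the same picture straight from the ASH data, a minimal sketch along these lines (again assuming the Diagnostics Pack license) lists the top blockers by sampled time:

-- Sketch: top blocking sessions by sampled time over the last hour.
select blocking_session,
       blocking_session_serial#,
       sql_id,
       count(*) samples
from   v$active_session_history
where  blocking_session is not null
and    sample_time > sysdate - 1/24
group  by blocking_session, blocking_session_serial#, sql_id
order  by samples desc;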

You can then click on a session in the lower section, which will filter the ASH graph by that session and the wait class to give you another view, demonstrating in our example that we are waiting on commits and then our favorite wait, “Other”… :)

ash_0321_blocking_sess_det

What’s helpful is that the SQL_ID is clearly shown on the left, and you’re able to click on any of these (although I’d highly recommend clicking on the top one, which is five times the concern of the others. :))

This completes the review of the Resource Consumption menu for the ASH Analytics Activity page. We’ll continue in Part III later this week!



Copyright © DBA Kevlar [ASH Analytics- Activity Focused on Resource Consumption, Part II], All Right Reserved. 2014.

ASH Analytics- Activity Focused on SQL Data, Part I

I have a request to hit on some EM performance blog posts, so we’re going to start with breaking down some of the ASH Analytics areas.  ASH Analytics is not your grandfather’s “Top Activity”, and I recommend everyone begin to embrace it, as it is the future of performance activity in Enterprise Manager.  The idea that we are able to pull directly from ASH and AWR to present our performance data via EM12c is exciting, to say the least.  The added accuracy and value of the aggregated historical data must be recognized as well.

The standard output for ASH Analytics looks quite similar to Top Activity in that it focuses on activity graphed out by wait events, but I’m going to dig into different ways to present the activity data, as they may answer questions that simply won’t show via wait event graphing.

When you first enter the ASH Analytics interface, you’ll be presented with the section at the top, which displays the timeline for your examination, and below it the actual wait event graph, as seen here in our example:

ash_standard

You’ll note that we haven’t used any filters, so we’re viewing all activity data and the standard wait classes we are accustomed to viewing, i.e. I/O (dark blue), CPU (kelly green) and System (light blue).  We can see the system is rather busy during multiple intervals, crossing the CPU cores line displayed in red across the image.

Now let’s display the data in a different format.  This is done by changing the Wait Class, shown under the Activity button, to another filtering type.

ash_menu

The section highlighted in red is where we will be spending our time today.  I’ll display the same window of time we saw in the standard wait event view, but focus on the SQL-defined displays, and we’ll go over how they might assist you in troubleshooting an issue.

SQL_ID

Now this one might seem a bit self-explanatory, but displaying the data from the activity pane appears very different than, let’s say, the load map:

ash_bld_out

Now let’s look at it from the Activity View defined by SQL_ID:

ash_by_sqlid

I find the load map an excellent way to get my point across to non-technical folks, but I’m a DBA: I’m used to spending my time looking at the data in terms of activity, so this is a more comfortable view for me, moving from seeing the data in terms of wait events to seeing the percentage of activity allocated to each SQL_ID.

We can click on the SQL_ID displayed on the right to go to its detailed ASH page, which will show you all data pertaining to that SQL_ID and the wait information involved with it.  By clicking on different sections, we’re not just digging down into more detailed information; keep in mind, we are also “filtering” out more data that could have been masking valuable information.
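For comparison, you can approximate the same ranking directly from the ASH data with a sketch like this (Diagnostics Pack license assumed):

-- Sketch: share of sampled activity per SQL_ID over the last hour.
select sql_id,
       count(*) samples,
       round(100 * ratio_to_report(count(*)) over (), 1) pct_activity
from   v$active_session_history
where  sql_id is not null
and    sample_time > sysdate - 1/24
group  by sql_id
order  by samples desc;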

Top SQL_ID

You may also think that this one is not going to look any different than the last; that’s where you’re wrong.  This is not the SQL_ID that has resources allocated to it in activity, but the top-level SQL_ID that executed the active SQL_ID.  This is helpful if you are trying to locate which packages and procedures should be first on the list for code review, or if you want to quickly locate the package or procedure responsible for a specific SQL_ID.

ash_top_level_sqlid

These can then be tied to sessions by clicking on links and then traced to users, hosts, etc.  If I click on one of the top SQL_IDs, it will take me to all SQL_IDs involved in that Top SQL_ID and all the wait events, displayed in the same graph timeline.  From there, I can then dig down into the data pertaining to the waits and the SQL_IDs involved as part of the Top SQL_ID, or even switch to other views, such as a load map, to present the data in another format for peers to view more easily.

Force Matching Signature

Force matching signature is a way of ensuring that SQL is using the same plan/profile, etc. even when it has literals in it.  It’s kind of like setting cursor_sharing=FORCE, but in a “hinted” way throughout a database.  This can be both good and bad: let’s say it forces the value to “1”, where the value “1” really only makes up 2% of the rows, and it would have been better if the optimizer knew what it was working with.

ash_force_matching
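To see which of your statements would share a signature, a hedged sketch against v$sql like the following lists signatures that cover more than one cursor, i.e. statements identical except for their literals:

-- Sketch: force matching signatures shared by multiple cursors.
select force_matching_signature,
       count(distinct sql_id) cursor_count,
       sum(executions) total_execs
from   v$sql
where  force_matching_signature <> 0
group  by force_matching_signature
having count(distinct sql_id) > 1
order  by cursor_count desc;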

SQL Plan Hash Value

You should be seeing this next image [hopefully] as a mirror of the SQL_ID one.  We do like plan stability, and the idea of a ton of plan hash values changing erratically is enough to make most of us feel queasy.  Having a quick visual confirmation that we’re not experiencing a lot of plan changes can be quite helpful.  It’s not foolproof; again, this is a visual representation of the data, but it’s helpful.

ash_sql_plan_hash
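As a cross-check outside the graph, a quick sketch over ASH can flag any SQL_ID that ran with more than one plan hash value in the window you’re looking at:

-- Sketch: SQL_IDs showing more than one plan hash value in the last hour.
select sql_id,
       count(distinct sql_plan_hash_value) plan_count
from   v$active_session_history
where  sql_id is not null
and    sql_plan_hash_value <> 0
and    sample_time > sysdate - 1/24
group  by sql_id
having count(distinct sql_plan_hash_value) > 1
order  by plan_count desc;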

SQL Plan Operation

If you’re a DBA managing a data warehouse, you know when any DML is running and on what schedule.  The SQL Plan Operation view can give you quick verification if something is amiss and there are updates, deletes or inserts happening that shouldn’t be.  The percentage of activity can also quickly tell you a change has occurred vs. the normal database activity.

ash_sql_oper

You can also see just how much of your activity is going to certain type of processing.

SQL Plan Operation Line

In the plan operation line view, you can see the operation type for the process, along with the description.  The process is then broken down by both the SQL_ID and the step the statement is performing.

ash_sql_plan_op_line

If you hover over any SQL_ID listed, it will also show this to you in the highlighted area:

spec_load_type_op

SQL OpCode

This one is self-explanatory.  We are simply looking at the activity level per statement type.  It is easy to see that during our busy intervals, queries were around a third of the activity, as were inserts.  This view can be helpful if you are retaining screenshots at different snapshot intervals for comparison.

ash_sql_op_code

Top Level SQL Opcode

You can also view the data simply by operation code, displayed by each code.  Notice that these are displayed by Instance number (if on RAC), Session ID (SID) and Serial number (serial#).  This is a global view of the operations that are occurring in the environment and can offer a clear view of activity, but at an operation type level.

 

ash_sql_op_code

We’ve just touched on what the SQL section of the ASH Analytics Activity view can offer.  Stay tuned and we’ll dig into each section and each area of this great interface for performance monitoring and analysis.


Copyright © DBA Kevlar [ASH Analytics- Activity Focused on SQL Data, Part I], All Right Reserved. 2014.

Oracle : buffer busy wait

 

Oracle 10 and 11

Buffer busy waits on Oracle 10 and 11 happen mainly because of insert contention on tables or indexes. There are a few other rare cases of contention on old style RBS segments, file header blocks and freelists.
Before Oracle 10 there was one other major reason: readers waiting for readers, i.e. one user does a physical IO of a block into memory and a second user wants to read that block. The second user waits until the IO is finished by the first user. Starting in 10g this wait has been given the name “read by other session”. Before Oracle 10g this was also a “buffer busy wait”.
The easiest way to analyse the bottleneck and find a solution is to use ASH (Active Session History), available in Oracle 10g with the Diagnostics Pack license, using Simulated ASH for free, or using a product like DB Optimizer.
The data block class, which can be found in ASH, is the most important piece of information in analysing buffer busy waits. If we know the block class we can determine what kind of bottleneck we are dealing with:
 If CLASS =
    1. data block
      • If OTYPE =
        • INDEX, then the insert index leaf block is probably hot; solutions are:
          • Hash partition the index
          • Use a reverse key index
        • TABLE, then the insert block is hot; solutions are:
          • Use free lists
          • Put the object in an ASSM tablespace
    2. segment header - If “segment header” occurs at the same time as CLASS=“data block” on the same object, and the object is of OTYPE=“TABLE”, then this is just confirmation that the TABLE needs to use free lists or ASSM.
    3. file header block - Most likely extent allocation problems; look at the extent size on the tablespace and increase it so there are fewer extent allocations and less contention on the file header block.
    4. free lists - Add free list groups to the object.
    5. undo header - Not enough UNDO segments; if using old RBS then switch to AUM.
    6. undo block - Hot spot in UNDO; application issue.
How do we find the block class? With a quick query on the ASH data like:
select
       o.object_name obj,
       o.object_type otype,
       ash.SQL_ID,
       w.class
from v$active_session_history ash,
     ( select rownum class#, class from v$waitstat ) w,
      all_objects o
where event='buffer busy waits'
   and w.class#(+)=ash.p3
   and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;
For example:

OBJ    OTYPE  SQL_ID        CLASS
------ ------ ------------- ------------------
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf data block
TOTO1  TABLE  8gz51m9hg5yuf segment header
TOTO1  TABLE  8gz51m9hg5yuf data block
If we find that CLASS=data block, then we will want more information to diagnose, such as the object type (OTYPE), the object name and what kind of tablespace the object is stored in. The following query provides that information:

set linesize 120

col block_type for a20
col objn for a25
col otype for a15
col filen for 9999
col blockn for 9999999
col obj for a20
col tbs for a10
select
       bbw.cnt,
       bbw.obj,
       bbw.otype,
       bbw.sql_id,
       bbw.block_type,
       nvl(tbs.name,to_char(bbw.p1)) TBS,
       tbs_defs.assm ASSM
from (
    select
       count(*) cnt,
       nvl(object_name,CURRENT_OBJ#) obj,
       o.object_type otype,
       ash.SQL_ID sql_id,
       nvl(w.class,'usn '||to_char(ceil((ash.p3-18)/2))||' '||
                    decode(mod(ash.p3,2),
                         1,'header',
                         0,'block')) block_type,
       --nvl(w.class,to_char(ash.p3)) block_type,
       ash.p1 p1
    from v$active_session_history ash,
        ( select rownum class#, class from v$waitstat ) w,
        all_objects o
    where event='buffer busy waits'
      and w.class#(+)=ash.p3
      and o.object_id (+)= ash.CURRENT_OBJ#
      and ash.session_state='WAITING'
      and ash.sample_time > sysdate - &minutes/(60*24)
      --and w.class# > 18
   group by o.object_name, ash.current_obj#, o.object_type,
         ash.sql_id, w.class, ash.p3, ash.p1
  ) bbw,
    (select   file_id, 
       tablespace_name name
  from dba_data_files
   ) tbs,
    (select
 tablespace_name    NAME,
        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM,
        initial_extent
     from dba_tablespaces 
   ) tbs_defs
  where tbs.file_id(+) = bbw.p1
    and tbs.name=tbs_defs.name
Order by bbw.cnt
/
and the output looks like:

  CNT OBJ     OTYPE   SQL_ID        BLOCK_TYPE       TBS        ASSM

----- ------- ------- ------------- ---------------- ---------- ------
    3 TOTO1   TABLE   8gz51m9hg5yuf segment header   NO_ASSM    MANUAL
   59 TOTO1   TABLE   8gz51m9hg5yuf data block       NO_ASSM    MANUAL
Oracle 7, 8 and 9

Before Oracle 10, buffer busy waits also happened because of IO, one user’s read blocking another user wanting to do the same IO. On Oracle 9, the main reasons for buffer busy waits are:

1)       IO read contention (only Oracle 9i and below)
2)       Insert block contention on tables or indexes
3)       Rollback segment contention

On  7.0  - 8.1.5 see http://sites.google.com/site/embtdbo/oracle-buffer-busy-wait/oracle-buffer-busy-wait-7-8-1-5

On versions 8 and 9, the p3 value has a different meaning. Instead of meaning the block type (which is the best thing to know), it means the kind of buffer busy wait. There are only two ranges of values that matter to us:

100 range = read waits (basically just an IO wait): a reader blocking another reader, i.e. one session is reading a block into memory and a second session wants to read that block, so it waits on a buffer busy wait with p3=130.

200 range = write contention (same as in 10g): writers blocking other writers, for example while doing inserts, either because of no free lists on the table or because everyone is inserting into the same index block.

If you have set up ASH style collection with S-ASH or have a product like DB Optimizer you can run a query like:

select

       count(*) cnt,
       o.object_name obj,
       o.object_type otype,
       ash.CURRENT_OBJ#,
       ash.SQL_ID,
       decode(substr(ash.p3,1,1),1,'read',2,'write',p3) p3
from v$active_session_history ash,
      all_objects o
where event='buffer busy waits'
   and o.object_id (+)= ash.CURRENT_OBJ#
group by o.object_name, o.object_type, ash.sql_id, ash.p3,ash.CURRENT_OBJ#
order by cnt
/
And see what kind of buffer busy waits there are and what the objects are:

 CNT OBJ     OTYPE   CURRENT_OBJ#     SQL_ID P3

--- ------- ------- ------------ ---------- ------
  1                           -1 1375352856 read
  2                           -1  996767823 read
  2                           -1 2855119862 write
 17                           -1 1375352856 write
 89 TOTO1   TABLE         296030 1212617343 write
109                       296022 1212617343 write

Often the CURRENT_OBJ# is -1, so we can’t figure out what the object is. There is an alternative method:

col block_type for a18

col objn for a25
col otype for a15
col event for a15
col blockn for 999999
col segment_name for a20
col partition_name for a15
col owner for a15
set timing on
/*
drop table myextents;
create table myextents as select * from dba_extents;
*/
select
       count(*),
       ext.owner,
       ext.segment_name,
       ext.partition_name,
      ext.segment_type,
        decode(substr(ash.p3,1,1),1,'read',2,'write',p3) p3
       --ash.p1,
       --ash.p2
from v$active_session_history ash,
     myextents ext
where
       event = 'buffer busy waits'
   and ( current_obj# = -1 or current_obj#=0  or current_obj# is null )
   --and sample_time > sysdate - &minutes/(60*24)
   --and session_state='WAITING'
   and  ext.file_id(+)=ash.p1 and
        ash.p2 between  ext.block_id and ext.block_id + ext.blocks
group by
       ext.owner,
       ext.segment_name,
       ext.partition_name,
       ext.segment_type,
       p3
       --ash.p1,
       --ash.p2,
       --ash.sql_id
Order by count(*)
/
Because querying DBA_EXTENTS  is a slow operation, I made a copy of DBA_EXTENTS which will be faster to query.

CNT OWNER  SEGMENT_NAME   PARTITION_NAME  SEGMENT_TYPE  P3

--- ------ -------------- --------------- ------------- --------
  1 SYS    _SYSSMU2$                      TYPE2 UNDO    read
  1 SYS    _SYSSMU3$                      TYPE2 UNDO    write
This second option of getting the object from P1 and P2 (file and block) should probably be done only with the user’s consent, because we would have to create a copy of the dba_extents table, which might take a long time if it’s big.
No ASH ?

If you don’t have ASH data you will have to do some guess work.
Block Class (block type)

The first step in finding out the source of buffer busy waits is looking at

     V$WAITSTAT
This will tell us what kind of data blocks we have contention on.
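For example, a quick look at the counts per block class:

select class, count, time
from   v$waitstat
order  by count desc;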
File with contention

You can also get an idea of what file contains the object with the buffer busy waits by looking at:

    X$KCBFWAIT
Object with contention

Starting in version 9i there is the table

    v$segstat
which will list the objects with buffer busy waits.
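Its companion view v$segment_statistics includes the object names, so a quick sketch like this shows which segments have accumulated buffer busy waits:

select owner, object_name, object_type, value
from   v$segment_statistics
where  statistic_name = 'buffer busy waits'
and    value > 0
order  by value desc;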
If you are on version 7 or 8 good luck finding the object without setting up ASH style data collection.
Why do buffer busy waits happen?

To put it most succinctly, buffer busy waits happen because two users want to change a block at the same time. Two users can change the same block, or even the same row, “at the same time”, i.e. without committing, but that’s different from the actual operation of modifying the block. The modification of the block in RAM, or computer memory, can only be done by one process at a time in order to avoid memory corruption. Different users can modify different blocks at the same time, but only one user or process can modify the same block at a time.
In order to really understand what’s going on we have to take a look at how Oracle manages memory and block access and modifications.
Here is the layout of the essential parts of Oracle in regards to performance tuning.
In the machine memory are
  •     Oracle’s SGA, or System Global Area, a memory that is shared between Oracle users
  •     LGWR – log writer process
  •     DBWR – database writer process
  •     User1,2,3 … – user processes, in this case “shadow processes”

On the machine file system are

  • Redo log files
  • Data files
The SGA is composed of (among other things)
  • Log Buffer
  • Library Cache
  • Buffer Cache
What’s important for understanding buffer busy waits is how the buffer cache is managed. Here is a view of the buffer cache with more components:
In order to access a block, a user (shadow process) has to get a latch (the cache buffer chains latch) which protects buckets, or linked lists, of buffer headers. Once the desired header is found, the latch is released. The buffer headers point to the actual data blocks in memory. Before modifying a block in memory, a user has to lock the buffer header. The buffer header is locked any time a modification is made, whether it is reading a block into memory or modifying a block that is already in memory. Usually the header is locked only for a brief amount of time, but when there is a lot of concurrent access the buffer header can become a bottleneck.
BBW when reading data – read by other session

A buffer busy wait can happen on Oracle 7, 8 and 9 when one user is reading a block into memory and a second user wants to read that block. Instead of the second user trying to read that block into memory as well, they just wait for the first user to finish. Starting in Oracle 10, this kind of wait was renamed “read by other session”.

BBW on insert
If multiple concurrent users are inserting into a table that doesn’t have free lists or is not in an ASSM tablespace, then all users will end up inserting into the same block, the first one on the free list, and this block will become the hot block.
By adding free lists or moving the table to an ASSM tablespace we will alleviate the bottleneck.
Multiple free lists:
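A minimal sketch of the free list approach, reusing the hypothetical TOTO1 table from the examples above (this is only meaningful for a segment in a freelist-managed, non-ASSM tablespace):

alter table toto1 storage (freelists 4);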
The other option is ASSM, or Automatic Segment Space Management, which is set at the tablespace level.
In this case free block information is kept in Level 1 BMBs (bitmapped blocks). These Level 1 BMBs are chosen by a hash on the user’s process ID, thus distributing the inserts across the table.
The inserts would look something like this (somewhat exaggerated drawing):
The ASSM BMB blocks take up more space in the table, about 1 extra block for every 16 data blocks, and there is overhead in first looking at the header/Level 3 BMB block, then going to Level 2, then Level 1, and finally to the data block; but all in all, ASSM is worth the reduced management costs versus free lists.
Identifying and creating ASSM tablespaces
Which tablespaces are ASSM or not?

select

        tablespace_name,
        extent_management  LOCAL,
        allocation_type    EXTENTS,
        segment_space_management ASSM,
        initial_extent
from dba_tablespaces

TABLESPACE_NAME LOCAL      EXTENTS   ASSM

--------------- ---------- --------- ------
SYSTEM          LOCAL      SYSTEM    MANUAL
UNDOTBS1        LOCAL      SYSTEM    MANUAL
SYSAUX          LOCAL      SYSTEM    AUTO
TEMP            LOCAL      UNIFORM   MANUAL
USERS           LOCAL      SYSTEM    AUTO
EXAMPLE         LOCAL      SYSTEM    AUTO
DATA            LOCAL      SYSTEM    MANUAL
Creating an ASSM tablespace:

create tablespace data2 

datafile '/d3/kyle/data2_01.dbf' 
size 200M
segment space management auto;

BBW on index (because of insert)

If users are inserting data that has a rising key value, especially a monotonically rising value, then all the new inserts will have to update the leading edge leaf block of the index, and with high concurrent inserts this can cause buffer busy waits.
Solutions (sketched below):
Hash partition the index
Reverse key index
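Hedged sketches of both solutions, on a made-up index over a monotonically rising ID column:

-- Reverse key index: spreads sequential key values across leaf blocks.
create index toto1_id_rev_ix on toto1(id) reverse;

-- Hash partitioned (global) index: spreads inserts across 16 sets of leaf blocks.
create index toto1_id_hash_ix on toto1(id)
  global partition by hash (id) partitions 16;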

BBW on old style RBS

IF block class > 18 it’s an old style RBS segment

Select  CURRENT_OBJ#||' '||o.object_name objn,

          o.object_type otype,
          CURRENT_FILE# filen,
          CURRENT_BLOCK# blockn,
          ash.SQL_ID,
          w.class ||' '||to_char(ash.p3) block_type
from v$active_session_history ash,
      (select rownum class#, class from v$waitstat ) w,
       all_objects o
where event='buffer busy waits'
    and w.class#(+)=ash.p3
    and o.object_id (+)= ash.CURRENT_OBJ#
Order by sample_time;    

OBJN        OTYPE  FILEN  BLOCKN SQL_ID        BLOCK_TYPE

----------- ------ ------ ------ ------------- ------------
54962 TOTO1 TABLE     16   45012 8gz51m9hg5yuf data block 
54962 TOTO1 TABLE     16     161 8gz51m9hg5yuf segment header
0                     14       9 8gz51m9hg5yuf  87
0                     14       9 8gz51m9hg5yuf  87

If the block is of class > 18, then there will be no object name, so we have to look it up ourselves to be sure:

select  segment_name,  

            segment_type
from     dba_extents 
where
     &P2  between 
    block_id and block_id + blocks - 1
     and 
     file_id = &P1 ;

Plug in 14 for P1, the file #, and 9 for P2, the block number:

SEGMENT_NAME   SEGMENT_TYPE

-------------- --------------
R2             ROLLBACK
Solution
Move to the new AUM, or Automatic Undo Management:
alter system set undo_management=auto scope=spfile;
BBW on a file header
The ASH data has two different fields that indicate the file # and block # when the wait is a buffer busy wait.
For a buffer busy wait:
    File # = p1  *and*  File # = current_file#
    Block # = p2  *and*  Block # = current_block#
If p1 != current_file# or p2 != current_block#, then use p1 and p2. They are more reliable.
for example

Time   P1  P2 OBJN OTYPE FN BLOCKN BLOCK_TYPE

----- --- --- ---- ----- -- ------ -----------------
11:44 202   2 -1          0      0 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
11:44 202   2 TOTO TABLE  1  60218 file header block
Notice P1 != FN (FN is CURRENT_FILE#) and P2 != BLOCKN (BLOCKN is CURRENT_BLOCK#).
The real file # is P1 = 202 and the block # is P2, which is 2.
In my database I only had 10 files, so what is this file # 202?! (File numbers above the db_files parameter, which defaults to 200, refer to tempfiles, so this is tempfile 2.)
Solution
If you are getting buffer busy waits on the file header block of a tempfile (a datafile in a temporary tablespace), then try increasing the “next extent” size in the temporary tablespace. This wait can happen when lots of extents are being allocated in the temporary tablespace.
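One hedged way to do that is to recreate the temporary tablespace with a larger uniform extent size (the file name and sizes here are made up):

create temporary tablespace temp2
   tempfile '/u01/oradata/temp2_01.dbf' size 1g
   extent management local uniform size 10m;

alter database default temporary tablespace temp2;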
What would ADDM do?
Interestingly enough, the ADDM page doesn’t show the new load that has recently come onto the system, but the analysis is there. I clicked on the next-to-bottom line on the page, “Read and write contention on database blocks was consuming significant database time”.
Here are the outputs for the different scenarios:

  • inserts into a table contention
  • inserts into a table with contention on an index
  • RBS contention
  • File Header Contention

Health Care Crises in Application Development

if someone fraudulently uses your information for medical services or drugs, you could be held liable for the costs

The demand for healthcare application development has been exploding over the past couple of years because of:

  • Obamacare – the Affordable Care Act
  • Regulatory pressure – the HITECH and HIPAA privacy acts
  • ICD-10
  • Pro-active health care (versus reactive health care)
  • Mobile devices

but developing applications for health care requires the data to be masked. Why does masking data matter, and matter especially for health care? If patient information gets out it can be quite damaging. One heuristic for the importance of healthcare information is that on the black market, health care information on an individual tends to sell for 100x the credit card information for an individual. Imagine that someone needs health coverage and they swipe the health care information of someone else, giving them free treatment. The value of the “free treatment” can well exceed the maximums on a credit card. Also imagine the havoc it can cause for the original individual if someone jumps onto their health care. Important information like blood type can be logged incorrectly, or the person may have HIV logged against them when they themselves are clear. It can take years to repair the damage, or it may never be repaired if the damage is fatal.

What do Britney Spears, George Clooney, Octomom (Nadya Suleman)and the late Farah Fawcett have in common? They are all victims of medical data breaches! … How much would a bookie pay to know the results of a boxer’s medical checkup before a title bout? What would a tabloid be willing to pay to be the first to report a celebrity’s cancer diagnosis? Unfortunately it doesn’t stop there and the average citizen is equally a target.  

When data gets to untrusted parties it is called leakage. To avoid leakage, companies use masking. Masking is a form of data mediation or transformation that replaces sensitive data with equally valid fabricated data. Masking data can be more work on top of the already significant work of provisioning copies of a source database to development and QA. Development and QA can get these database copies in minutes for almost no storage overhead using Delphix (as has been explained extensively in previous blogs), but by default these copies, or virtual databases (VDBs), are not masked.  Without Delphix, masking database copies in development and QA would require masking every single copy; with Delphix one can provision a single VDB, mask that VDB, and then clone, in minutes and for almost no storage, as many masked copies of that first masked VDB as needed.

[Diagram: Delphix provisioning masked virtual databases from a linked source]

In the above graphic, Delphix links to a source database and keeps a compressed version along with a rolling time window of changes from the source database. With this data Delphix can spin up a clone of the source database anywhere in that time window. The clone can be spun up in a few minutes and takes almost no storage, because it initially shares all the duplicate blocks on Delphix. This first VDB can be masked, and then clones of the masked VDB can be made in minutes for almost no extra storage.

With Delphix in the architecture, making masked copies is fast, easy and efficient. The first VDB that is masked will take up some extra storage for all the changed data. This VDB can then become the basis for all other development and QA masked copies, so there is no need to worry about whether or not a development or QA database is masked. Because the source for all development and QA copies is masked, there is no way for any unmasked copies to make it into development and QA. Without the secure architecture of Delphix it becomes more complicated to verify and enforce that each copy is indeed masked. By consolidating the origins of all the downstream copies into a single set of masked shared data blocks, we can rest assured that all the downstream versions are also masked. The cloning interface in Delphix also logs all cloning activity, and chain of custody reports can be run.

How do we actually accomplish the masking? Masking can be accomplished with a number of technologies available in the industry. With Delphix these technologies can be run on a VDB in the same manner that they are currently used with regular physical clone databases. Alternatively, Delphix has hooks in the provisioning process where masking tools can be leveraged before the VDB is fully provisioned out.

Delphix has partnered with Axis Technology to streamline and automate the masking process with virtual databases. Look for upcoming blog posts to go into more detail about Axis and Delphix.

 

Golden rules of RAC performance diagnostics

After collaborating with many performance engineers on RAC databases, I have come to realize that there are common patterns among the (mis)diagnoses. This blog post discusses those issues. I also talked about this at the Hotsos 2014 conference.

Golden rules

Here are the golden rules of RAC performance diagnostics. These rules may not apply to general RAC configuration issues, though.

  1. Beware of top event tunnel vision
  2. Eliminate infrastructure as an issue
  3. Identify problem-inducing instance
  4. Review send-side metrics also
  5. Use histograms, not just averages

This may be better read as a document, so please use the pdf files of the presentation and the paper. Presentation slide #10 gives in-depth coverage of the gc buffer busy* wait events. I will try to blog about that slide later (hopefully).

Golden rules of RAC diagnostics paper

Golden rules of rac diagnostics ppt

Scripts mentioned in the presentation can be downloaded here.

scripts

IOUG Collaborate 2014 and EM12c Odyssey

Collaborate is coming up fast and this year is going to be a great conference for all involved.  I’ll have a number of sessions, both for those attending in person and those taking advantage of the virtual content.

As a virtual speaker, I’ve been offered a virtual pass that I can offer to one lucky recipient.  This will grant you access to the two virtual tracks of manageability and performance offered by IOUG!  I will award this virtual registration code by the end of the week to the person who can come up with the best name for the new VM I’m building.  It better be good, too… :)

Tweet your database name submission with #C14LV and #NameMyEM to me, @DBAKevlar.

As for some great content… :) I’ll be presenting three sessions this year for Oracle ((v) signifies a session that is both virtual and in person):

  • Mastering Enterprise Manager 12c Monitoring(v)   April 9th, 2-3pm
  • Deep Dive into ASH and AWR(v)  April 8th, 5:30-6:30pm
  • Database as a Service, (DBaaS) in a DBAs World April 9th, 4:30-5:30pm

You’ll also be able to find all of my wonderful teammates presenting more fantastic Oracle content at the sessions listed here.

See YOU at Collaborate 14!!




Copyright © DBA Kevlar [IOUG Collaborate 2014 and EM12c Odyssey], All Right Reserved. 2014.

RLS bug

RLS – row level security, aka VPD (virtual private database) or FGAC (fine grained access control) has a critical bug in 11g. The bug is unpublished, but gets mentioned in various other documents, so can be identified as: Bug: 7828323 “SYS_CONTEXTS RETURNS WRONG VALUE WITH SHARED_CONTEXT_SENSITIVE”

The title tells you nearly everything you need to know – if you’ve declared a security policy as context_sensitive or shared_context_sensitive then a change to the context ought to result in the associated predicate function being called to generate a new security predicate the next time the policy becomes relevant. Thanks to bug 7828323 this doesn’t always happen – so queries can return the wrong set of results.

There are some patches for older versions (11.1.0.7 and 11.2.0.2 as far as I’ve checked), but if you don’t have, or can’t get, a patch the “workaround” is to change any relevant policies to dynamic; unfortunately the consequence of this is that the predicate function will then be called for every execution of any statement against any objects protected by that policy.
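A minimal sketch of that workaround, with made-up schema, table, function and policy names (the real calls will depend on how the policy was originally declared):

begin
   dbms_rls.drop_policy(
      object_schema => 'APP',
      object_name   => 'ORDERS',
      policy_name   => 'ORDERS_RLS'
   );
   dbms_rls.add_policy(
      object_schema   => 'APP',
      object_name     => 'ORDERS',
      policy_name     => 'ORDERS_RLS',
      function_schema => 'APP',
      policy_function => 'orders_predicate',
      statement_types => 'select,insert,update,delete',
      policy_type     => dbms_rls.dynamic   -- was dbms_rls.shared_context_sensitive
   );
end;
/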

Depending on how your application has been written and how many queries are likely to invoke security policies this could easily increase your CPU usage by several percent (and if it’s a badly written application maybe a lot more).

Footnote:

It has occurred to me to wonder what happens if you use a (normal) pl/sql function in a select list, and the function executes a statement against a table, and the table is protected by a context_sensitive security policy – and you decide to use the pl/sql result cache on the function. How long is an item supposed to stay in the result cache, and if it’s longer than a single execution of a statement, will the result cache be invalidated if your context changes in a way that invalidates the current security predicate ? No time to check or test at present, but I’d be very cautious about putting RLS predicate functions into the result cache until I’ve played around with that idea for a bit.


Automated Linux VM build on ESX

How do you automatically install a RedHat-like Linux system (Red Hat, CentOS, Oracle Enterprise Linux) on ESX or VirtualBox servers? There are at least two methods. I have seen VM cloning in a lot of places, using ESX (vSphere Center) or VirtualBox functionality. Cloning is fine, but it requires some intervention after the clone is finished (like a host rename or IP address change). And what if we want to install different releases of systems? Then it is required to have a clone of every single release, like RedHat 6.2, OEL 6.3 or OEL 6.4, and it requires rework every time a new release is available on the market.

But there is another method, developed years ago, well before the virtualization era. It is based on an installation answer file and the KickStart installation method. If I add a DHCP, TFTP and NFS server to this equation, I get a nice and quite configurable way to build my VMs very fast.

First of all, the DHCP server has to be configured. In my case I just created an internal network inside ESX and set up an additional DHCP server for build purposes only. You can use any DHCP server, but it has to support TFTP redirection and booting functionality. In my case I'm using the DHCP, TFTP and NFS servers provided by Ubuntu 13.04.

DHCP configuration

root@myown:~# cat /etc/dhcp/dhcpd.conf

ddns-update-style none;

# option definitions common to all supported networks...
option domain-name "priv.localdomain";

default-lease-time 600;
max-lease-time 7200;

option subnet-mask 255.255.255.0;
option broadcast-address 10.0.0.255;
option domain-name-servers 192.168.1.20;

subnet 10.0.0.0 netmask 255.255.255.0 {
range dynamic-bootp 10.0.0.1 10.0.0.100;
filename "pxelinux.0";
next-server 10.0.0.250;
}

pxelinux.0 has to be copied from its default location into the TFTP directory so the TFTP server can access it. You can find the pxelinux.0 file in the Ubuntu syslinux-common package.
Install it using "apt-get install syslinux-common" and then copy it from its default location into /tftpboot:

root@myown:~# cp /usr/lib/syslinux/pxelinux.0 /tftpboot/

TFTP configuration - default port and /tftpboot used as the file location:

pioro@myown:/etc$ cat /etc/xinetd.d/tftp
service tftp
{
protocol = udp
port = 69
bind = 10.0.0.250
socket_type = dgram
wait = yes
user = nobody
server = /usr/sbin/in.tftpd
server_args = /tftpboot
disable = no
}

TFTP directory structure:

pioro@myown:/tftpboot$ ls -l
total 37816
-rw-r--r-- 1 root root 34551609 Dec 22 17:04 initrd.img
-rw-r--r-- 1 root root 26461 Dec 22 17:26 pxelinux.0
drwxr-xr-x 2 root root 4096 Jan 13 22:02 pxelinux.cfg
-r--r--r-- 1 root root 441 Dec 22 17:04 TRANS.TBL
-rwxr-xr-x 1 root root 4128944 Dec 22 17:04 vmlinuz

The pxelinux.cfg directory inside the TFTP directory:

pioro@myown:/tftpboot/pxelinux.cfg$ ls -l
total 8
-rw-r--r-- 1 root root 137 Dec 22 18:29 01-00-0c-29-41-69-15
-rw-r--r-- 1 root root 138 Jan 13 21:55 01-00-0c-29-99-7d-3d

The file names are related to the NIC MAC addresses. For example, 01-00-0c-29-41-69-15 is the file for MAC address 00:0c:29:41:69:15.
Now take a look at what is inside a host configuration file:

pioro@myown:/tftpboot/pxelinux.cfg$ cat 01-00-0c-29-41-69-15
default Oracle Linuxas_64
label Oracle Linuxas_64
kernel vmlinuz
append initrd=initrd.img ks=nfs:10.0.0.250:/images/ks.cfg ksdevice=eth1

These are boot properties written in GRUB style, describing the kernel and initrd images. The ks parameter is the configuration parameter of the KickStart process. In the above example the KickStart configuration file is placed on an NFS server in the /images directory and it's called ks.cfg. In addition, KickStart will configure interface eth1, which is the private ESX network in my case.

My DHCP and TFTP server has an NFS server configured as well. It's exporting only one directory, /images, which keeps the KickStart configuration files and is also a mount point for ISO images.

pioro@myown:/tftpboot/pxelinux.cfg$ cat /etc/exports
/images/ *(ro,subtree_check,crossmnt)

The ISO with the Linux distribution should be mounted below the /images directory using the loop option:

root@myown:~# mount -o loop /nfs/disk2/images/OEL65.iso /images/OEL65/
mount: block device /nfs/disk2/images/OEL65.iso is write-protected, mounting read-only

Now I have access to the installation files and also to the PXE boot files. In my case they are all located in the directory /images/OEL65/images/pxeboot/ and I just copied them into the TFTP /tftpboot directory:

root@myown:~# ls -l /images/OEL65/images/pxeboot/
total 37775
-rw-r--r-- 2 root root 34551609 Nov 26 05:02 initrd.img
-r--r--r-- 1 root root 441 Nov 26 05:04 TRANS.TBL
-rwxr-xr-x 2 root root 4128944 Nov 26 05:02 vmlinuz

root@myown:~# cp /images/OEL65/images/pxeboot/* /tftpboot/

Inside the NFS exported directory I also keep the KickStart configuration files:

pioro@myown:/images$ ls -l
total 12
-rw-r--r-- 1 root root 1935 Jan 13 21:59 ks2.cfg
-rw-r--r-- 1 root root 1936 Jan 13 22:00 ks.cfg
drwxr-xr-x 2 root root 4096 Dec 22 18:27 OEL65

An example KickStart configuration file:

pioro@myown:/images$ cat dg1.cfg
#platform=x86, AMD64, or Intel EM64T
#version=DEVEL
# Firewall configuration
firewall --disabled
# Install OS instead of upgrade
install
# Use CDROM installation media
nfs --server 10.0.0.250 --dir /images/OEL65/
#cdrom
# Root password
rootpw --plaintext
# System authorization information
auth --useshadow --passalgo=sha512
# Use graphical install
graphical
firstboot --disable
# System keyboard
keyboard us
# System language
lang en_US
# SELinux configuration
selinux --disabled
# Installation logging level
logging --level=info
# Reboot after installation
reboot
# System timezone
timezone Europe/Dublin
# Network information
network --bootproto=static --device=eth0 --ip=192.168.1.51 --nameserver=192.168.1.20 --netmask=255.255.255.0 --onboot=on --hostname=dg1.localhost
network --bootproto=static --device=eth1 --ip=10.0.0.1 --netmask=255.255.255.0 --onboot=on
# network --bootproto=dhcp --device=eth2 --onboot=on
# System bootloader configuration
bootloader --location=mbr
# Clear the Master Boot Record
zerombr
# Partition clearing information
clearpart --all
# Disk partitioning information
part /boot --fstype ext4 --size=200
part pv.01 --size=1 --grow
volgroup VolGroup pv.01
logvol swap --fstype swap --vgname=VolGroup --size=1024 --name=lv_swap
logvol / --fstype ext4 --vgname=VolGroup --size=1 --grow --name=lv_root


%packages
@base
@console-internet
@core
@debugging
@directory-client
@hardware-monitoring
@large-systems
@network-file-system-client
@performance
@perl-runtime
@security-tools
@server-platform
@server-policy
@system-admin-tools
gcc

%end 

The above configuration file will partition the disk into root and swap partitions and configure two networks. In addition, the package groups specified after the %packages line will be installed.

Below are screenshots from my ESX environment:

Finding the MAC address of a VM - open the VM configuration and go to Network adapters.

Booting process - this VM is booting from NIC 2 using the private network and all the services configured above.

If you are looking for step-by-step instructions, you can find them on Tim Hall's website:
PXE Network installations
KickStart

I based my environment build on Tim's website and some Google research.
Happy installing!!!

regards,
Marcin