Oakies Blog Aggregator

Finding Oracle VM Manager Core Management Port After OVM Manager Installation

Recently I was working on the integration of OVM Manager with Oracle EM Cloud Control 12cR4 and I had to enter the "Oracle VM Manager URL", a.k.a. the "Core management port", which is reported at the end of the OVM Manager installation. The default value for this port is 54321. As I was not in a position to get this post-installation report, because the installation was performed by another company, I had to find out the right port myself. The value has to be entered in the form tcp://<hostname>:<port>.
EM Cloud Control and OVM Manager Integration Screen
OVM Manager Installation report sample

One of the first ideas was to check the local firewall configuration and see if there was anything for port 54321.

# iptables -L | grep 5432
ACCEPT     tcp  --  anywhere  anywhere   state NEW tcp dpt:54322

The search showed open port 54322, which could also have been one of the possibilities.
Searching with netstat revealed no activity on port 54322:

# netstat -vatn | grep 54322

However, a lot of rows were returned when searching for port 54321:

# netstat -vatn | grep 54321
tcp        0      0 ::ffff:127.0.0.1:54321      :::*                        LISTEN
tcp        0      0 ::ffff:127.0.0.1:36797      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:51828      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:53096      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:55461      ::ffff:127.0.0.1:54321      ESTABLISHED
tcp        0      0 ::ffff:127.0.0.1:53017      ::ffff:127.0.0.1:54321      ESTABLISHED

But I was still not 100% sure whether iptables showed the right port, so further digging through the documentation revealed it. The definition of the "Core management port" is hidden in the .config file, under the name COREPORT, in the OVM Manager home directory: /u01/app/oracle/ovm-manager-3/.config

The contents of this file in my case were:

# cat  /u01/app/oracle/ovm-manager-3/.config

DBTYPE=MySQL
DBHOST=localhost
SID=ovs
LSNR=49500
OVSSCHEMA=ovs
APEX=8080
WLSADMIN=weblogic
OVSADMIN=admin
COREPORT=54321
UUID=0004fb000001000075e54bae74172d82
BUILDID=3.3.1.1065

The explanation of the configuration items can be found in chapter "5.1 Backing up and Restoring Oracle VM Manager", and obviously in my case the port used was actually the default one, 54321.
The .config file should be part of the backup.
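
If you just need this value for the EM integration screen, the lookup can be scripted. Here is a minimal sketch, assuming the default .config location shown above and that the URL should use the OVM Manager server's host name:

# Minimal sketch: pull COREPORT out of the OVM Manager .config file and
# build the tcp://<hostname>:<port> value for the EM integration screen.
# The file path is the default from above; the hostname handling is an assumption.
CONF=/u01/app/oracle/ovm-manager-3/.config
COREPORT=$(grep '^COREPORT=' "$CONF" | cut -d= -f2)
echo "tcp://$(hostname -f):${COREPORT}"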

I hope this post will shorten the search time for this kind of information for my readers.

10046.pl by Clive Bostock

#555555;">Note:

#555555; padding-left: 30px;">This is a reposting of an old blog post that was on dboptimizer.com but is no longer accessible

#555555; padding-left: 30px;">More trace file analyzer tools at  #2970a6;" href="http://ba6.us/node/177">http://ba6.us/node/177

#555555; padding-left: 30px;">Related blog post: Oracle “Physical I/O” ? not always physical with a 10046 parser specifically for I/O parsetrc.pl and readme


#555555;">Often when I have a 10046 trace file, especially when looking at I/O issues, I want a histogram of I/O response time.  To get I/O response time I’ve hacked out incomple awk scripts from time to time, always meaning to write a more complete one, well now I don’t have to. It’s already been done!

#555555;">Here is a cool perl script from Clive Bostock: #2970a6;" href="https://sites.google.com/site/oraclemonitor/10046-pl-readme">README.TXT    #2970a6;" href="https://sites.google.com/site/oraclemonitor/10046-pl-1">10046.pl

#555555;">(also checkout #2970a6;" href="http://www.oracledba.ru/orasrp/">orasrp which produces a more indepth report in HTML. I like both. I like 10046.pl as a short  easy portable script that I can modify, whereas orasrp is a binary and only works on some ports)

#555555;">For example, if I trace a session with 10046, and retrieve the tracefile, then I can run:

#555555;">$ 10046.pl -t  mytrace.trc

#555555;">and it will output  a header and three sections

#555555;">Header

    #555555;">
  • Summary of all events for tracefile
  • Events by object summary
  • Events by object histogram

#555555;">This looks like

#555555;">Header

#555555;">Trace file mytrace.trc
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /u02/oracle
System name:    SunOS
Node name:      toto
Release:        5.10
Version:        Generic_142900-12
Machine:        sun4u
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 177
Unix process pid: 16553, image: oracle@toto
Trace input file : mytrace.trc

#555555;">Wait summary

#555555;">EVENT AGGREGATES
================
Wait Event              Count Elapsed(ms)   Avg Ela (ms)  %Total
~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~~~ ~~~~~~~~~~
db file sequential read  2715      11593              4    3.74
       direct path read  4484       4506              1    1.45
 db file scattered read   141        898              6    0.29
          log file sync     3          8              2    0.00
                              ~~~~~~~~~~~~
             Total Elapsed:       309821

#555555;">Wait Summary by object

#555555;">Object Id  : Wait Event                  Count Tot Ela (ms) %Total Avg Ela (ms)
~~~~~~~~~~ : ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~ ~~~~~ ~~~~~~~~~~~~~
28581      : direct path read            4484         4506   1.45            1
1756433    : db file sequential read      725         1891   0.61            2
764699     : db file sequential read      332         1762   0.57            5
37840      : db file sequential read      200         1044   0.34            5
38018      : db file sequential read      108         1009   0.33            9
81596      : db file scattered read       140          887   0.29            6

#555555;">wait histogram by object

#555555;">EVENT HISTOGRAM BREAKDOWN
===========================
This section splits the event counts into elapsed time
buckets so that we can see if there are any suspiciousn
or anomalous response time / frequency patterns.
Object Id : Wait Event              <1ms <2ms <4ms <8ms <16ms <32ms <64ms <128ms <256ms <512ms >=1024ms
~~~~~~~~~ : ~~~~~~~~~~~~~~~~~~~~~~~ ~~~~ ~~~~ ~~~~ ~~~~ ~~~~~ ~~~~~ ~~~~~ ~~~~~~ ~~~~~~ ~~~~~~ ~~~~~~~~
28581     : direct path read        7680   87  148  221  144    40     4     0      0       0        0
1756433   : db file sequential read  606  268   45   35   66     6     2     0      0       0        0
764699    : db file sequential read   74  119   11   78   78     9     0     0      0       0        0
37840     : db file sequential read   50   72    6   45   47     5     0     0      0       0        0
38018     : db file sequential read   12   38    7   10   30    12     5     0      0       0        0
81596     : db file scattered read    64    4   13   62   18     8     3     0      0       0        0
41995     : db file sequential read   20   39    0    7   16     8     4     0      0       0        0
108718    : db file sequential read   74   54    5   12   24     4     0     0      0       0        0
33490     : db file sequential read    0    5   11   25   19     4     0     0      0       0        0
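
If you just want the flavour of that first summary without the full script, the same idea can be hacked out with a few lines of awk against the raw trace's WAIT lines. A rough sketch only, assuming the 10g+ trace format where ela= is reported in microseconds:

# Rough sketch: aggregate wait counts and total elapsed time per event from
# the raw 10046 trace. Assumes WAIT lines like
#   WAIT #140...: nam='db file sequential read' ela= 5447 file#=... tim=...
awk -F"'" '/^WAIT #/ {
    event = $2                                    # event name between the quotes
    if (match($3, /ela= *[0-9]+/)) {
        ela = substr($3, RSTART+4, RLENGTH-4) + 0 # elapsed time in microseconds
        count[event]++
        total[event] += ela
    }
}
END {
    for (e in count)
        printf "%-30s %8d %12.1f ms\n", e, count[e], total[e]/1000
}' mytrace.trc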

An Oracle Instance is Like An Upmarket Restaurant

I recently did an Introduction to Oracle presentation, describing how the oracle instance worked – technically, but from a very high level. In it I used the analogy of a restaurant, which I was quite happy with. I am now looking at converting that talk into a set of short articles and it struck me that the restaurant analogy is rather good!

Here is a slide from the talk:

Simple partial overview of an Oracle Instance

Simple partial overview of an Oracle Instance

As a user of the oracle instance, you are the little, red blob at the bottom left. You (well, your process, be it SQL*Plus, SQL*Developer, a Java app or whatever) do nothing to the database directly. It is all done for you by the Oracle Server Process – and this is your waiter.

Now, the waiter may wait on many tables (Multi-threaded server) but this is a very posh restaurant, you get your own waiter.

You ask the waiter for food and the waiter goes off and asks the restaurant to provide it. There are many people working in the restaurant, most of them doing specific jobs, and they go off and do whatever they do. You, the customer, have no idea who they are or what they do and you don’t really care. You don’t see most of them. You just wait for your food (your SQL results) to turn up. And this is exactly how an Oracle Instance works. Lots of specific processes carry out their own tasks but they are coordinated and they do the job without most of us having much of an idea what each bit does. Finally, some of the food is ready and the waiter delivers the starter to you – The server process brings you the first rows of data.

Let’s expand the analogy a bit, see how far we can take it.

When you arrive at the restaurant, the Maître d’ greets you and allocates you to your waiter. This is like the Listener process waiting for connection requests and allocating you a server process. The Listener Process listens on a particular port, which is the front door to the restaurant. When you log onto an oracle database your session is created, ie your table is laid. If someone has only just logged off the database their session might get partially cleared and re-used for you (you can see this as the SID may well get re-used), as creating a session is a large task for the database. If someone had just left the restaurant that table may have a quick brush down and the cutlery refreshed, but the table cloth, candle and silly flower in a vase stay. Completely stripping a table and re-laying it takes more time.

The restaurant occupies a part of the building, just as the database occupies part of the server. Other things go on in the server; the restaurant is in a hotel.

The PMON process is the restaurant manager or Head of House maybe and SMON is the kitchen manager, keeping an eye on the processes/staff and parts of the restaurant they are responsible for. To be candid, I don’t really know what PMON and SMON do in detail and I have no real idea how you run a large kitchen.

There are lots of other processes; these are equivalent to the Sous-chef, Saucier, Commis chef, Plongeur (washes up; the ARC processes maybe?), Rôtisseur, Poissonnier, Pâtissier etc. They just do stuff; let’s not worry about the details, we just know there are lots of them making it all happen and we, the customer or end user, never see them.

The PGA is the table area in the restaurant, where all the dishes are arranged and provided to each customer? That does not quite work as the waiter does not sit at our table and feed us.

The SGA is the kitchen, where the ingredients are gathered together and converted into the dishes – the data blocks are gathered in the block buffer cache and processed. The Block Buffer Cache are the tables and kitchen surfaces, where all the ingredients sit. The Library cache is, yes, the recipes. They keep getting re-used as our kitchen only does certain recipes, it’s a database with a set of standard queries. It’s when some fool orders off-menu that it all goes to pot.

Food is kept in the larder and fridges – the tablespaces on disc. You do not prepare the dishes in the larder or fridge, let alone eat food out of them (well, some of the oracle processes might nick the odd piece of cooked chicken or chocolate). Everything is brought into the kitchen {the SGA} and processed there, on the kitchen tables.

The orders for food are the requests for change – the redo deltas. Nothing is considered ordered until it is on that board in the kitchen, that is the vital information. All the orders are preserved (so you know what was ordered, you can do the accounts and you can re-stock). The archived redo. You don’t have to keep this information but if you don’t, it’s a lot harder to run the restaurant and you can’t find out what was ordered last night.

The SCN is the clock on the wall and all orders get the time they were placed on them, so people get their food prepared in order.

When you alter the ingredients, eg grate some of the Parmesan cheese into a sauce, the rest of the cheese (which, being an ingredient, is in the SGA) is not put back into the fridge immediately, ie put back into storage. It will probably be used again soon. That’ll push it up the LRU list. Eventually someone will put it back, probably the Garçon de cuisine (the kitchen boy). A big restaurant will have more than one Garçon de cuisine, all with DBW1 to x written on the back of their whites, and they take the ingredients back to the larder or fridge when they get around to it – or are ordered to do so by one of the chefs.

Can we pull in the idea of RAC? I think we can. We can think of it as a large hotel complex which will have several restaurants, or at least places to eat. They have their own kitchens but the food is all stored in the central store rooms of the hotel complex. I can’t think what would be an analogy for block pinging, as only a badly designed or badly run restaurant would, for example, have only one block of Parmesan cheese – oh, maybe it IS a lot like some of the RAC implementations I have seen :-)

What is the Sommelier (wine waiter) in all of this? Suggestions on a post card please.

Does anyone have any enhancements to my analogy?

Enterprise Manager and Services, Part I

The term “Service” is used to represent numerous objects within Enterprise Manager 12c.  Although the most common reference to the term is related to the Oracle Management Service (OMS), in this blog post we are going to drill down and refer to all services identified in the drop-down menu from Targets, Management Services and Repository.

The two main services we are going to cover here are set up as part of the Oracle Enterprise Manager 12.1.0.4 installation and shouldn’t be edited or reconfigured, but I think it’s good to know where everything is if you should ever need to investigate it or are working with Oracle Support.

By default, the following services will display in the EM12c environment:

  • EM Console Service
  • EM Job Service

These two services are essential to successful Enterprise Manager task completion.  The EM Console Service is the service that keeps the console active and accessible and the EM Job Service controls not just the EM Jobs viewable in the EM Job Activity console, but also a ton of background processing for the Enterprise Manager.

The Main Console View

Once you’ve accessed the services from the target menu, as these are both monitored targets in the Enterprise Manager, you’ll see the following information:

services1

 

Along with the two default services we expect to see for a standard Enterprise Manager installation, we can also see the status, whether there are any open incidents and what system the services belong to.  You can create and remove services from here, but unless you know what you’re doing, this is mostly the view to quickly filter service information.

Creating services from this console offers you three options: two generic service types and one advanced service option called an Aggregate service.

services3

Using our existing services as examples may give you a bit of an idea of how a System-based service differs from a Test-based one.

A System-based service monitors one single system, whereas a Test-based one monitors one or more tests.  Notice that the EM Jobs Service does ONE thing: it monitors the Enterprise Manager Job Service.  The EM Console Service has to monitor EVERYTHING that is involved in the console, which is much more complex and requires a constant test to ensure all aspects are available.

An Aggregate Service is a combination of sub-services.  If one of the many sub-services is experiencing an issue, the entire Aggregate service is dependent upon that one failure.

There are a number of other services that Enterprise Manager 12c supports:

  • Business Application
  • Service Entry Point
  • Oracle Database Service
  • Oracle Applications Service
  • Siebel Services
  • Beehive Services
  • EM Service
  • And others….

You can create, manage and remove services via EM CLI commands, too.  The following commands will support the management of each:

  • create_service
  • assign_test_to_target
  • change_service_system_assoc
  • remove_service_test_assoc
  • set_availability
  • set_properties
  • And many, many others….
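
If you want to see the exact syntax each of these verbs takes in your EM CLI release before scripting anything, the built-in help is the safest reference. A minimal sketch (the sysman login is just an example):

# Log in to EM CLI and display the built-in help for some of the service
# verbs listed above; parameter lists vary by release, so check help first.
emcli login -username=sysman
emcli help create_service
emcli help change_service_system_assoc
emcli help set_availability
emcli logout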

To the right of the console list of services is Key Components incident information.  This can assist the administrator when they are experiencing a problem but the culprit isn’t displayed in the service itself; a related target may be the issue:

services2

 

The Key Tests also show current status and any monitoring beacons that are attached to the service.

EM Console Service

If we click on the EM Console Service, we are taken to the home page for the service.

service4

Centering in on the upper left section of the home page, you notice that there are a number of tabs:

  • Home
  • Charts
  • Test Performance
  • System
  • Monitoring Configuration
  • Topology

For any generic service you are going to see these tabs, but we’ll go through each of them (outside of the Home tab) for the EM Console Service so that you get a good feel for what is available in each and which tab controls the configuration of the service.

Charts

Upon clicking the charts menu, you’ll see the basic information about page response per millisecond and page hits per minute:

service5

Unless there is a serious performance issue in the console, I commonly don’t find a lot of useful data from the default view of “last 24 hrs” and recommend changing the view to [at least] the last 7 days.  You’ll be surprised how different the data will look when you take into account the nightly maintenance jobs that keep EM12c clean and effective:

service18

Test Performance

The next tab is for test performance.  All services have tests that check whether a service is actively running, and a service passes one or more tests that verify it is properly functioning and available.

service6

The value for the last EM Console Service Test is shown above in the graph, but the real information lies lower down, in the performance timestamp.  The test isn’t run very often (it has a performance impact on the system, so full testing of the service is only run once every 24 hours).  You can see the Collection Timestamp of 6:55 am in the right-hand window, which will also match the extended Web Transaction used to verify that the console is accessible if you click on the EM Console Service Test at the bottom.

System

The System tab displays the latest information about the status and incidents of the components connected to the service.

service7

As you can see from the screen shot above, a recent test of a non-released patch has caused some grief, (this is why we test these things… :)) and there are a few application components that were impacted and need to be reviewed and addressed.  Each of the components is linked on this page, so they can be easily accessed and the incident investigated.  As this is the EM Console Service, there are a number of related components (16 total, as shown in the right-hand totals) and you can change to the next page to display the rest of the components involved.

Monitoring Configuration

The Monitoring Configuration tab requires a section to itself, as it has some pretty impressive links.  We are going to go through each one of these, so you can get a solid understanding of what is available:

service8

System Association

The first thing you need to remember: DON’T CLICK ON THE “REMOVE SYSTEM” BUTTON.  Although I’ve never seen anyone attempt this from the console, I can’t imagine the damage that would be done if you chose to remove a key component from the EM Console Service. Unless instructed by Oracle Support, just don’t attempt it to see what happens… :)

service10

As discussed above, each component associated to the EM Console Service is displayed, along with the designation as a key component with a check mark.

Root Cause Analysis Configuration

By setting the analysis mode to manual, less stress is put on the system resources to collect root cause data for every issue, letting you control when the analysis is performed.  The recommendation is to leave this set to manual mode for analysis collections and only change it with the blessing of Oracle Support.

service11

If any root cause analysis has been performed, you would view it to the right of the list of components and types, as the bottom section is reporting info only, listing each component and any test counts.

Service Tests and Beacons

This is how Cloud Control knows that the console service and key components are up and available.  A set of tests is run at a regular interval in the form of a web transaction.  This means that the test logs in, verifies a result from the Cloud Control URL, and verifies each key component is responsive.  If all tests return a positive result, then the status is ‘up’.

service12

A beacon is a target used to monitor service tests, primarily to measure performance of the service. At least one beacon belongs to each service test and is used to check connectivity between the client and the OMS host, or the OMS host and the OMR host, etc.  Adding multiple beacons for one service adds no value, so if for some reason you see multiple beacons assigned to a service, there’s a problem right there.

Availability Definition

This console looks very similar to the one above, but it covers test information about the EM Console Service Tests and EM Management Beacon:

service13

By default, both of these require that ALL KEY SERVICE TESTS be SUCCESSFUL for the service to be marked successful and return a status of ‘up’.  As you can see, the EM Management Beacon has not done so, which results in an unavailable status.  The next step in troubleshooting this would be to click on the Beacon name link and investigate what didn’t return successfully.

Performance Metrics

Performance metric data is where you set your thresholds for EM Console Service response, (and the readers say “Aha!  That’s another location I can get to the valuable metric threshold setting!”)

service14

The metric value settings for the threshold are in milliseconds. If you are receiving notifications that the response time is beyond your threshold, look at the network connection, the network connection between data centers, etc., and if it’s something you can’t correct, bump up the warning and critical threshold values so they provide more value.

Usage Metrics

This setting isn’t configured by default, but it’s one area you can configure if you would like.  If you are concerned about page performance and want to report on this metric data, set it up with a logical value to start.  For a small web server, 12,000 page hits per minute would be a pretty good warning value, with 20,000 page hits per minute as critical.  Use this as a beginning base, test, rinse and repeat.

service15

Now that we’ve completed a high-level review of the Monitoring Configuration, we can return to the last tab, Topology.

Topology

This last tab is for those of you that love topology charts.  This is the ultimate in configuration topology, allowing you to interact with all data, component dependency and connections to the EM Console Service.

service9

 

This is also where you come to realize how dependent everything is on your EM Console Service: no website, no Enterprise Manager access.  By hovering the cursor over the service, you can quickly see the dependent HA servers, database instance, management agent and other essential hardware and software components that are involved with the EM12c hardware and framework.  To the right are filtering options to drill down into a more manageable view:

service19

High-level information about each of the targets you want to drill down on is displayed, and you can use this section to create a much easier-to-read topology:

service20

 

The topology makes even more sense if you can view all the information on each of the targets vs. me hiding host/database names… :)

Hopefully this review of services helps explain a little bit more about this very, very important key component of Enterprise Manager 12c.  In the next post on Services, I’ll go over the all-important EM Job Service.

 



Copyright © DBA Kevlar [Enterprise Manager and Services, Part I], All Right Reserved. 2015.

Put Delphix on your laptop at Oracle Jan 28 !

 

#CloneAttack

Create an army of clone databases and applications in minutes

cloneattack

Tomorrow, Jan 28, we will be installing Delphix on people’s laptops at the BIWA conference at the Oracle conference center at Oracle headquarters in Redwood Shores.

Prerequisites

    #767676;">
  • Laptop, either
    • Mac: VMware Fusion or VirtualBox
    • Linux: VMware Workstation or VirtualBox
    • Windows: VMware Workstation or VirtualBox
  • at least 8 GB RAM
  • at least 50 GB free disk space, but preferably 100 GB free
  • at least 2 Ghz CPU, preferably dual-core or better

We’ll provide a USB stick with 3 virtual machine OVA files. Just start up the VMs and in a few minutes you will be thin cloning Oracle databases, Postgres databases and web applications.

Example of the installation

Example of provisioning a database with web application using #CloneAttack

Using Hyphen In OS Username Or ORACLE_HOME Path in Oracle 12c Database Installation Will Make You Cry

Recently I was involved in an upgrade project to Oracle 12c (12.1.0.2) on Linux platforms, and the customer wanted to name the owners of the Oracle SW distribution according to their company standard. So instead of using the "grid" and "oracle" OS account names, they wanted to have names which would start with "svc-" and continue with a unique, non-Oracle-standard name for either the grid infrastructure or the database software. Besides using these names as owners of the grid and database software, we also used them in the names of the GRID_HOME and ORACLE_HOME directories, according to the Oracle OFA suggestion.

So there were no problems with those names during installation and database creation. However, when we wanted to patch the grid infrastructure and Oracle home with the latest patches, the OPatch utility returned an error saying that it couldn't parse the path to the Oracle grid infrastructure. I immediately recognized the problem, and the only remedy was to uninstall both the grid and database SW. And then bang, another surprise: the Oracle 12c deinstallation procedures crashed as well. This time the reason was the check of the ownership of the distribution (grid or oracle).

The regular expression used to find the owner of the distribution in the script is /.*?\((\w+)\).*/
The string returned by the "id" command is:

uid=211(svc-xxxyyy) gid=201(oinstall) groups=201(oinstall), 200(dba), 205(asmadmin), 206(asmdba), 207(asmoper), 110000513(domain users) 

The returned variable $1 gets the value "oinstall" instead of "svc-xxxyyy", and this causes the deinstallation script to fail.  The right expression would be /.*?\((\w+\-\w+)\).*/
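
You can see the difference by feeding a sample "id" string through both expressions with a couple of Perl one-liners (a quick sketch; the account name is the dummy value from above):

# Sample "id" output for a hyphenated service account (dummy name from above)
ID='uid=211(svc-xxxyyy) gid=201(oinstall) groups=201(oinstall),200(dba)'

# Original expression: the hyphen breaks \w+, so the first group the regex
# can complete is "(oinstall)"
echo "$ID" | perl -ne 'print "$1\n" if /.*?\((\w+)\).*/'        # prints: oinstall

# Corrected expression: allows a single hyphen inside the parentheses
echo "$ID" | perl -ne 'print "$1\n" if /.*?\((\w+\-\w+)\).*/'   # prints: svc-xxxyyy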

So how can one manually deinstall the database home and the grid infrastructure home? For ORACLE_HOME deinstallation one can follow Doc ID 1624580.1 and run:

cd $ORACLE_HOME/oui/bin
./runInstaller -silent -detachHome ORACLE_HOME="/u01/app/svc-xxxyyy/product/12.1.0/dbhome_1" ORACLE_HOME_NAME="OraDB12Home1"


This operation succeeded without any problems. Deinstalling GRID_HOME was slightly more complicated, and at the moment of writing this post there is actually no document on MOS relevant to manual deinstallation of a grid infrastructure with Oracle Restart configured. The suggestion from Oracle Support was to follow "How to Deinstall Oracle Clusterware Home Manually (Doc ID 1364419.1)", which is relevant for 11gR2 but not 12c. In the end I performed the deinstallation in two steps:

# roothas.sh -deconfig -force
$ runInstaller -detachHome -silent -local ORACLE_HOME=$ORACLE_HOME


The $ORACLE_HOME in this case was actually pointing to GRID_HOME.

Why did I say in the title that this will make you cry? Because you have to start all over again, and the Oracle Installer doesn't warn you about the problematic hyphen during the installation process when all the prerequisite checks are run.

So, after working with Oracle products for so many years, I was really angry with myself for being caught out by such a stupid mistake.


Succeeding with Test Data Management

 

 

Screen Shot 2015-01-26 at 6.00.46 PM
Test data management is difficult, time consuming and expensive, which leads to test data management being implemented incorrectly and to significant losses in revenue, from high QA costs to bugs in production. Fortunately there is a solution that alleviates the majority of the huge resource, time and planning requirements of conventional test data management, and that new technology is called data virtualization.

Dependable QA testing requires that code be tested on data that represents the data to be encountered on the final production system. The data has to respect the business rules, and data has to correlate between one business rule and another. For example, a customer order record has to correlate to existing order items and an existing customer. The records also have to cover the date ranges being searched. If, for example, the test code searches date ranges outside of the test data, then the test code won’t even touch any data. It is difficult, nearly impossible, to create from scratch a full set of data that represents all the possible combinations of data that will be encountered in production.

Test data management is an enormous time drain for application development.  According to Infosys, “up to 60 percent* of application development and testing time is devoted to data-related tasks, making it cumbersome and time consuming” and in “many cases, system functionalities are not adequately tested due to required test data not being available.” (http://www.infosys.com/IT-services/independent-validation-testing-services/white-papers/Documents/test-data-management-strategy.pdf)

If test data management is done incorrectly, then the lack of proper data in testing leads to bugs making their way into production, with a significant impact on the bottom line.

The absence of proper test data causes nearly one-third of the incidents we see in QA/nonproduction environments and is a major reason why nearly two-thirds of business applications reach production without being properly tested. The resulting application failures cause significant amounts of downtime, with an average price tag of $100,000 per hour for mission-critical applications, according to industry estimates. – Cognizant (http://www.cognizant.com/InsightsWhitepapers/Transforming-Test-Data-Management-for-Increased-Business-Value.pdf)

#444444;">
#444444;">After talking to hundreds of companies about their test data management we have found that QA systems typically use data that  is partial, representing a subset of data in production, or the data is synthetic and had been generated to simulate data in production. In both cases the data isn’t sufficient to cover all the data possibilities that are encountered in the production system(s). In order to address these missing case, the testing plan typically includes a QA cycle on a full copy of the production system near the end of the development cycle. At this point many bugs are found leading to the project either having to delay the project release or to release on time but with existing bugs.
#444444;">
#292f33;"> 
#444444;">The question arises “why isn’t the code tested on a full copy of production earlier in the development cycle?” If code testing was run on a full copy of production earlier in the code cycle then bugs would be found earlier, fixed earlier and fixed with less code rework.
#444444;">
#292f33;"> 
#444444;">The problem is that production systems usually run on larger complex databases and making copies of these databases is difficult, time consuming and resource intensive. Even when a copy is made, if the test cycles modify data as is typically the case where data is created, modified and deleted then the same test data will change from one QA run to another leading to diverging results. This leads to the need to refresh the data to the state it was before the QA cycle started.  With large complex data sets the refresh process can be deemed prohibitively expensive and time consuming. Thus the industry had come to think that testing on full size production data is not feasible.
#444444;">
#292f33;"> 
#444444;">In fact using production data is possible but it requires a technology called data virtualization. Data virtualization is a technology that allows almost instant cloning of data for almost no storage. There is a storage requirement for the original source data and even that requirement can be  eased through compression such that storage requirement is on the order of one third of the original source data. Once data virtualization is linked to the source data, i.e. there is an initial copy of the source data, then from there new clones can be made for almost no storage because the clones aren’t actual copies but point to the already existing data. The beauty is that these clones are full read and write copies because when these clones modify that data, these modifications are stored separately from the original data and are only visible to the clone that made the changes. These clones are called thin clones because they don’t initial take up additional storage.  The whole process is called data virtualization and when it comes to databases that use this technology they are called virtual databases. On top of the storage savings data virtualization technologies come with automated collection of changes on the source data creating a time flow of changes. The time flow means that there is never a full copy of the source taken again. Only changes are collected. Changes older than the time window, which is usually a couple of weeks, are purged out. Virtual database can be provisioned from any point in time down to the second from this time flow. Most data virtualization technologies come with automated provisioning of virtual databases making the provisioning of a up and running database on a target machine a matter of a few clicks of a mouse and a few minutes. Data virtualization also generally includes options for data masking  improving security coverage in the testing cycles.
#444444;">
#292f33;"> 

A few companies we have worked with have had test cycles that repeated over and over, and between tests of the code the database had to be refreshed. In one specific case the refresh took 8 hours for only 20 minutes of actual code testing. Going from this architecture to an architecture of virtual databases, we were able to use full, fresh copies of production data, catching bugs earlier, and reduce the refresh time down to a few minutes, drastically reducing the overhead of QA and increasing test coverage:

 

Screen Shot 2015-01-26 at 5.50.45 PM

 

 

5 ways to boost your career with Social Media

8562448300_7f375c9de9_o

Photo by Kevin Dooley

If you are in the Bay Area tomorrow, Jan 27, come see me, Yann Ropars and Yury Velikanov talk about how to leverage social media to boost your career. We will be talking at Oracle headquarters at the NoCOUG/BIWA conference at 2:30 pm.

Why use social media as an IT technician? Because

vibrant social networks are key to landing jobs, moving forward in your career, and securing personal happiness

- Richard Florida, professor at the Rotman School of Management at the University of Toronto

#000000;"> Success no longer comes from possessing knowledge; instead, you have to participate with others in creating a flow of knowledge. 

- Walter Isaacson, President and CEO, the Aspen Institute, and author of Einstein: His Life and Universe

#444444;">Networks matter – who do you turn to for help? Who can you trust? How do you get a new job? How do you know what products to buy? What are the latest discoveries? How do you know if 12c is ready for prod?

Networks of people can accomplish things an order of magnitude faster than a lone individual. Look at wikipedia or open source software.

My first big social network “ah ha” was 15 years ago, with a group of Oracle technicians that Mogens Nørgaard had created an email list for. With help from several different people on this list, such as Anjo Kolk, James Morle and Jonathan Lewis, I was able to put together a program in a week to read Oracle’s SGA directly with C instead of SQL. Really cool stuff! Without each of their input I might have spent a year instead of a weekend.

That was 15 years ago, and I’m more convinced than ever that social networks are hugely important.

My second big social network “ah ha” came in the world economic meltdown in 2009.  My company at the time wasn’t strong, but I was happy there because I was running my own project and was head down working on my project, ignoring the rest of the world. Then when the economy slipped I realized that my company was in a bad position as they cut salaries across the board, laid people off, cut my project team down to a handful and dumped 4 other software projects on my desk that I had no interest in. With almost no time to work on the project that I was passionate about, hardly any team left, more work than I could handle on other weak software packages and a lower salary, I was chomping at the bit to get out, but I realized I didn’t have anyone to turn to.  All my contacts were at Oracle, which I didn’t want to go back to, or at my current company. I realized I needed a strong LinkedIn network to leverage to find a new job. Ultimately it was the network that Mogens had set up that saved me and connected me to Delphix. Before this experience I had thought LinkedIn was just a place to put your resume online. I didn’t realize it was a strong social network that could actually help me get a job.

Nowadays, if I meet a DBA who is not on LinkedIn, I assume that the person is a 9am-5pm “lifer” at the company they are currently at. It leads me to think they are happy where they are, have no plans on ever leaving, and no ambition to leave.  My assumptions are probably often wrong. Maybe the person just doesn’t know the power of social networks, but there are so many people on social media who are easy to connect with, why bother with people who are hard to connect with?  In-person connections were the only way to connect 20 years ago. In-person, or local, connections are still important, which is why Silicon Valley is so powerful. Silicon Valley is so powerful because there are so many smart, motivated, passionate IT technicians who can easily hop from one job to another, meet at bars, meetups, hackathons, conferences etc. But nowadays online social connections are becoming more and more important. When communication is rapid, things happen faster.

Social media for me is

  • email groups, like Oracle-L
  • forums, like OTN
  • twitter
  • linkedin
  • facebook

 

5 ways to boost your career with social media:

  1. Find jobs
    • become more well known by your presence
    • create a network of people you can turn to and who can validate your skills
    • share your knowledge and generate
  2. Find employees
    • someone sends you a resume? See who they are connected to. Are they connected to a strong network?
  3. Solve technical problems
    • have an error, ask if others have seen it
  4. Create projects together
    • github – get input and help on scripts and projects
  5. Get feedback on technology
    • what’s new
    • what’s useful
    • what can you ignore
    • what problems should you watch out for

 

How do you get started?

Hop on Twitter. Find someone important who only follows a few people. For example, in the Oracle world, go to Tom Kyte‘s Twitter profile.  Tom is only following 22 people, and you can bet those 22 people are probably pretty good people to follow, so go through his list, pick out most or all, and follow them. Then you will be off to a good start in creating a nice Oracle Twitter feed.

Retweet other people’s tweets. It will get you noticed and help you build a following.

Twitter can be hard to sort the wheat from the chaff so I use a couple of tools.

Prismatic will go through tweets and pull just the tweets with links to articles and then serve up the articles in a news reader. I love it.

Tweetdeck will allow me to track words or hashtags. For example I can monitor all tweets with my company name, Delphix, or during a conference like Oracle Open World I can track all the tweets with #OOW and #OOW15.

Start a blog if you can. I often use my blog as just a way to document stuff I’ll forget and will want to find later.

Comment on other people’s blogs.

Get on technical forums and participate. My favorite Oracle forum is the Oracle-L list.

If you write scripts to help you, share them on github and help other people with their scripts on github.

On linkedin, connect with everyone you can. If your company is small enough, just connect with everyone at your company. Connect with friends, colleagues, customers, everyone you can.

 

5440123633_4c85b23958_z

photo by  Daniel Iversen

12869913094_96695c7918_z

photo by neliO

 

Find it. Fix it. Real-World SQL Tuning Cases - Jan. 13 Webinar Followup

If you missed my January 13th webinar entitled "Find it. Fix it. Real-World SQL Tuning Cases" you can now access the recording and download the presentation file using the following links.

Presentation PDF
Webinar recording

Abstract
There are many ways to find SQL that is performing poorly. The hard part is what to do with a bad SQL statement once you have it. In this session, several real-world examples will be reviewed to help you learn how to evaluate poorly performing SQL. Each example will demonstrate a commonly occurring SQL performance problem and provide a method to solve it.

Key points:
1. Review ways to identify poorly performing SQL.
2. Learn a simple method for evaluating a problem SQL statement.
3. Identify key performance inhibitors in SQL execution plans.
4. Determine the best and simplest solution.

In-memory DB

A recent thread on the OTN database forum supplied some code that seemed to show that In-memory DB made no difference to performance when compared with the traditional row-store mechanism and asked why not.  (It looked as if the answer was that almost all the time for the tests was spent returning the 3M row result set to the SQL*Plus client 15 rows at a time.)

The responses on the thread led to the question:  Why would the in-memory (column-store) database be faster than simply having the (row-store) data fully cached in the buffer cache ?

Maria Colgan has addressed this question in part 3 of her series on In-Memory Database (see catalogue below), but I thought I’d repeat the basic ideas with a few imaginative numbers thrown in to give a better flavour of what’s going on. So imagine you have a table sized at 100GB, with 100 columns of data where every column holds data of a similar size and pattern, and you want to execute a query of the form: select {list of columns} from big_table where colX >= {some constant}.

Traditional Tablescan (approximation) with table fully cached

For each block of the 100GB, Oracle has to acquire the “cache buffers chains” latch, pin the block, drop the latch, and scan the block testing each row, then acquire the latch, unpin the block, and drop the latch.  Scanning the block requires a walk through the row directory and, for each row pointer, jumping to the correct location in the block for the row, stepping along the row one column at a time to get to the correct column, and then checking the column value. If the column matches the predicate, extract, format and return the required columns from that row.

It’s a lot of memory to scan, in a large number of small steps, involving a lot of latching and pinning – which translates into a lot of CPU. On the plus side, although it’s very expensive to identify the required rows, it’s very cheap to construct and return a row once you’ve identified it.

In-memory scan (approximation)

  1. Given the way I’ve described the table (100GB, 100 similar columns), Oracle can recreate it in memory as 100 lists of 1GB each; so we can identify the rows we want by scanning one of those lists and applying the predicate – so only 1GB of (fairly contiguous) memory to scan, rather than 100GB, and virtually no latching and pinning to find that memory, and no jumping around following pointers and counting along rows.
  2. But it’s probably NOT 1GB of memory to scan, because Oracle has some compression/deduplication methods that it can apply to the data that often reduce the memory requirement by a factor of four or five – so perhaps it’s only 250 MB of memory to scan.
  3. But Oracle breaks large lists into chunks, so rather than 250MB of contiguous memory, it’s likely to be 250 chunks of 1MB; and as part of the description of each chunk Oracle records the lowest and highest value in the chunk; [ed:  Christian Antognini says that the metadata lists every distinct value for the chunk] so it can check the predicate against the boundary values on the chunk and be in a position to discard entire chunks without looking at their content. So, depending on the actual content and pattern of the data, we may examine only a handful of chunks in detail, dropping the scan from 250MB to, perhaps, 10MB.
  4. And we still haven’t finished there, because the code that handles the column-store is able to take advantage of the SIMD (Single Instruction,  Multiple Data) operations in the CPU to check the predicate against 4, or possibly even 8, values simultaneously at a speed matching a single column comparison that has to be used for the traditional cached row-store. (Given the way that Oracle  (probably) handles the compression, I suspect that this final advantage is only significant for range-based predicates – but that’s purely conjectural).

So the huge benefit you get from In-Memory column store, compared to “fully cached row-store”  is that you are likely to scan far less memory to identify the rows that match your predicate, and do it with far less “infrastructure” activity like latching and pinning. The potential saving in CPU usage is huge.

There is, of course, a penalty to pay. As you identify the rows of interest you can (in effect) construct a bitmap representing the position of those rows in the table (and if you have predicates on more than 1 column you can use bitmap operations on the individual column bitmaps to identify the rows you want in the final result), but then you have to construct the rows that go into the result set. If your query is interested in just 5 columns that means using the bitmap to locate the correct entry from each of 5 separate column lists; if your query is interested in 99 columns that means extracting the correct entry from each of 99 separate column lists. Identifying the rows you want can be very quick; building the final result may be relatively slow.

Soundbite summary

  • Using the In-memory Database, you can identify the rows you want very quickly but it’s relatively slow to reconstruct them.
  • Using a fully cached traditional row-store, it’s relatively slow to identify the rows you want, but once you’ve found them you spend no time reconstructing them.

Bear in mind that this is an extremely simplified analysis and ignores all sorts of details about read-consistency, the probability of physical reads, the probability of scanning blocks instead of scanning chunks, and so on; my focus is only on the type of activity that differentiates row-store handling from column-store handling when all the data is in memory so that you can have some appreciation of why the benefits available from In-memory DB can vary with the pattern of the data and the way you use it.

Catalogue of blog posts by Maria Colgan and Andy Rivenes:

Other articles on In-memory DB: