
Oakies Blog Aggregator

Automated Root Cause Analysis

I’ve run into multiple products that claim to offer automated root cause analysis, so don’t think that I’m ranting against a specific product or vendor. My problem is with the concept itself.

The problem these products are trying to solve: IT staff spend much of their time troubleshooting issues, essentially finding the causes of effects they don’t like. What is causing high response times on this report? What is causing the low disk throughput?

If we can somehow automate the task of finding a cause for a problem, we’ll have a much more efficient IT department.

The idea that troubleshooting can be automated is rather seductive. I’d love to have a “What is causing this issue” button. My problem is with the way those vendors go about solving this issue.

Most of them use variations of the same technique:
All these vendors already have monitoring software, so they usually know when there is a problem. They also know about many other things that happen at the same time. So if their software detects that response times go up, it can look at disk throughput, DB CPU, swap, load average, number of connections, and so on.
When it sees that CPU goes up together with response times: tada! Root cause found!

First problem with this approach: You can’t look at correlation and declare that you found a cause. Enough said.

Second problem: if you collect this much data (and these systems often have millions of measurements), you will find many correlations by pure chance, in addition to some correlations that do indicate a common issue.
What these vendors do is ignore all the false findings and present the real problems found, at a conference, as proof that their method works. And you can’t reduce the rate of false findings without reducing the rate of finding real issues as well.
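A quick back-of-the-envelope sketch of that second problem (my illustration, not any vendor’s actual code): feed a correlation miner nothing but random noise and it still flags “strong” correlations, simply because there are so many metric pairs to compare.

```python
import math
import random

random.seed(42)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# 200 hypothetical "metrics" (response time, CPU, swap, ...), each just
# 30 samples of pure noise -- by construction, nothing causes anything.
metrics = [[random.gauss(0, 1) for _ in range(30)] for _ in range(200)]

# Count metric pairs that correlate "strongly" (|r| > 0.5) -- the kind
# of pair a naive tool would flag as cause and effect.
spurious = sum(
    1
    for i in range(len(metrics))
    for j in range(i + 1, len(metrics))
    if abs(pearson(metrics[i], metrics[j])) > 0.5
)
print(spurious)  # a non-trivial number of "root causes", all meaningless
```

With 200 metrics there are 19,900 pairs to test, so even a small per-pair false-positive rate produces a steady stream of false alarms, and tightening the threshold to suppress them also suppresses the real findings.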

Note that I’m not talking about tools like Tanel Poder’s visualization tool, tools that make it easier for the DBA to look at large amounts of data and use our brain’s built-in pattern matcher to find correlations. I support any tool that helps me apply my knowledge to large sets of data at once.

I have a problem with tools that use statistical correlation as a replacement for applying knowledge. It can’t be done.

Here’s the kind of tool I’d like to see:
Suppose your monitoring tool gave you the ability to visually browse, filter and explore all the data you collect in ways that help you troubleshoot. The tool would remember the things you looked at and the steps you took. After you solve the problem, you could upload the problem description and your debugging process to a website. You could even mark the dead ends of the investigation.

Now you can go to that website and see that for problem X, 90% of the DBAs started by looking at v$sesstat and 10% ran trace. Maybe you can even have a friend network, so you can see that in this case Fuad looked at disk utilization first while Iggy checked how much redo is written each hour.

If you are not into sharing, you can still browse your own past problems and solutions for ideas that might have slipped your mind.

I think that a troubleshooting tool combined with a “collective wisdom” site could assist experienced DBAs and improve the learning curve for junior DBAs, without pretending to automate away knowledge and experience.

Data Access APIs–Part 1: Fun with UPI

First, I’d like to apologize to our good friend SQLLIB.  Those of you who have been working with the Oracle Database for some time will notice that, while it too is a common data access library, I’ve omitted it from this series of posts. No, it’s not because of some personal vendetta against SQLLIB.  In […]

a formula for failure (or an expensive redesign)

If 'Premature optimization is the root of all evil.'
then 'Premature automation is the propagator of many evils.'
else 'Failure to optimize is the abyss.'
end;

Excited about NoCOUG Winter Conference

NoCOUG is hosting its winter conference next week, on February 11th.
As usual, we’ll have the best speakers and presentations ever. This time I’m extra happy because two of the speakers, Dr. Neil Gunther and Robyn Sands, are there because I was wowed by them at a previous conference and asked our Director of Conference Programming to invite them. And they agreed! I believe it is the first time either of them has presented at NoCOUG, and I’m very excited about it.

I’m sure I don’t need to introduce Robyn Sands to any Oracle professional. She’s an OakTable member who talks a lot about the right ways to manage performance. She is scientific and precise, yet gives practical, applicable advice.

Dr. Neil Gunther is a well-known performance expert, so well known that he has his own Wikipedia article. I first ran into his work when I did performance testing, something like six years ago. From his articles, I learned the importance of having performance models, without which you cannot interpret your results or know when your tests were faulty. I ran into him again when Tanel Poder mentioned that Dr. Gunther is now doing work relevant to Oracle professionals. He appeared at HotSos a few years back, and now we get to see him at NoCOUG, with both a keynote session and a technical session. He has invited the crowds to ask questions on his blog, so you can participate.

In addition to these two prestigious names, we have a few local celebrities giving presentations: Ahbaid Gaffoor, lead DBA at Amazon, will show his make-based deployment methodology. If you don’t have a deployment methodology, this presentation is a must-see. Maria Colgan will give a presentation about data loading for data warehouses. Although she’s an Oracle presenter, which sometimes means “marketing”, Maria is smart and knowledgeable, and if you are doing data warehouse work, she is worth listening to.

I’ll be presenting “What Every DBA Should Know About TCP/IP Networks”. The presentation is about network problems I’ve had to solve in the last year and how I solved them with some basic knowledge of networks, a packet sniffer and an envelope. If you ever wondered how to make your network admin take you seriously, how to get more bang from your bandwidth and whether or not you should care about your SDU, you should definitely show up.

I’m looking forward to meeting old and new friends at the conference. It’s going to be a blast.

DEVCON Luzon 2010

Just recently I became a member of the PSIA Tech Council. The company I’m working for is a member of PSIA, which makes up 90% of the country’s software sector and promotes the growth and global competitiveness of the Philippine software industry; it is also an active partner of the government and academe in implementing programs that benefit the industry.

The PSIA, PSIA Tech Council, together with the Awesome and Cool sponsors will be having the Luzon leg of DEVCON here in Manila!

Below are the details of this awesome event:

09 February 2010, 4-9pm, SMX Convention Center Function Room 1

Sync. Support. Succeed.

Get together to be connected, enhance skills and support each other to achieve success!

Designed to be a premier gathering of all Filipino software engineers, DEVCON facilitates collaboration, interaction and mentoring among leading practitioners of the Philippine software industry. DEVCON adapts global best practices for skills improvement and professional advancement among Filipino software engineers. It features three main elements with formats used successfully in international technology gatherings:

> Lightning Talks – a fast-paced presentation on any topic of interest
> Birds of a Feather – a dynamic discussion of opposing perspectives on mutual topics
> Hackathon – providing rapid learning of a new technology through hands-on demonstration or joint coding onsite

Register online for your FREE seat.

Oracle Performance Visualization…

Coskan Gundogar and Karl Arao have written two interesting articles about Oracle performance analysis and visualization, check these out!

Coskan’s article:
http://coskan.wordpress.com/2010/01/27/working-with-statspack-part-1a-di...

Karl’s article:
http://karlarao.wordpress.com/2010/01/31/workload-characterization-using...

Note that in March I will be releasing PerfSheet v3.0, which will have lots of improvements! ;-)
