Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Open Storage S7000 with Exadata… a good fit ETL/ELT operations.

I have worked on Exadata V2 performance projects with Kevin Closson for nearly a year now and have had the opportunity to evaluate several methods of loading data into a data warehouse. The most common, and by far the fastest method, involves the use of “External Tables”. External tables allow the user to define a table object made up of text files that live on a file system.   Using External Tables allows for standard SQL parallel query operations to be used to load data into permanent database tables.

SQL> alter session enable parallel dml ;
SQL> insert /*+ APPEND */ into mytable select * from ext_tab ;

With the size and power of Exadata, businesses are creating larger and larger data warehouses. There will often be dozens of machines that collect and stage data for ingest by the data warehouse.  So this means the staging area for these flat-files must be huge, real fast, and accessible from multiple networks.

What options are available for staging input files?

With Exadata V2, or any RAC environment, flat-file data has to be present on all nodes in order to fully utilize parallel query.  The natural first choice with Exadata V2 is to use DBFS.

DBFS comes with Exadata and allows for easy clustering across all Exadata database nodes.  The real data store for DBFS are database tables residing on a tablespace within the database machine.  The DBFS client program is then used to mount the DBFS filesystem such that they appear to the Linux user to be just another file system.  This allows for file system data to be managed just like any other database while using the full power of Exadata.  DBFS is quite fast and works well for housing external tables, but it does cut down on the storage available for the data warehouse.  Also, since DBFS is simply a client on an Exadata database node, it uses CPU resources on the database machine to initially transfer or create the flat files.

Open Storage S7000 a natural staging area for Exadata

If you want to extend the amount of storage to stage data for your warehouse, then the S7000 is an excellent choice.  The S7000 can stage files off traditional networks using 1gigE and 10gigE connections.  This allows for multiple machines to seamlessly connect to the S7000 in-order to stage data for ingest.  This activity has no effect on the Exadata users since the S7000 is a self contained storage server – unlike DBFS that uses CPU cycles from the database grid to manage and store the flat-file data.

Once the data is on the S7000, we can use IPoIB and connect directly into the high-bandwidth Infiniband network that is part of Exadata V2.  This allows the S7000 to be positioned neatly between Exadata and the traditional gigE networks.

what about performance?

As part of a larger project, I was able to run a quick test.  I had the following:

  • S7410 with 12 drives
  • 128 x 1GB files on a share
  • 8 db nodes active (x4170) with the share mounted on all nodes.

I created an external table across all the files and performed two tests:

  1. Select count(*).
    SQL> select count(*) from ext_tab;
  2. Insert as Select “IAS”
    SQL> alter session enable parallel dml ;
    SQL> insert /*+APPEND */ into mytable select * from ext_tab;

Both when querying and loading data with “IAS”, I was able to get 1.2GB/sec throughput as I saw with my earlier tests with S7000 and 10gigE.  That is over 4TB/hr with just one head node for the S7410.  With a clustered configuration and multiple mount points, the load rate could be increased even further.

summary

The Sun Open Storage S7410 server is an excellent choice for managing file system data.  With the ability to connect to multiple networks, it is a perfect fit to stage data for Exadata environments as well.

Filed under: Exadata, Oracle, Storage

Upcoming Index Seminar Dates (The Tourist)

I’ve just finalised some European dates for my Oracle Index Internals and Best Practices Seminar for later in the year. Location and dates are as follows: Turkey (Istanbul): 18-19 October 2010 Germany (Munich): 21-22 October 2010 Czech Republic (Prague) : 25-26 October 2010 As usual, always a good idea to book early to avoid disappointment. [...]

Advert: See you in Utrecht next week

I'm excited to be teaching at Amis for one day next week and the more informal event the previous evening. Registration form is here. Showing my beloved pictures with beer in hand? Sounds perfect although perhaps I should make sure the demos are working correctly before opening the bottle ;-)

Lots more technical posts to come in the near future too, I'm just getting over jet-lag on top of a week in Las Vegas!

Oracle Exadata Storage Server: No More SQL Tuning Required? Hooray!

I sometimes wonder if folks think I prefer poorly-tuned SQL. After all, I like hardware and everyone knows how effective it is to throw hardware at a poor query plan. I think everyone in the world has already blogged it but I can’t resist the me-too excitement. On June 10 my fellow Oak Table Network [...]

Evolution of a SQL Plan Baseline Based on a DELETE Statement

During an evolution the database engine compares the performance of two execution plans. The aim is to find out which one provides the better performance. For that purpose it has to run the SQL statement on which the SQL plan baseline is based and compare some execution statistics. The following output of the DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE function [...]

an unexpected visitor ...

I got up early so I could plant a few things in the garden before I settled down to some writing this afternoon. I was on my way out the door, returned to my room to get my socks and noticed something long and black on the floor near my bed. Then I realize the thing was alive and flicking it's tongue at me:Interesting start to a Sunday morning: a 3 foot long black snake on my bedroom floor.

Oracle Internals and X$ tables

I try to be active on the OTN Forums, particularly the Database – General forum. Very often, I’ll see people asking about Oracle internals and X$ tables and where they can learn more. The answer is generally, that you can’t. It’s not possible to read up on stuff that’s largely undocumented. Further, you shouldn’t really care that much. While internals can be interesting, they rarely add a great deal of real, practical value.

This recently came up again in an OTN forum posting. I replied to the thread. That answer is reproduced here:

In my view, you shouldn’t look at X$ tables, or Oracle internals in general, as a body of study. You need to understand database design. Then, you need to understand fundamental principles of database systems, such as ACID. Then, you learn about various database features, and how they are implemented and how they work in Oracle. And then try to understand how specific features will help you in an application design. For example, given a particular application system, and the expected usage, should a particular entity be implemented as a heap table, index organized table, single table hash cluster, etc?

Now, as you gain experience, and ask questions, you’ll eventually have a question about something that isn’t clearly documented. At that point, you may ask a question here on this forum, or on Oracle-L, or whatever. Often, people will come back and say “Why do you even care about that?” But occasionally, you’ll come up with a question about an aspect of Oracle internals that it would be legitimately good and/or useful for you to know. So, by this time, you’ll probably have gained a lot of experience with Oracle, and maybe you’ll have enough knowledge to design a test case, and answer the question yourself. Or maybe you’ll be able to troll through V$ (some of which aren’t very well documented) and understand what the underlying X$ tables mean.

My point is, Oracle internals knowledge is something that’s picked up along the journey. It’s not a destination. You’ll find bits and pieces scattered along the journey. The destinations are things like “How to implement a viable backup and recovery strategy using RMAN?” or, “What method should be used to do SQL statement tuning?”, etc, etc. “What do I need to learn about Oracle internals?” is not a destination. Part of the problem is that the subject of internals is vast, and it’s (by definition) not documented and is subject to change. So, studying internals for the sake of internals is a losing proposition. By the time you make a dent in learning even a portion of it, Oracle will put out a new release, and suddenly, 20% or 30% or 50% of what you learned about internals is different.

So, don’t worry about being an internals expert. You’ll pick up the useful bits and pieces along the way.

Enjoy the journey!

So, that’s my opinion on learning internals. Anyone else have thoughts or opinions on the subject? Leave a comment.

Do-It-Yourself Exadata-Level Performance? Really?

Oracle Mix is hosting “Suggest-a-Session” for OOW 2010. I thought that sounded like fun so I did the following…see what you think: Do-It-Yourself Exadata-level Performance? Since my blog has been essentially dormant for 2 months I expect about 43 click-throughs and no more than 42 votes And, yes, I will kick up my blogging again [...]

Quiz Night

A recent question on OTN gave the following information: I’m having problem with my database (Oracle 8.1.7.2.0 on Solaris 9), which contains more than 1 row with a same value on a field that has uniqueness constraint. Could you please help tell me how to fix this? Here is the log from sqlplus. When I [...]

Oracle Mix – OOW and Oracle Develop Suggest-A-Session 2010 Now Open

Just a short note to let everyone know the Oracle Mix – OOW and Oracle Develop Suggest-A-Session facility is now open for the submission and voting of proposals to this year’s Oracle OpenWorld and Oracle Develop conferences. This is a great opportunity to let the “public” not only submit proposals for presentations and panel sessions but also to vote for those sessions [...]