Search

Top 60 Oracle Blogs

Recent comments

data warehouse

Data Warehouse Design: Snowflake Dimensions and Lost Skew Trap

This post is part of a series that discusses some common issues in data warehouses. Originally written in 2018, but I never got round to publishing it.
While I was experimenting with the previous query I noticed that the cost of the execution plans didn't change as I changed the COUNTRY_ISO_CODE, yet the data volumes for different countries are very different.

#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; line-height: 95%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%; line-height:1">select c.country_name
, u.cust_state_province
, COUNT(*) num_sales
, SUM(s.amount_sold) total_amount_sold
from sales s

A Brief Look Inside Oracle's Autonomous Data Warehouse Cloud

This post is part of a series that discusses some common issues in data warehouses.
There is lots of documentation for Autonomous Data Warehouse Cloud (ADWC), in which I found this bold claim:

Data Warehouse Design: Engineered Systems Considerations

This post is part of a series that discusses some common issues in data warehouses.

On an engineered system, a key feature is that Bloom filters are pushed to storage cells during smart scan,  Additionally, a Bloom filter computed from a join column of a one table can be used against another table.  Storage index can skip I/O against the large fact table based on a Bloom filter calculated from a small dimension table (see Tanel Poder's Blog: Combining Bloom Filter Offloading and Storage Indexes on Exadata)
This shifts the balance away from Star Transformation, so you are far less likely to want to add bitmap indexes.

Data Warehouse Design: To Index, or Not to Index, that is the question

This post is part of a series that discusses some common issues in data warehouses.

When you query a star schema, you essentially have two choices;

  • bitmap index and star transformation 
  • full scan, bloom filter, and hash join

Star Transformation 

Star transformation was introduced in Oracle 8(see also Oracle Optimizer Blog: Optimizer Transformations: Star Transformation).   A star transformation requires:

How Not to Build A(n Autonomous) Data Warehouse

My day job involves investigating and resolving performance problems, so I get to see a lot of bad stuff.  Often, these problems have their roots in poor design.  It is not surprising. but is nonetheless disappointing, that when I point this out I am told that the system is either delivered this way by the vendor, or it has already been built and it is too late to change.
In the last couple of years, I have worked on several data warehouse applications that have provided the inspiration for a new presentation that I am giving at the DOAG and UKOUG conferences this year.
The presentation and this series of related blogs have several objectives:

Data Warehouse Design Mistakes 3: Date Dimensions Without Date Primary Keys

This post is part of a series that discusses some common issues in data warehouses.

Good Practice 

It is not uncommon to see a time dimension with one row for every day. This approach saves putting functions on the date column of the fact table.  For example, in the Oracle Sales History sample schema:

  • The primary key on the time dimension is a date.
  • There are 37 different attribute columns. This saves coding complex SQL functions to group dates.
#eeeeee; border: 0px solid #000000; font-family: courier new; font-size: 100%; line-height: 95%; overflow: auto; padding-left: 4px; padding-right: 4px; width: 95%;">Name                                      Null?    Type
----------------------------------------- -------- -----------

Data Warehouse Design Mistakes 1: Lack of Foreign Key Constraints

This post is part of a series that discusses some common issues in data warehouses.

Data Warehouse Design Mistakes 2: Effective Dating Dimensions

This post is part of a series that discusses some common issues in data warehouses.

I have seen situations where some dimensions are effective-dated.  That is to say that there are multiple rows in the dimension table for the same main dimension id, but for different date ranges.  At least one of the date columns has to become part of the primary key.
The dimensions in the Sales History sample schema have got effective from and to dates, but these columns are not part of the primary key.  Let's imagine that the products get updated every year and a new version of the product is sold, requiring a new dimension row.

How Partial Indexing helps you save space in #Oracle 12c

partialhttps://uhesse.files.wordpress.com/2016/12/partial1.png?w=600&h=580 600w, https://uhesse.files.wordpress.com/2016/12/partial1.png?w=150&h=145 150w" sizes="(max-width: 300px) 100vw, 300px" />

Over time certain partitions may become less popular. In 12c, you don’t have to index these partitions anymore! This can save huge amounts of space and is one of the best 12c New Features in my opinion. Really a big deal if you are working with range partitioned tables where the phenomenon of old ranges becoming unpopular is very common. Let’s have a look, first at the problem:

How to change RANGE- to INTERVAL-Partitioning in #Oracle

set_intervalhttps://uhesse.files.wordpress.com/2016/11/set_interval.png?w=578&h=600 578w, https://uhesse.files.wordpress.com/2016/11/set_interval.png?w=144&h=150 144w" sizes="(max-width: 289px) 100vw, 289px" />

An existing RANGE partitioned table can easily be changed to be INTERVAL partitioned with the SET INTERVAL command. My table has been created initially like this: