Search

Top 60 Oracle Blogs

Recent comments

Database space usage in layman’s terms

It seems easy enough. An email pops through from your manager and he’s concerned about the space utilization in the database. Then comes the question:

“So what exactly is using up most of the space?”

For the sake of simplicity, lets assume that you already know that most of the space comes from a single database schema, so you connect to that schema and throw a query against USER_SEGMENTS. And that is when the frustration starts..

USER_SEGMENTS is about segments

That statement might seem right up there in a the list of all-time most obvious statements, but it is important to understand that strictly speaking, it is not tables and indexes and the like that consume database space. They can be thought of as a logical construct. The only thing that consumes database space is a segment, and even that really is just a logical grouping of extents. The Database Concepts guide contains an thorough explanation of these terms, but in a nutshell, a database table could consist of no segment (ie, it does not yet contain any data), a single segment, or even multiple segments (for partitioned table).  The same goes for indexes. Thus if you query USER_TABLES or USER_INDEXES to find the largest space consumer you are not getting a true indicator of the space usage on the database.  You can only get that by looking at segments. But when you query USER_SEGMENTS, you might end up with a result like this:


SQL> select segment_name, bytes
  2  from user_segments
  3  order by bytes desc
  4  fetch first 1 row only;

SEGMENT_NAME                        BYTES
------------------------------ ----------
SYS_LOB0000089108C00006$$       268435456

It doesn’t take a lot of imagination to see how that will play out with your manager:

  • You: “Yes, I’ve taken a look at the database as you requested”
  • Manager: “So what is the the biggest consumer? Is it our SALES table? Our CUSTOMERS table?”
  • You: “It is SYS_LOB0000089108C00006$$”
  • Manager: “The….what?!?!?!”

Suitably chastened you head back to the database. Since this looks like a LOB segment, you dive into USER_LOBS to see what the table is for that particular LOB segment.


SQL> select table_name
  2  from   user_lobs
  3  where  segment_name = 'SYS_LOB0000089108C00006$$';

TABLE_NAME
------------------------------
DR$TEXT_COL_IDX$X

Now you can go back to the manager.

  • You: “Whoops…sorry about that. My mistake. That is a LOB segment – it is part of one our tables”
  • Manager: “No problems. So the biggest table is?”
  • You: “It is DR$TEXT_COL_IDX$X”
  • Manager: “The….what?!?!?!”

Segment Hierarchy

Whilst the database is quite happy to manage and report segments in this way, as humans (and particular managers Smile) we have an expectation that everything that consume space in the database, ultimately must roll up to the tables that we hold our true data in. The indexes, the LOBs, the LOB indexes, the tables automatically generated to support things like Text indexes, all really just contribute to our understanding of the space a table consumes.

With that in mind, I’ve put together a script which hopefully will assist. It will attribute space to a table in the database based on:

  • the data for the table itself,
  • any partitions for that table,
  • any indexes for that table,
  • any index partitions for that table,
  • any LOB segments,
  • any secondary tables,
  • any indexes on those secondary tables,
  • any LOB segments on those secondary tables

For example, the AskTOM schema contains tables, partitioned tables, text indexes, LOBs and the like. A straight query from USER_SEGMENTS looks like this:


SQL> select segment_name, bytes
  2  from user_segments
  3  order by bytes desc;

SEGMENT_NAME                        BYTES
------------------------------ ----------
ATE_QUESTION_REVIEWS            268435456
SYS_LOB0000089108C00006$$       151191552
DR$ATE_SEARCH_IDX$I             134217728
ATE_SUBMITTED_QUESTIONS         109051904
SYS_LOB0000089079C00013$$        67305472
BIN$VuDAwtUUQHSNPYBK64buUQ==$0   54525952
SYS_LOB0000089079C00014$$        50528256
SYS_LOB0000089108C00014$$        50528256
DR$ATE_SEARCH_IDX$X              30408704

but with my space distribution script, we get a much better idea of where the true space consumption lies


SEG                            SZ
------------------------------ ------------
ATE_QUESTION_REVIEWS               474M
ATE_SUBMITTED_QUESTIONS            388M
$RECYCLE_BIN$                       53M
ATE_ERROR_CAPTURE                   40M
ATE_QUESTION_VIEWS                  17M
ATE_QUESTION_REVIEW_LINKFIX         12M
ATE_PRESENTATIONS                    6M
ATE_SUBMITTED_QUESTION_LINKFIX       6M
ATE_SUPPORTING_FILES                 4M
SOLVER                               2M
ATE_TRIBUTE                          2M
ATE_QUESTION_STATUS_HIST             2M
ATE_ADMINS                           2M

The script is below and you can also get from my github miscellaneous scrips repos. Hope you find it useful



--
-- space distribution in a schema, rolling up things like
-- indexes, LOBs, secondary tables etc to the parent table
-- to get a better view of distribution
--
with 
  indexes_with_secondary as
      ( --
        -- normal indexes
        --
        select index_name, table_name
        from   user_indexes
        where  table_name not in ( select secondary_object_name from user_secondary_objects)
        union all
        --
        -- secondary tables
        --
        select uso.secondary_object_name, ui.table_name
        from   user_secondary_objects uso,
               user_tables ut,
               user_indexes ui
        where  ut.table_name = uso.secondary_object_name
        and    uso.index_name = ui.index_name
        union all
        --
        -- indexes on secondary tables
        --
        select ui.index_name, ui_parent.table_name
        from   user_indexes ui,
               user_secondary_objects uso,
               user_tables ut,
               user_indexes ui_parent
        where  ut.table_name = uso.secondary_object_name
        and    ut.table_name = ui.table_name
        and    uso.index_name = ui_parent.index_name
      ),  
  lobs_with_secondary as
      ( --
        -- normal lobs
        --
        select segment_name, table_name
        from   user_lobs
        where  table_name not in ( select secondary_object_name from user_secondary_objects)
        union all
        --
        -- secondary tables
        --
        select ul.segment_name, ut.table_name
        from   user_lobs ul,
               user_secondary_objects uso,
               user_tables ut,
               user_indexes ui
        where  ul.table_name = uso.secondary_object_name
        and    uso.index_name = ui.index_name
        and    ui.table_name = ut.table_name
      ),
  seg_space as
      (
        select 
          coalesce(
             i.table_name,
             l.table_name, 
            case when s.segment_name like 'BIN$%' then '$RECYCLE_BIN$' else s.segment_name end) seg, 
        sum(s.bytes) byt
        from user_segments s,
             indexes_with_secondary i,
             lobs_with_secondary l
        where s.segment_name = i.index_name(+)
        and   s.segment_name = l.segment_name(+)     
        group by coalesce(i.table_name,l.table_name, 
            case when s.segment_name like 'BIN$%' then '$RECYCLE_BIN$' else s.segment_name end)
      )  
select seg, 
        lpad(case 
           when byt > 1024*1024*1024 then round(byt/1024/1024/1024)||'G'
           when byt > 1024*1024 then round(byt/1024/1024)||'M'
           else round(byt/1024)||'K'
        end,8) sz
from  seg_space s
order by byt desc;