# Linear Decay

I’ve mentioned “linear decay” in several posts when explaining a problem that someone has seen with an execution path – but I’ve recently realised that I don’t have a post describing what it is and how it works – although it’s in Cost Based Oracle – Fundamentals, of course, if you want some detail – so here’s a brief introduction (based on simple stats with no histograms).

When you have a predicate that involves values outside the known high or low values, Oracle bases its estimate for equality predicates on “column = unknown value” and then scales this down by comparing the known range of values for that column with the distance outside the range that your supplied value falls. So, for example, if your low/high values for a column are 1 and 100 respectively with 100 distinct value and Oracle thinks that there are 10 rows per distinct value the cardinality estimate for the predicate “column = 180″ will be calculate (approximately) by the following algorithm:

• Known range 99 : (high – low) — call it 100 to keep the arithmetic simple
• Cardinality estimate for values in range 10 (as stated above)
• Distance outside range 80 : (180 – 100)
• Percentage distance outside range 80%, so scale cardinality down by 80%
• Cardinality estimate = 2

There are many other factors to handling values outside the range – one of the most awkward is that the estimate of cardinality for an out-of-range inequality predicate (e.g. column > 101) is defined to be the same as “column = unknown value in range” and that leads to some counter-intuitive results.

Just for reference and play, here’s a little script to create some data (100 distinct values, dates and integers) 16 rows per distinct value, with some queries demonstrating the point. First we generate the data:

```create table t1 as
select
trunc(sysdate) + rownum	d1,
rownum			n1
from    all_objects
where   rownum <= 100
;

insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;
insert into t1 select * from t1;

commit;

execute dbms_stats.gather_table_stats(user,'t1');

column d_80 new_value m_80
column d_40 new_value m_40
column d40  new_value m40
column d80  new_value m80

select
min(d1), max(d1), max(d1) - min(d1) + 1 d_range,
min(n1), max(n1), max(n1) - min(n1) + 1 n_range,
to_char(max(d1) - 80,'dd-mon-yyyy')	d_80,
to_char(max(d1) - 40,'dd-mon-yyyy')	d_40,
to_char(max(d1) + 40,'dd-mon-yyyy')	d40,
to_char(max(d1) + 80,'dd-mon-yyyy')	d80
from
t1
;

```

Then we run some sample queries – note that the numeric ranges and the date ranges match in size and distance above the known high value.

```prompt	===========
prompt	d1 in range
prompt	===========

select * from t1 where d1 = to_date('&m_80');
select * from t1 where d1 between to_date('&m_80') and to_date('&m_40');

prompt	===========
prompt	n1 in range
prompt	===========

select * from t1 where n1 = 40;
select * from t1 where n1 between 40 and 80;

prompt	================
prompt	d1 out of range
prompt	================

select * from t1 where d1 = to_date('&m40');
select * from t1 where d1 = to_date('&m80');
select * from t1 where d1 between to_date('&m40') and to_date('&m80');

prompt	================
prompt	n1 out of range
prompt	================

select * from t1 where n1 = 140;
select * from t1 where n1 = 180;
select * from t1 where n1 between 140 and 180;
```

And now the execution plans – of which we’re only really interested in the cardinality estimates:

```
===========
d1 in range
===========
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    16 |   176 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    16 |   176 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("D1"=TO_DATE(' 2013-01-24 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   678 |  7458 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   678 |  7458 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("D1"<=TO_DATE(' 2013-03-05 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D1">=TO_DATE(' 2013-01-24 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))

===========
n1 in range
===========
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    16 |   176 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    16 |   176 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1"=40)

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   678 |  7458 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   678 |  7458 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1">=40 AND "N1"<=80)

```

Note that when “in-range”: equality gives a cardinality of 16 (as expected); while a range of 40 (from a total range of 99), gives a cardinality of 678 : (1600 * 40 / 99)

```
================
d1 out of range
================
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   110 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    10 |   110 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("D1"=TO_DATE(' 2013-05-24 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     3 |    33 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     3 |    33 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("D1"=TO_DATE(' 2013-07-03 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    16 |   176 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    16 |   176 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("D1">=TO_DATE(' 2013-05-24 00:00:00', 'syyyy-mm-dd
hh24:mi:ss') AND "D1"<=TO_DATE(' 2013-07-03 00:00:00', 'syyyy-mm-dd
hh24:mi:ss'))

================
n1 out of range
================
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    10 |   110 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    10 |   110 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1"=140)

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     3 |    33 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     3 |    33 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1"=180)

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    16 |   176 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |    16 |   176 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("N1">=140 AND "N1"<=180)

```

In this second set of results we see that for equality: when we are 40 out of range, the cardinality is the base cardinality * (total range – distance outside ) / (total range). For the value 140 this gives: 16 * (99 – 40)/ 99 = 9.53, for the value 180 this gives: 16 * (99 – 80) / 99 = 3.07.

For the out-of-range range-based predicate, the cardinality is simply the base cardinality of an in-range “column = unknown constant”, viz: 16.

You might then want to work out (or simply test) the cardinalities for: n1 = 110, n1 in (110, 140, 180), and n1 between 110 and 180. It should give you an interesting insight into one of the problems of statistics going stale on columns that are sequence or time-based.