Dion Cho – Oracle Performance Storyteller

We are natural born scientists

Posts Tagged ‘full table scan

Strong doubt on small table(_small_table_threshold)

with 5 comments

It is well known that the table is called “small” when the size is smaller than 2% of the buffer cache. That size is calculated on startup into _small_table_threshold parameter. Some links about this:

  1. http://jonathanlewis.wordpress.com/2007/01/29/small-partitions/
  2. http://www.ixora.com.au/q+a/cache.htm

For instance, in my local database whose db_cache_size is 200m, the _db_block_buffers is 24950. Hence _small_table_threshold should be 24950*2% = 499.

SYS@ukja10> @para _db_cache_size

NAME                           VALUE
------------------------------ --------------------
__db_cache_size                209715200

SYS@ukja10> @para _db_block_buffers

NAME                           VALUE                
------------------------------ --------------------
_db_block_buffers              24950

SYS@ukja10> @para small_table_threshold

NAME                           VALUE                
------------------------------ --------------------
_small_table_threshold         499





Why is this 2% threshold so important? Because it affects the efficiency of LRU algorithm of the buffer cache.

  • Table < (2%*buffer_cache) is located into the middle point of LRU list when loaded into the buffer cache.
  • Table > (2%*buffer_cache) is located into the cold end of the LRU list when loaded into the buffer cache.

This simple rule enables Oracle to prevent the buffer cache from being flooded by the full scan on big table. But the negative effect is that the full table scan on the big table is always physical(when it is read after the buffer cache got full). No matter how often the table is read.

Okay, so far so good.

A couple of days ago, I was asked the exactly same question from my customer.

What is the threshold of being small table?

I was just about to say that the threshold is 2% of the buffer cache. But wait… I’ve never proved it myself. So I tried to make a simple test case to demonstrate the 2% rule.

But, to my surprise, the famous 2% rule never worked! The more surprising thing is that in my local 10gR2 database, the threshold seems to be around 25% of the buffer cache.

Here goes the test case.

UKJA@ukja102> select * from v$version;

BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

-- When the db_cache_size = 100m (without ASMM)
__db_cache_size                104857600
_db_block_buffers              12475
_small_table_threshold         249

-- When the db_cache_size = 200m (without ASMM)
__db_cache_size                209715200
_db_block_buffers              24950
_small_table_threshold         499


-- a bit big table (a bit bigger than 25% of db_cache_size)
UKJA@ukja102> create table t1(c1 char(2000), c2 char(2000), c3 char(2000)) nologging;

Table created.

UKJA@ukja102> insert /*+ append */ into t1
  2  select 'x', 'x', 'x'
  3  from dual
  4  connect by level <= 1*3400    -- when db_cache_size = 100m
  -- connect by level <= 2*3400    -- when db_cache_size = 200m 
  5  ;

3400 rows created.

-- a bit small table(a bit smaller than 25% of db_cache_size)
UKJA@ukja102> create table t2(c1 char(2000), c2 char(2000), c3 char(2000)) nologging;

Table created.

UKJA@ukja102> insert /*+ append */ into t2
  2  select 'x', 'x', 'x'
  3  from dual
  4  connect by level <= 1*3100    -- when db_cache_size = 100m 
--  connect by level <= 2*3100    -- when db_cache_size = 200m 
  5  ;

3100 rows created.

-- big table(which a bit bigger than db_cache_size)
UKJA@ukja102> create table big_table(c1 char(2000), c2 char(2000), c3 char(2000)) nologging;

Table created.

UKJA@ukja102> insert /*+ append */ into big_table
  2  select 'x', 'x', 'x'
  3  from dual
  4  connect by level <= 1*13000    -- when db_cache_size = 100m
-- connect by level <= 2*13000    -- when db_cache_size = 200m
  5  ;

13000 rows created.


-- Read table big_table after flushing buffer cache.
-- This would fill out the buffer cache.

UKJA@ukja102> alter system flush buffer_cache;

System altered.

UKJA@ukja102> -- read big_table
UKJA@ukja102> select count(*) from big_table;

  COUNT(*)                                                                      
----------                                                                      
     13000         


-- Read table t1(a bit bigger than 25% of db_cache_size). 
-- See that physical reads is equal to logical reads?
-- This means that table t1 is always located to the cold end of LRU list.

UKJA@ukja102> select count(*) from t1;

  COUNT(*)                                                                      
----------                                                                      
      3400                                                                      

UKJA@ukja102> select count(*) from t1;

  COUNT(*)                                                                      
----------                                                                      
      3400                                                                      

UKJA@ukja102> set autot on
UKJA@ukja102> select count(*) from t1;

  COUNT(*)                                                                      
----------                                                                      
      3400                                                                      

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
       3407  consistent gets                                                    
       3376  physical reads                                                     
          0  redo size                                                          
        411  bytes sent via SQL*Net to client                                   
        400  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

UKJA@ukja102> set autot off


-- Read table t2(a bit smaller than 25% of db_cache_size). 
-- See that physical is almost gone?
-- This means that table t1 is always located to the mid point of the LRU list.

UKJA@ukja102> select count(*) from t2;

  COUNT(*)                                                                      
----------                                                                      
      3100                                                                      

UKJA@ukja102> select count(*) from t2;

  COUNT(*)                                                                      
----------                                                                      
      3100        

UKJA@ukja102> set autot on
UKJA@ukja102> select count(*) from t2;

  COUNT(*)                                                                      
----------                                                                      
      3100                                                                      

Statistics
----------------------------------------------------------                      
          0  recursive calls                                                    
          0  db block gets                                                      
       3106  consistent gets                                                    
         32  physical reads                                                     
          0  redo size                                                          
        411  bytes sent via SQL*Net to client                                   
        400  bytes received via SQL*Net from client                             
          2  SQL*Net roundtrips to/from client                                  
          0  sorts (memory)                                                     
          0  sorts (disk)                                                       
          1  rows processed                                                     

UKJA@ukja102> set autot off





Small table threshold is just around 25% of db_cache_size with both of 100m and 200m size!

This result gave me a strong doubt on 2% rule.

  • Oracle would have unknown internal threshold. This threshold would vary by version and/or db_cache_size. Memory management algorithm would be another factor.
  • 2% rule is poorly documented and it has never been fully verified.

Well, from now on, I would not trust the 2% stuff. But I strongly hope someone would have better answer and test case than mine.




Written by Dion Cho

April 22, 2009 at 5:37 am

Posted in I/O

Tagged with , ,

Why full table scan even with lower index scan cost?

with 4 comments

See following test case.

1. Version

UKJA@ukja102> select * from v$version;

BANNER
-------------------------------------------------------------------

Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod
PL/SQL Release 10.2.0.1.0 - Production
CORE    10.2.0.1.0      Production
TNS for 32-bit Windows: Version 10.2.0.1.0 - Production
NLSRTL Version 10.2.0.1.0 - Production

2. Create objects

create table t1(c1 int, c2 int, c3 char(100));

insert into t1
select level, level, 'x'
from dual
connect by level <= 1000
;

create index t1_n1 on t1(c1, c2);  <-- Note that c1 is leading column

exec dbms_stats.gather_table_stats(user, 't1');

3. Now we execute following query. Because leading column(c1) is inexistent in predicates, FTS is expected.

select *
from t1
where c2 = 1  <-- Note that leading column is not given!
;
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   108 |     7   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   108 |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------

4. But there is a problem. With index hint given, Oracle expects lower cost with index full scan. The cost is 5 compared to 7 of FTS.

select /*+ index(t1) */ *
from t1
where c2 = 1
;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   108 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   108 |     5   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_N1 |     1 |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

5. Now we have a very serious problem.

Why doesn’t Oracle choose index full scan which has much lower?

I’m very familiar with this phenomenon where Oracle denies to use index when leading column is inexistent in predicates. Sometimes we hit index skip scan instead of FTS, but why Oracle denies to use index full scan?

This behavior is consistent in all versions except in 9.2.0.1 where index full scan cost is much higher.

-- 9.2.0 (without index hint)
-------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |   108 |     7  (15)|
|*  1 |  TABLE ACCESS FULL   | T1          |     1 |   108 |     7  (15)|
-------------------------------------------------------------------------

-- 9.2.0 (with index hint)
--------------------------------------------------------------------------------
| Id  | Operation                   |  Name       | Rows  | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |   108 |    28   (4)|
|*  1 |  TABLE ACCESS FULL          | T1          |     1 |   108 |     7  (15)|
|*  2 |   INDEX FULL SCAN           | T1_N1       |     1 |       |    27   (4)|
--------------------------------------------------------------------------------

-- 9.2.0 (10053 trace)
SINGLE TABLE ACCESS PATH
Column:         C2  Col#: 2      Table: T1   Alias: T1
    NDV: 1000      NULLS: 0         DENS: 1.0000e-003 LO:  1  HI: 1000
    NO HISTOGRAM: #BKT: 1 #VAL: 2
  TABLE: T1     ORIG CDN: 1000  ROUNDED CDN: 1  CMPTD CDN: 1
  Access path: tsc  Resc:  6  Resp:  6
  Skip scan: ss-sel 0  andv 1000 
    ss cost 1000
    table io scan cost 6
  Access path: index (no sta/stp keys)
      Index: T1_N1
  TABLE: T1
      RSC_CPU: 193379   RSC_IO: 27
  IX_SEL:  1.0000e+000  TB_SEL:  1.0000e-003
  BEST_CST: 7.00  PATH: 2  Degree:  1

-- 10.1.0.2 (without index hint)
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   108 |     6   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   108 |     6   (0)| 00:00:01 |
--------------------------------------------------------------------------

-- 10.1.0.2 (with index hint)
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   108 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   108 |     5   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_N1 |     1 |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

-- 10.1.0.2 (10053 trace)
SINGLE TABLE ACCESS PATH
  COLUMN:         C2(NUMBER)  Col#: 2      Table: T1   Alias: T1
    Size: 4  NDV: 1000  Nulls: 0  Density: 1.0000e-003 Min: 1  Max: 1000
    No Histogram: #BKT: 1
        (1 uncompressed buckets and 2 endpoint values)
  TABLE: T1  Alias: T1    
    Original Card: 1000  Rounded Card: 1  Computed Card: 1.00
  Access Path: table-scan  Resc:  6  Resp:  6
  Access Path: index (skip-scan)
    ss sel 1.0000e-003  andv 1000 
    ss cost 1000 vs. table scan io cost 6
    Skip Scan rejected
  Access Path: index (no start/stop keys)
    Index: T1_N1
    rsc_cpu: 35997   rsc_io: 5
    ix_sel:  1.0000e+000    ix_sel_with_filters:  1.0000e-003
  BEST_CST: 6.03  PATH: 2  Degree:  1

-- 10.2.0.1 (without index hint)
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   108 |     7   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   108 |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------

-- 10.2.0.1 (with index hint)
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   108 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   108 |     5   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_N1 |     1 |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

-- 10.2.0.1 (10053 trace)
SINGLE TABLE ACCESS PATH
  Column (#2): C2(NUMBER)
    AvgLen: 4.00 NDV: 1000 Nulls: 0 Density: 1.0000e-003 Min: 1 Max: 1000
  Table: T1  Alias: T1    
    Card: Original: 1000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  7.04  Resp: 7.04  Degree: 0
      Cost_io: 7.00  Cost_cpu: 362449
      Resp_io: 7.00  Resp_cpu: 362449
kkofmx: index filter:"T1"."C2"=1
  Access Path: index (skip-scan)
    SS sel: 1.0000e-003  ANDV (#skips): 1000
    SS io: 1000.00 vs. table scan io: 7.00
    Skip Scan rejected
  Access Path: index (FullScan)
    Index: T1_N1
    resc_io: 5.00  resc_cpu: 235797
    ix_sel: 1  ix_sel_with_filters: 1.0000e-003
    Cost: 5.03  Resp: 5.03  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 7.04  Degree: 1  Resp: 7.04  Card: 1.00  Bytes: 0

-- 11.1.0.6 (without index hint)
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |   108 |     7   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |   108 |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------

-- 11.1.0.6 (with index hint)
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |   108 |     5   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |   108 |     5   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | T1_N1 |     1 |       |     4   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

-- 11.1.0.6 (10053 trace)
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for T1[T1]
  Table: T1  Alias: T1
    Card: Original: 1000.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  7.04  Resp: 7.04  Degree: 0
      Cost_io: 7.00  Cost_cpu: 362449
      Resp_io: 7.00  Resp_cpu: 362449
kkofmx: index filter:"T1"."C2"=1

  Access Path: index (skip-scan)
    SS sel: 0.001000  ANDV (#skips): 1000.000000
    SS io: 1000.000000 vs. table scan io: 7.000000
    Skip Scan rejected

  Access Path: index (FullScan)
    Index: T1_N1
    resc_io: 5.00  resc_cpu: 235797
    ix_sel: 1.000000  ix_sel_with_filters: 0.001000
 ***** Logdef predicate Adjustment ******
 Final IO cst 0.00 , CPU cst 50.00
 ***** End Logdef Adjustment ******
    Cost: 5.03  Resp: 5.03  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 7.04  Degree: 1  Resp: 7.04  Card: 1.00  Bytes: 0

The funny thing is that Oracle did consider the index ful scan and found it to have lower cost! But it silently denied to use it.

Oracle would have chosen index skip scan with lower cost!

Is this a bug or designed feature?

Written by Dion Cho

February 11, 2009 at 8:53 am

Posted in Optimizer

Tagged with ,

Follow

Get every new post delivered to your Inbox.

Join 61 other followers