Dion Cho – Oracle Performance Storyteller

We are natural born scientists

Underperformance of V$SQL_OPTIMIZER_ENV

with 8 comments

A couple of days ago, I was trying to search the non-default optimizer environment (i.e. parameters) of a specific SQL query that has multiple child cursors. But it soon turned out to be almost impossible to search this simple view, just because of its underperformance.
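
For reference, this is roughly the kind of lookup I was after (just a sketch; the sql_id value is a placeholder):

select child_number, name, value
from v$sql_optimizer_env
where sql_id = '&sql_id'
and isdefault = 'NO'
order by child_number, name;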

Here is what I mean.

1. Create simple objects and gather statistics with histogram.

TPACK@ukja1106> select * from v$version;

BANNER
------------------------------------------------------------------------

Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production

TPACK@ukja1106> create table t1(c1 int, c2 int);

Table created.

TPACK@ukja1106> 
TPACK@ukja1106> insert into t1 select level, mod(level,5) from dual
  2  connect by level <= 10000;

10000 rows created.

TPACK@ukja1106> 
TPACK@ukja1106> create index t1_n1 on t1(c1);

Index created.

TPACK@ukja1106> 
TPACK@ukja1106> exec dbms_stats.gather_table_stats(user, 't1', method_opt=>'for columns size 5 c1 size 5 c2');

PL/SQL procedure successfully completed.

2. Now I create 200 child cursors for a simple query, using the notorious combination of SIMILAR cursor sharing and histograms.

TPACK@ukja1106> -- session #1
TPACK@ukja1106> alter session set cursor_sharing = similar;

Session altered.

TPACK@ukja1106> 
TPACK@ukja1106> declare
  2    v_cnt   number;
  3  begin
  4    for idx in 1 .. 200 loop
  5      execute immediate 'select count(*) from t1 where c1 = ' || idx || ' and c2 = ' || idx
  6        into v_cnt;
  7    end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

Now I have 200 child cursors.

TPACK@ukja1106> -- version count
TPACK@ukja1106> select version_count from v$sqlarea where sql_id = '&sql_id';

VERSION_COUNT
-------------
          200

3. But the problem is that a simple search on V$SQL_OPTIMIZER_ENV is extremely slow. The following simple lookup takes 17.3 seconds to complete.

TPACK@ukja1106> -- get sql optimizer env
TPACK@ukja1106> select count(*) from v$sql_optimizer_env
  2  where sql_id = '&sql_id' and child_number = &child_number;

  COUNT(*)
----------
        37

Elapsed: 00:00:17.29

The bigger problem is that this simple query eats up 100% of the CPU and, as the number of child cursors increases, makes the whole system hang with library cache related contention. Restarting the instance was the only way to continue my test.

4. The query already follows an efficient index range scan on the fixed index, so no further tuning is available.

TPACK@ukja1106> explain plan for
  2  select count(*) from v$sql_optimizer_env
  3  where sql_id = '&sql_id' and child_number = &child_number;

-----------------------------------------------------------------------------------------------
| Id  | Operation                | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                    |     1 |    47 |     0   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE          |                    |     1 |    47 |            |          |
|*  2 |   FIXED TABLE FIXED INDEX| X$KQLFSQCE (ind:2) |     1 |    47 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("KQLFSQCE_SQLID"='ftg19cu1jf8st' AND "KQLFSQCE_CHNO"=0 AND
              "INST_ID"=USERENV('INSTANCE') AND BITAND("KQLFSQCE_FLAGS",8)=0 AND
              (BITAND("KQLFSQCE_FLAGS",4)=0 OR BITAND("KQLFSQCE_FLAGS",2)=0))

5. Querying the time model and latch activity tells nothing meaningful. It was probably aggressive mutex activity that made my laptop hang, as this is an 11g environment.

TIME       DB time                                     17572455
TIME       sql execute elapsed time                    17562899
TIME       DB CPU                                      17500000

LATCH      session allocation                             31344
LATCH      DML lock allocation                            21187
LATCH      row cache objects                              12711
LATCH      shared pool                                     9956
LATCH      cache buffers chains                            8095

I profiled the call stack and got nothing meaningful. It just shows the API calls for the X$ view lookup.

STACK_TRACE                                                                         HIT_CNT    HIT_PCT
-------------------------------------------------------------------------------- ---------- ----------
00000000                                                                                 10        100
 ->7C80B710
  ->_OracleThreadStart@4()
   ->_opimai()
    ->_opimai_real()
     ->_sou2o()
      ->_opidrv()
       ->_opiodr()
        ->_opiino()
         ->_opitsk()
          ->_ttcpip()
           ->_opiodr()
            ->_kpoal8()
             ->_opifch2()
              ->_qergsFetch()
               ->_qerfxFetch()
                ->_kqlfsqce()
                 ->_kglic()
                  ->_kglic0()
                   ->_kgligi()
                    ->_kqlfsqcet()  <-- X$VIEW
                     ->_kghalf()
                      ->00000000
                       ->7C80B710
                        ->_OracleOradebugThreadStart@4()
                         ->_ssthreadsrgruncallback()
                          ->_ksdxcb()
                           ->_ksdxfdmp()
                            ->_ksedst_tracecb()
                             ->_ksedst1()
                              ->_skdstdst()

This problem was reproducible only when the query had many (over 200?) child cursors. And the V$[SYS|SES]_OPTIMIZER_ENV views did not have this problem.
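
For comparison, a session-level lookup of the same kind returned instantly. A minimal sketch against V$SES_OPTIMIZER_ENV (with &sid as a placeholder):

select name, value
from v$ses_optimizer_env
where sid = &sid
and isdefault = 'NO';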

Metalink did not report any bug related to V$SQL_OPTIMIZER_ENV, so I suspect that this is not a bug, but that the lookup code for the view has some under-optimized part over which I have no control.

– Appended : 2010/5/20
This behavior is likely to be version dependent. I will share the results of more comprehensive tests if anything meaningful is found.

– Appended : 2010/5/28
It looks like an OS-specific problem. More specifically, this underperformance occurred only in Oracle on Windows. I have not had a chance to dig into it in more detail.

Written by Dion Cho

May 20, 2010 at 2:21 am

Posted in Misc.


Trivial Research on the Cardinality Feedback on 11gR2

with 9 comments

These are findings and discussions on the undocumented cardinality feedback implementation in Oracle 11gR2.

This is a fantastic but somewhat embarrassing feature, so I decided to reproduce it myself and do some research on it. I focused on the following questions.

  1. Can I reproduce it?
  2. How does it work?
  3. Can I control it – hidden parameters and/or diagnostic events and/or hints?

And I think I got the right answers for all these questions.

1. Can I reproduce it?
I could reproduce it myself, but with execution plans different from those in the original post. Here is the result. (Sorry for the ugly indentation.)

UKJA@UKJA1120> select /* GG */ t.year_id, sum(f.metric1)
  2  from fact f, time t, dim2 d2, dim3 d3, dim4 d4
  3  where f.time_id=t.time_id
  4  and f.dim2_id=d2.dim2_id
  5  and f.dim3_id1=d3.dim3_id1
  6  and f.dim3_id2=d3.dim3_id2
  7  and f.dim4_id=d4.dim4_id
  8    and d2.dim2_lib='Value 5'
  9    and d3.dim3_lib='Value (2,2)'
 10    and d4.dim4_l2='L2.1'
 11    and attr2='ZZ4'
 12    and t.time_id=trunc(t.time_id,'W')
 13  group by t.year_id
 14  order by t.year_id;

YEAR_ID             SUM(F.METRIC1)
------------------- --------------
2009/01/01 00:00:00          38490

-- First execution
--------------------------------------------------------------------------------
SQL_ID  956ypc7b6prrj, child number 0
-------------------------------------
Plan hash value: 2090619557
------------------------------------------------------------------------------------
| Id  | Operation                                | Name          | E-Rows | A-Rows |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |               |        |      1 |
|   1 |  SORT GROUP BY                           |               |      1 |      1 |
|   2 |   NESTED LOOPS                           |               |        |     42 |
|   3 |    NESTED LOOPS                          |               |      1 |     56 |
|   4 |     NESTED LOOPS                         |               |      1 |     56 |
|   5 |      NESTED LOOPS                        |               |      8 |    560 |
|   6 |       MERGE JOIN CARTESIAN               |               |      1 |     60 |
|   7 |        PARTITION RANGE ALL               |               |      1 |     60 |
|*  8 |         TABLE ACCESS FULL                | TIME          |      1 |      1 |
|   9 |        BUFFER SORT                       |               |      1 |     60 |
|* 10 |         TABLE ACCESS FULL                | DIM3          |      1 |      1 |
|  11 |       PARTITION RANGE ITERATOR           |               |      8 |    560 |
|  12 |        PARTITION HASH ALL                |               |      8 |    560 |
|* 13 |         TABLE ACCESS BY LOCAL INDEX ROWID| FACT          |      8 |    560 |
|  14 |          BITMAP CONVERSION TO ROWIDS     |               |        |  64760 |
|  15 |           BITMAP AND                     |               |        |    120 |
|* 16 |            BITMAP INDEX SINGLE VALUE     | FACT_TIME_IDX |        |    120 |
|* 17 |            BITMAP INDEX SINGLE VALUE     | FACT_DIM3_IDX |        |    145 |
|  18 |      PARTITION HASH ITERATOR             |               |      1 |     56 |
|* 19 |       TABLE ACCESS BY LOCAL INDEX ROWID  | DIM2          |      1 |     56 |
|* 20 |        INDEX UNIQUE SCAN                 | DIM2_PK       |      1 |    560 |
|* 21 |     INDEX UNIQUE SCAN                    | DIM4_PK       |      1 |     56 |
|* 22 |    TABLE ACCESS BY INDEX ROWID           | DIM4          |      1 |     42 |
------------------------------------------------------------------------------------


-- Second execution (note that cardinality feedback worked like magic!)
Plan hash value: 2424032429
------------------------------------------------------------------------------------
| Id  | Operation                                | Name          | E-Rows | A-Rows |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |               |        |      1 |
|   1 |  SORT GROUP BY                           |               |      1 |      1 |
|   2 |   NESTED LOOPS                           |               |        |      3 |
|   3 |    NESTED LOOPS                          |               |     32 |      4 |
|   4 |     VIEW                                 | VW_GBC_17     |      3 |      4 |
|   5 |      HASH GROUP BY                       |               |      3 |      4 |
|*  6 |       HASH JOIN                          |               |   2823 |     56 |
|   7 |        PARTITION HASH ALL                |               |     56 |      1 |
|*  8 |         TABLE ACCESS FULL                | DIM2          |     56 |      1 |
|   9 |        NESTED LOOPS                      |               |        |    560 |
|  10 |         NESTED LOOPS                     |               |    504 |  64760 |
|  11 |          MERGE JOIN CARTESIAN            |               |     60 |     60 |
|* 12 |           TABLE ACCESS FULL              | DIM3          |      1 |      1 |
|  13 |           BUFFER SORT                    |               |     60 |     60 |
|  14 |            PARTITION RANGE ALL           |               |     60 |     60 |
|* 15 |             TABLE ACCESS FULL            | TIME          |     60 |     60 |
|  16 |          PARTITION RANGE ITERATOR        |               |        |  64760 |
|  17 |           PARTITION HASH ALL             |               |        |  64760 |
|  18 |            BITMAP CONVERSION TO ROWIDS   |               |        |  64760 |
|  19 |             BITMAP AND                   |               |        |    120 |
|* 20 |              BITMAP INDEX SINGLE VALUE   | FACT_TIME_IDX |        |    120 |
|* 21 |              BITMAP INDEX SINGLE VALUE   | FACT_DIM3_IDX |        |    145 |
|* 22 |         TABLE ACCESS BY LOCAL INDEX ROWID| FACT          |      8 |    560 |
|* 23 |     INDEX UNIQUE SCAN                    | DIM4_PK       |      1 |      4 |
|* 24 |    TABLE ACCESS BY INDEX ROWID           | DIM4          |     11 |      3 |
------------------------------------------------------------------------------------

Note
-----
   - cardinality feedback used for this statement

Very impressive, isn't it? Oracle introduced the concept of cardinality feedback in 10g under the name of automatic query tuning. It seems that Oracle has decided to apply that technology to normal query processing.

2. How does it work?
The 10053 trace shows that Oracle uses the OPT_ESTIMATE hint, which is exactly the one used by the automatic query tuning engine.

SELECT /*+ OPT_ESTIMATE (TABLE "D4" MIN=42.000000 ) OPT_ESTIMATE (INDEX_SCAN "D4" "DIM4_PK" MIN=56.000000 ) OPT_ESTIMATE (INDEX_FILTER "D4" "DIM4_PK" MIN=56.000000 ) OPT_ESTIMATE (TABLE "D2" MIN=56.000000 ) OPT_ESTIMATE (INDEX_SCAN "D2" "DIM2_PK" MIN=560.000000 ) OPT_ESTIMATE (INDEX_FILTER "D2" "DIM2_PK" MIN=560.000000 ) OPT_ESTIMATE (TABLE "T" ROWS=60.000000 ) OPT_ESTIMATE (TABLE "F" MIN=560.000000 ) */ "T"."YEAR_ID" "YEAR_ID",SUM("F"."METRIC1") "SUM(F.METRIC1)" FROM "UKJA"."FACT" "F","UKJA"."TIME" "T","UKJA"."DIM2" "D2","UKJA"."DIM3" "D3","UKJA"."DIM4" "D4" WHERE "F"."TIME_ID"="T"."TIME_ID" AND "F"."DIM2_ID"="D2"."DIM2_ID" AND "F"."DIM3_ID1"="D3"."DIM3_ID1" AND "F"."DIM3_ID2"="D3"."DIM3_ID2" AND "F"."DIM4_ID"="D4"."DIM4_ID" AND "D2"."DIM2_LIB"='Value 5' AND "D3"."DIM3_LIB"='Value (2,2)' AND "D4"."DIM4_L2"='L2.1' AND "F"."ATTR2"='ZZ4' AND "T"."TIME_ID"=TRUNC("T"."TIME_ID",'W') GROUP BY "T"."YEAR_ID" ORDER BY "T"."YEAR_ID"

A series of OPT_ESTIMATE hints is appended on the fly, even before query transformation is processed.

If you analyze the numbers used in the hints carefully, you will realize that Oracle uses the actual row counts as the cardinality feedback.

It seems that Oracle stores the actual row counts for each table and index at the first execution and uses them to feed back the cardinality on the next execution. It also seems that Oracle does this only when there is a great difference between the estimated cardinality and the actual cardinality.
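
If you want to watch the mechanism yourself, the following sketch compares the estimated and actual rows of the last execution and checks whether a new child cursor was built with feedback statistics. It is only an illustration: it assumes rowsource statistics were collected (gather_plan_statistics hint or STATISTICS_LEVEL=ALL) and that your 11.2 release exposes the USE_FEEDBACK_STATS column of V$SQL_SHARED_CURSOR.

-- E-Rows vs. A-Rows of the last execution
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- was the new child cursor created because of cardinality feedback?
select child_number, use_feedback_stats
from v$sql_shared_cursor
where sql_id = '956ypc7b6prrj';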

3. Can I control it?
The final question is how to control it. The hidden parameter "_optimizer_use_feedback" is the answer.

By disabling this parameter, I could prevent the cardinality feedback from happening.

alter session set "_optimizer_use_feedback" = false;
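
To check the current value of the hidden parameter, a sketch like the following works when connected as SYS (the X$ tables are not visible to normal users):

select i.ksppinm as name, v.ksppstvl as value
from x$ksppi i, x$ksppcv v
where i.indx = v.indx
and i.ksppinm = '_optimizer_use_feedback';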

This cardinality feedback implementation is quite interesting, but the details are not well known. It also has the potential to leave many DBAs and developers embarrassed.

Let us wait and see how things go.

Written by Dion Cho

December 17, 2009 at 7:45 am

Logging in PL/SQL

with 3 comments

What is the best debugging tool ever invented in the history of programming languages? We have a choice of many powerful and cutting-edge tools these days, but still, the most powerful and handy debugging tool is logging written by the developers themselves.

But unfortunately, PL/SQL has rather weak logging functionality, which is why people end up trying to implement a fully functional logging facility in PL/SQL themselves.

Anyway, the most widely used way of logging is, without doubt, the DBMS_OUTPUT package. But it has a critical limitation: you can't see the logs until the PL/SQL execution completes.

set serveroutput on

begin
	for idx in 1 .. 10 loop
		dbms_output.put_line(sysdate || ', ' || idx || 'th insertion');
		dbms_lock.sleep(1);
	end loop;
end;
/

2009/11/10 13:33:18, 1th insertion
2009/11/10 13:33:19, 2th insertion
2009/11/10 13:33:20, 3th insertion
2009/11/10 13:33:21, 4th insertion
2009/11/10 13:33:22, 5th insertion
2009/11/10 13:33:23, 6th insertion
2009/11/10 13:33:24, 7th insertion
2009/11/10 13:33:25, 8th insertion
2009/11/10 13:33:26, 9th insertion
2009/11/10 13:33:27, 10th insertion

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.00

Another trick is to use the DBMS_SYSTEM package with the KSDWRT procedure, which lets us write text to the trace file (1), the alert log (2) or both (3). With the UTL_FILE package, you can write a log to any text file (a small sketch follows the trace output below). But you need network access (telnet and/or ftp) to the database server to read those files, which is also a big limitation.

begin
	for idx in 1 .. 10 loop
		sys.dbms_system.ksdwrt(1, sysdate || ', ' || idx || 'th insertion');
		dbms_lock.sleep(1);
	end loop;
end;
/

-- C:\ORACLE\ADMIN\UKJA1021\UDUMP\ukja1021_ora_1448.trc
*** 2009-11-10 13:37:15.015
*** ACTION NAME:() 2009-11-10 13:37:15.000
*** MODULE NAME:(SQL*Plus) 2009-11-10 13:37:15.000
*** SERVICE NAME:(UKJA1021) 2009-11-10 13:37:15.000
*** SESSION ID:(140.1169) 2009-11-10 13:37:15.000
2009/11/10 13:37:15, 1th insertion
2009/11/10 13:37:16, 2th insertion
2009/11/10 13:37:17, 3th insertion
2009/11/10 13:37:18, 4th insertion
2009/11/10 13:37:19, 5th insertion
2009/11/10 13:37:20, 6th insertion
2009/11/10 13:37:21, 7th insertion
2009/11/10 13:37:22, 8th insertion
2009/11/10 13:37:23, 9th insertion
2009/11/10 13:37:24, 10th insertion
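
As for UTL_FILE, a minimal sketch would look like the following. It assumes a directory object named LOG_DIR already exists and is writable by the database.

declare
  v_file utl_file.file_type;
begin
  v_file := utl_file.fopen('LOG_DIR', 'my_app.log', 'a');
  for idx in 1 .. 10 loop
    utl_file.put_line(v_file, sysdate || ', ' || idx || 'th insertion');
    utl_file.fflush(v_file);   -- make each line visible to readers right away
    dbms_lock.sleep(1);
  end loop;
  utl_file.fclose(v_file);
end;
/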

A little programming skill can be a big help in a situation like this. For instance, the following example relies on a pipelined table function and the DBMS_PIPE package to overcome the limitations mentioned above. While session A logs, another session B can read the log simultaneously.

create or replace package pkg_log
as
  type log_array is table of varchar2(4000);
  procedure log(message in varchar2);
  procedure flush;
  function get_log return log_array pipelined;
end;
/

show err

create or replace package body pkg_log
as
  -- send a timestamped message over the pipe named 'log'
  procedure log(message in varchar2) is
    v_status number;
  begin
    dbms_pipe.pack_message(sysdate || ', ' || message);
    v_status := dbms_pipe.send_message('log');
  end log;

  -- send the end-of-log marker so that the reader can stop
  procedure flush is
    v_status number;
  begin
    dbms_pipe.pack_message('$$END$$');
    v_status := dbms_pipe.send_message('log');
  end;

  -- pipelined table function: blocks on the pipe and pipes out each
  -- message as it arrives, until the end-of-log marker is seen
  function get_log return log_array pipelined is
    v_status  number;
    v_message varchar2(4000);
  begin
    while true loop
      v_status := dbms_pipe.receive_message('log');
      if v_status = 0 then
        dbms_pipe.unpack_message(v_message);
        if v_message = '$$END$$' then
          return;
        end if;
        pipe row(v_message);
        pipe row('');  -- extra empty row helps SQL*Plus show each message promptly
      end if;
    end loop;
    return;
  end get_log;
end;
/

show err

It works like this. (The small arraysize in session #2, together with the extra empty row piped per message, keeps SQL*Plus from buffering the output, so each log line appears as soon as it is written.)

-- session #1
begin
	for idx in 1 .. 10 loop
		pkg_log.log(idx || 'th insertion');
		dbms_lock.sleep(1);
	end loop;
	pkg_log.flush;
end;
/


-- session #2
set array 2
set pages 100
select * from table(pkg_log.get_log);

COLUMN_VALUE
---------------------------------------

2009/11/10 13:30:12, 1th insertion

2009/11/10 13:30:13, 2th insertion

2009/11/10 13:30:14, 3th insertion

2009/11/10 13:30:15, 4th insertion

2009/11/10 13:30:16, 5th insertion

2009/11/10 13:30:17, 6th insertion

2009/11/10 13:30:18, 7th insertion

2009/11/10 13:30:19, 8th insertion

2009/11/10 13:30:20, 9th insertion

2009/11/10 13:30:21, 10th insertion
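
Once the logging is finished, the pipe can be removed. A small optional cleanup sketch (REMOVE_PIPE returns 0 on success):

declare
  v_status number;
begin
  v_status := dbms_pipe.remove_pipe('log');
end;
/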

This is a situation where a little knowledge can play an important role in solving the problem.

Written by Dion Cho

November 10, 2009 at 8:40 am

Posted in Programming


Finding out the row cache lock holder through V$ view

with 2 comments

Hanganalyze is always the best tool for finding the row cache lock holder, as shown below.

connect sys/oracle as sysdba

oradebug setmypid
oradebug hanganalyze 3


-- trace file
==============
HANG ANALYSIS:
==============
Open chains found:
Chain 1 : <cnode/sid/sess_srno/proc_ptr/ospid/wait_event> :
    <0/158/44363/0x3424fb7c/5504/No Wait>
 -- <0/134/928/0x3424f58c/4600/row cache lock>

But can I find the row cache lock holder using only V$ views? The answer is YES, and that is exactly what the V$ROWCACHE_PARENT view is meant for.

Let me explain with a simple test case.

Create a sequence object with the NOCACHE attribute and let two sessions invoke the sequence NEXTVAL function repeatedly. Oracle needs to hold the row cache lock in mode 5 (SRX) when calling NEXTVAL on a NOCACHE sequence, which means it is very easy to reproduce row cache lock contention.

create sequence s1 nocache;

-- temp.sql
declare
 v_value   number;
begin
  for idx in 1 .. 100000 loop
   select s1.nextval into v_value from dual;
  end loop;
end;
/

ho start sqlplus ukja/ukja@ukja1021 @temp
ho start sqlplus ukja/ukja@ukja1021 @temp

Now query the V$ROWCACHE_PARENT view as follows and you get the session of the row cache lock holder.

col sid new_value sid

select h.address, h.saddr, s.sid, h.lock_mode
from v$rowcache_parent h, v$rowcache_parent w, v$session s
where h.address = w.address
  and w.saddr = (select saddr from v$session
                 where event = 'row cache lock' and rownum = 1)
  and h.saddr = s.saddr
  and h.lock_mode > 0
;

ADDRESS  SADDR           SID  LOCK_MODE
-------- -------- ---------- ----------
283AFB50 3432EB34        145          5

With the session id found, it is not a big deal to get detailed information on the holder session.
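
My personal @session script below does exactly that; for reference, a plain V$ query such as the following sketch pulls similar basic information (&sid is the holder SID found above):

select s.sid, s.serial#, p.spid, s.machine, s.program, s.sql_id, s.event
from v$session s, v$process p
where s.paddr = p.addr
and s.sid = &sid;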

UKJA@ukja1021> @session &sid
UKJA@ukja1021> set echo off
01. basic session info
SID                           : 145
SERIAL#                       : 447
SPID                          : 2328
MACHINE                       : POWER_GROUP\UKJAX
PROGRAM                       : sqlplus.exe
PGA                           : 515668
UGA                           : 156280
LAST_CALL_ET                  : 13
LOGON_TIME                    : 2009/10/15 13:50:31
-----------------

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
02. session wait
SID                           : 145
EVENT                         : row cache lock
P1                            : 13
P1RAW                         : 0000000D
P2                            : 0
P2RAW                         : 00
P3                            : 5
P3RAW                         : 00000005
SECONDS_IN_WAIT               : 0
STATE                         : WAITING
-----------------

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
03. process info
PID                           : 23
PROGRAM                       : ORACLE.EXE (SHAD)
PGA_USED_MEM                  : 324361
PGA_ALLOC_MEM                 : 623853
PGA_MAX_MEM                   : 623853
-----------------

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
04. sql info
SID                           : 141
SHARABLE_MEM                  : 8640
PERSISTENT_MEM                : 1156
RUNTIME_MEM                   : 592
EXECUTIONS                    : 100000
FETCHES                       : 100000
BUFFER_GETS                   : 403341
SQL_TEXT                      : SELECT S1.NEXTVAL FROM DUAL
-----------------


05. sql plan info

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------

SQL_ID  8c3n1ysfzzd1z, child number 0
-------------------------------------
SELECT S1.NEXTVAL FROM DUAL

Plan hash value: 2479889702

-----------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |     2 (100)|          |
|   1 |  SEQUENCE        | S1   |       |            |          |
|   2 |   FAST DUAL      |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------
...

Well, it is always nice to find a use for a view whose purpose is hardly known at all. :)

Written by Dion Cho

October 15, 2009 at 6:44 am

Posted in Misc.


DBMS_ADVANCED_REWRITE and DML

with 5 comments

Oracle 10g introduced the DBMS_ADVANCED_REWRITE package, which enables a specific query to be transformed into another form. It is a very powerful feature, but it is designed for DW, not OLTP, and thus has the following restrictions.

  1. It does not work with bind variables (Metalink Doc ID 392214.1).
  2. It only works for SELECT statements.
  3. It does not work when the base table is modified through DML.

It is not well known how to get around the 3rd restriction. The only way I have found through some test cases is to use the QUERY_REWRITE_INTEGRITY parameter.

Let me explain with a simple test case.

1. Make objects.

UKJA@ukja1106> create table t1
  2  as select level as c1, level as c2
  3  from dual
  4  connect by level <= 1000;

Table created.

Elapsed: 00:00:00.04

UKJA@ukja1106> create index t1_n1 on t1(c1);

Index created.

Elapsed: 00:00:00.03

UKJA@ukja1106> exec dbms_stats.gather_table_stats(user,'t1');

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.18

2. The SELECT statement has a FULL hint and (naturally) follows a full table scan.

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     7 |    10   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     7 |    10   (0)| 00:00:01 |
--------------------------------------------------------------------------

3. How do we fix it? The DBMS_ADVANCED_REWRITE package gives you the power to change the SELECT statement on the fly, with the QUERY_REWRITE_INTEGRITY parameter set to TRUSTED.

UKJA@ukja1106> begin
  2  
  3  	     sys.dbms_advanced_rewrite.declare_rewrite_equivalence(
  4  	     'test_rewrite',
  5  	     'select /*+ full(t1) */ * from t1 where c1 = 1',
  6  	     'select /*+ index(t1) */ c1, c2 from t1 where c1 = 1',
  7  	     false,
  8  	     'text_match');
  9  end;
 10  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01

UKJA@ukja1106> alter session set query_rewrite_integrity=trusted;

Session altered.

Elapsed: 00:00:00.00

4. Now you have a plan with a magical index range scan.

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     7 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |     7 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

5. But one simple DML prevents the rewrite from happening.

UKJA@ukja1106> insert into t1 values(1, 1);

1 row created.

Elapsed: 00:00:00.00

UKJA@ukja1106> commit;

Commit complete.

Elapsed: 00:00:00.00

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     7 |    10   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     7 |    10   (0)| 00:00:01 |
--------------------------------------------------------------------------

6. One way to avoid this restriction is to set the QUERY_REWRITE_INTEGRITY parameter to STALE_TOLERATED.

UKJA@ukja1106> alter session set query_rewrite_integrity=stale_tolerated;

Session altered.

Elapsed: 00:00:00.00

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     7 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |     7 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

7. Now DML does not prevent the rewrite from happening.

UKJA@ukja1106> insert into t1 values(1, 1);

1 row created.

Elapsed: 00:00:00.01

UKJA@ukja1106> commit;

Commit complete.

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |     7 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1    |     1 |     7 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_N1 |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

8. But here you have another restriction: if the DML is not committed, the rewrite fails even with the STALE_TOLERATED setting.

UKJA@ukja1106> insert into t1 values(1, 1);

1 row created.

Elapsed: 00:00:00.01

UKJA@ukja1106> explain plan for
  2  select /*+ full(t1) */ * from t1 where c1 = 1;

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     7 |    10   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     7 |    10   (0)| 00:00:01 |
--------------------------------------------------------------------------

Hm… there seem to be quite a few restrictions for OLTP, but these would not cause any problems for DW. And I believe that even OLTP systems can benefit greatly when this feature is used in the right place.
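
To clean up after the test, the declared equivalence can simply be dropped. A minimal sketch:

begin
  sys.dbms_advanced_rewrite.drop_rewrite_equivalence('test_rewrite');
end;
/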

Footnote: It makes me a bit gloomy to be posting these little tips instead of covering the OOW events. Well, the only consolation is that San Francisco is very far from here and I really hate long-distance journeys. :(

Written by Dion Cho

October 13, 2009 at 5:43 am

Posted in Misc.

Tuning query on the fixed table

with 2 comments

Oracle performance monitoring very often involves querying the fixed tables (X$ tables). A query on a fixed table means direct access to the SGA memory area, which promises efficiency.

But this does not mean that you don't need to care about performance. Reading memory is fast, but scanning hundreds of megabytes of it makes the situation far worse than you might imagine. For exactly this reason, many fixed tables have indexes.

UKJA@ukja1021> desc v$indexed_fixed_column
           Name                            Null?    Type
           ------------------------------- -------- ----------------------------
    1      TABLE_NAME                               VARCHAR2(30)
    2      INDEX_NUMBER                             NUMBER
    3      COLUMN_NAME                              VARCHAR2(30)
    4      COLUMN_POSITION                          NUMBER

For instance, the X$KGLOB table, which represents the list of LCOs (Library Cache Objects), has the following two indexes.

UKJA@ukja1021> exec print_table('select * from v$indexed_fixed_column -
> 		  where table_name = ''X$KGLOB''');
TABLE_NAME                    : X$KGLOB
INDEX_NUMBER                  : 1
COLUMN_NAME                   : KGLNAHSH
COLUMN_POSITION               : 0
-----------------
TABLE_NAME                    : X$KGLOB
INDEX_NUMBER                  : 2
COLUMN_NAME                   : KGLOBT03
COLUMN_POSITION               : 0
-----------------

We have indexes on the SQL hash value (KGLNAHSH) and SQL ID (KGLOBT03) columns. This means that you should always try to query the X$KGLOB table through these two columns.

Here I make a simple test case to demonstrate the usefulness of the indexes on the fixed table.

1. Execute a simple SQL statement and extract SQL ID.

UKJA@ukja1021> select * from t1;

no rows selected

Elapsed: 00:00:00.00
UKJA@ukja1021> col sql_id new_value sql_id
UKJA@ukja1021> select regexp_replace(plan_table_output,
  2  				     'SQL_ID[[:blank:]]+([[:alnum:]]+),.*', '\1') as sql_id
  3  from table(dbms_xplan.display_cursor)
  4  where plan_table_output like 'SQL_ID%';

SQL_ID
--------------------------------------------------------------------------------
27uhu2q2xuu7r

2. Execute two SQL statements which query the X$KGLOB table through the KGLNAOBJ (without index) and KGLOBT03 (with index) columns respectively.

UKJA@ukja1021> select count(*) from sys.xm$kglob
  2  where kglnaobj = 'select * from t1';

  COUNT(*)
----------
         2

Elapsed: 00:00:00.06
UKJA@ukja1021> 
UKJA@ukja1021> select count(*) from sys.xm$kglob
  2  where kglobt03 = '&sql_id';
old   2: where kglobt03 = '&sql_id'
new   2: where kglobt03 = '27uhu2q2xuu7r'

  COUNT(*)
----------
         2

Elapsed: 00:00:00.01

3. TKPROF results for both SQL statements.

select count(*) from sys.xm$kglob
where kglnaobj = 'select * from t1'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.06       0.05          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.06       0.05          0          0          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=0 pr=0 pw=0 time=59402 us)
      2   FIXED TABLE FULL X$KGLOB (cr=0 pr=0 pw=0 time=43810 us)

select count(*) from sys.xm$kglob
where kglobt03 = '27uhu2q2xuu7r'

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      0.00       0.00          0          0          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      0.00       0.00          0          0          0           1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=0 pr=0 pw=0 time=167 us)
      2   FIXED TABLE FIXED INDEX X$KGLOB (ind:2) (cr=0 pr=0 pw=0 time=120 us)

Interpretation:

  • Logical reads are 0 in both cases. A fixed table is read by direct access to memory, not by scanning database blocks, so logical reads have no meaning here.
  • The execution plan clearly shows that the first query follows a full scan (FIXED TABLE FULL) and the second one follows an index scan (FIXED TABLE FIXED INDEX (ind:2)).
  • This is what makes the difference between 0.05s and effectively 0s of elapsed time.

Keep in mind that the traditional approach of comparing logical reads is useless in this test case.

4. My personal monitoring script shows the following additional information.

@mon_on userenv('sid')

select count(*) from sys.xm$kglob
where kglnaobj = 'select * from t1';

@mon_off

select count(*) from sys.xm$kglob
where kglobt03 = '&sql_id';

@mon_off2
@mon_show2
...
02. time model

STAT_NAME                                   VALUE1         VALUE2           DIFF
----------------------------------- -------------- -------------- --------------
DB time                                    166,458        115,196        -51,262
sql execute elapsed time                   164,356        113,308        -51,048
DB CPU                                     134,972        103,749        -31,223
...

03. latch

LATCH_NAME                         D_GETS   D_MISSES   D_SLEEPS  D_IM_GETS
------------------------------ ---------- ---------- ---------- ----------
library cache                       -4186          0          0          0
row cache objects                     -45          0          0          0
enqueues                              -19          0          0          0
enqueue hash chains                   -18          0          0          0
...

The biggest difference lies in the library cache latch acquisition count. A full scan on X$KGLOB needs far more library cache latch acquisitions than an index scan. More scanning of the memory means more CPU consumption, more latch acquisition and potential concurrency problems.

Carelessly written monitoring scripts are likely to cause unwanted side effects, especially when written without knowledge of these facts. Even many commercial monitoring tools make these kinds of mistakes.
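
The same principle shows up in everyday monitoring queries against V$SQL, which sits on top of the library cache fixed tables. A hedged illustration (not taken from any specific tool):

-- likely to scan the whole library cache: no indexed fixed column in the predicate
select sql_id, executions from v$sql
where sql_text like 'select * from t1%';

-- can typically use the fixed index on the underlying SQL ID column
select sql_id, executions from v$sql
where sql_id = '27uhu2q2xuu7r';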

The same principles can be applied to dictionary views; a related post reports an interesting investigation into the performance of querying dictionary views.

Written by Dion Cho

September 29, 2009 at 8:09 am

Posted in Misc.


Interesting case of ORA-01792 error.

with one comment

One of my customers had a very long but not complex SQL statement, generated dynamically, and hit the following ORA error.

ORA-01792: maximum number of columns in a table or view is 1000

You have very little chance of meeting this error in real life. The definition of the error code is as follows.

01792
 "maximum number of columns in a table or view is 1000"
// *Cause: An attempt was made to create a table or view with more than 1000
//         columns, or to add more columns to a table or view which pushes
//         it over the maximum allowable limit of 1000. Note that unused
//         columns in the table are counted toward the 1000 column limit.
// *Action: If the error is a result of a CREATE command, then reduce the
//         number of columns in the command and resubmit. If the error is
//         a result of an ALTER TABLE command, then there are two options:
//         1) If the table contained unused columns, remove them by executing
//            ALTER TABLE DROP UNUSED COLUMNS before adding new columns;
//         2) Reduce the number of columns in the command and resubmit.

But the customer did not have any table with more than 1000 columns. Besides, the SQL statement was just a plain query, not any type of DDL.

A close look at the definition tells you that a view is also subject to the 1000-column limit. So I requested the source SQL statement, and it had the following pattern.

select * from (
   select b, 
       decode(a, 1, 1) as c1,
       decode(a, 2, 2) as c2,
       ...
       decode(a, 1000, 1000) as c1000
   from t1
) 

  • The SQL statement declares an unnecessary inline view, which seems to have been generated automatically.
  • The inline view declares just over 1000 derived columns using DECODE.

So it seems that Oracle cannot handle more than 1000 columns inside an inline view. Let me prove it with the following simple test case.

ORA-01792 error with the inline view containing 1001 columns.

UKJA@ukja1021> declare
  2  	     v_cursor	     sys_refcursor;
  3  	     v_sql		     varchar2(32767);
  4  begin
  5  	     v_sql := 'select * from (select ';
  6  	     for idx in 1 .. 1000 loop
  7  		     v_sql := v_sql || rpad('1',10, '1')  || ' as c'|| idx || ', ';
  8  	     end loop;
  9  	     v_sql := v_sql || '1  from dual)';
 10  
 11  	     open v_cursor for v_sql;
 12  	     close v_cursor;
 13  end;
 14  /
declare
*
ERROR at line 1:
ORA-01792: maximum number of columns in a table or view is 1000
ORA-06512: at line 11

Without the inline view, there is no ORA-01792 error.

Elapsed: 00:00:00.06
UKJA@ukja1021> 
UKJA@ukja1021> declare
  2  	     v_cursor	     sys_refcursor;
  3  	     v_sql		     varchar2(32767);
  4  begin
  5  	     v_sql := 'select ';
  6  	     for idx in 1 .. 1000 loop
  7  		     v_sql := v_sql || rpad('1',10, '1')  || ' as c'|| idx || ', ';
  8  	     end loop;
  9  	     v_sql := v_sql || '1  from dual';
 10  
 11  	     open v_cursor for v_sql;
 12  	     close v_cursor;
 13  end;
 14  /

PL/SQL procedure successfully completed.

One thing to keep in mind is that Oracle enforces the 1000-column limit while checking syntax; removing the view through view merging has no effect on this limit.

This customer was lucky enough to be free of the ORA-01792 error simply by removing the unnecessary inline view. In a more complex situation, the workaround would not have been that easy.

Written by Dion Cho

September 10, 2009 at 4:25 am

Posted in Troubleshooting

