Dion Cho – Oracle Performance Storyteller

We are natural born scientists

Archive for the ‘Transaction’ Category

Detecting serializable transaction


One of our customers had this problem.

We hit the ORA-08177 (can't serialize access for this transaction) error for no apparent reason. As far as we know, we don't use serializable transactions. What should we do now?

My recommendation was that they check for themselves whether they actually had any serializable transactions.

Metalink note #252545.1 provides a useful technique for identifying serializable transactions. Here is a very simple test case to demonstrate it.

1. Create a serializable transaction in session #1

UKJA@ukja102> set transaction isolation level serializable;

Transaction set.

UKJA@ukja102> update t1 set c1 = c1 + 1 where rownum = 1;

1 row updated.




2. From session #2, check which session is running a serializable transaction

UKJA@ukja102> select s.sid, s.serial#,
  2    substr(q.sql_text, 1, 100) as sql_text
  3  from
  4    v$session s, v$sql q
  5  where
  6    s.taddr = (select addr from v$transaction
  7        where bitand(flag,268435456) = 268435456)
  8    and ((s.prev_hash_value = q.hash_value and s.prev_sql_addr = q.address)
  9         or (s.sql_hash_value = q.hash_value and s.sql_address = q.address))
 10  ;

       SID    SERIAL#
---------- ----------
SQL_TEXT
--------------------------------------------------------------------------------
       149        829
update t1 set c1 = c1 + 1 where rownum = 1




Cool, isn’t it?
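For reference, the magic number 268435456 in the query above is 0x10000000, the bit in V$TRANSACTION.FLAG that (per the Metalink note) marks a serializable transaction. Assuming that flag bit, a session can also check its own current transaction directly; this is just a sketch of the same idea:

```sql
-- Self-check sketch: run inside the session under suspicion.
-- Returns a row only while the current transaction is serializable
-- (bit 0x10000000 = 268435456 of V$TRANSACTION.FLAG is set).
select t.xidusn, t.xidslot, t.xidsqn
from   v$transaction t, v$session s
where  t.ses_addr = s.saddr
and    s.sid = sys_context('userenv', 'sid')
and    bitand(t.flag, 268435456) = 268435456;
```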

=========================================================

One tiny complaint (not significant here) is the potential performance problem of the above query. See the execution plan.

UKJA@ukja102> @plan
--------------------------------------------------------------
| Id  | Operation                | Name              | Rows  |
--------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     1 |
|   1 |  NESTED LOOPS            |                   |     1 |
|   2 |   NESTED LOOPS           |                   |     1 |
|*  3 |    FIXED TABLE FULL      | X$KGLCURSOR_CHILD |     1 |
|*  4 |    FIXED TABLE FULL      | X$KSUSE           |     1 |
|*  5 |     FIXED TABLE FULL     | X$KTCXB           |     1 |
|*  6 |   FIXED TABLE FIXED INDEX| X$KSLED (ind:2)   |     1 |
--------------------------------------------------------------





It full-scans the SQL child cursors, which means performance might not be good when you have tons of child cursors. X$KGLCURSOR_CHILD (which is actually based on X$KGLOB) has the following indexes.

UKJA@ukja102> select * from v$indexed_fixed_column
  2  where table_name = 'X$KGLOB';

TABLE_NAME           INDEX_NUMBER COLUMN_NAME          COLUMN_POSITION
-------------------- ------------ -------------------- ---------------
X$KGLOB                         1 KGLNAHSH                           0
X$KGLOB                         2 KGLOBT03                           0





So I would rewrite the query using UNION ALL, so that each branch can use the fixed index.

UKJA@ukja102> select s.sid, s.serial#,
  2    substr(q.sql_text, 1, 100) as sql_text
  3  from
  4    v$session s, v$sql q
  5  where
  6    s.taddr = (select addr from v$transaction
  7        where bitand(flag,268435456) = 268435456)
  8    and (s.prev_hash_value = q.hash_value and s.prev_sql_addr = q.address)
  9  union all
 10   select s.sid, s.serial#,
 11    substr(q.sql_text, 1, 100) as sql_text
 12  from
 13    v$session s, v$sql q
 14  where
 15    s.taddr = (select addr from v$transaction
 16        where bitand(flag,268435456) = 268435456)
 17    and (s.sql_hash_value = q.hash_value and s.sql_address = q.address)
 18  ;

------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |     2 |
|   1 |  UNION-ALL                 |                           |       |
|   2 |   NESTED LOOPS             |                           |     1 |
|   3 |    NESTED LOOPS            |                           |     1 |
|*  4 |     FIXED TABLE FULL       | X$KSUSE                   |     1 |
|*  5 |      FIXED TABLE FULL      | X$KTCXB                   |     1 |
|*  6 |     FIXED TABLE FIXED INDEX| X$KSLED (ind:2)           |     1 |
|*  7 |    FIXED TABLE FIXED INDEX | X$KGLCURSOR_CHILD (ind:1) |     1 |
|   8 |   NESTED LOOPS             |                           |     1 |
|   9 |    NESTED LOOPS            |                           |     1 |
|* 10 |     FIXED TABLE FULL       | X$KSUSE                   |     1 |
|* 11 |      FIXED TABLE FULL      | X$KTCXB                   |     1 |
|* 12 |     FIXED TABLE FIXED INDEX| X$KSLED (ind:2)           |     1 |
|* 13 |    FIXED TABLE FIXED INDEX | X$KGLCURSOR_CHILD (ind:1) |     1 |
------------------------------------------------------------------------





As I mentioned here, general query tuning techniques should always be applied to queries on v$ or x$ views as well.

Written by Dion Cho

April 10, 2009 at 8:18 am

Rethinking on ORA-01555 error


You might think there is nothing left to say about the ORA-01555 error. It is just too well-known an issue, and tons of documents and help are already available.

But my test cases might make you rethink this issue.

1. Case A – Undo block overwritten

I made a very simple test case of ORA-01555 caused by undo blocks being overwritten.

select * from t1 where fget1 = 1
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 24 with name "_SYSSMU24$" too small



The tkprof report is as follows.

select * 
from
 t1 where fget1 = 1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00      10.32          0          0          0           0  
    <-- Ignore this. Due to my trick for quick 1555 error
Fetch        1      0.03       1.25        569       1000          0           0   
    <-- Look! 1000 logical reads for 0 row!
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.03      11.58        569       1000          0           0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  FILTER  (cr=0 pr=0 pw=0 time=10326555 us)  
      0   TABLE ACCESS FULL T1 (cr=0 pr=0 pw=0 time=27 us) 
      <-- But no actual reads on the table t1



The row source statistics show no actual reads on the table t1, but the tkprof report says I read 1,000 blocks! Where does this number come from?

NAME                                             DIFF       VALUE1       VALUE2
---------------------------------------- ------------ ------------ ------------
...
session logical reads                           1,079           72        1,151
consistent changes                              1,049            1        1,050
consistent gets from cache                      1,013           71        1,084
consistent gets                                 1,013           71        1,084
consistent gets - examination                     997           11        1,008
data blocks consistent reads - undo reco          995            0          995
rds applied
...



Yes! I had 995 consistent reads to apply undo records! This means Oracle read roughly 1,000 undo blocks purely for read consistency. Very impressive but disappointing. :)
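For the curious, a typical way to provoke this flavor of ORA-01555 (not necessarily the exact trick used above) is a fetch-across-commit pattern against a deliberately small undo tablespace: open a cursor, churn the undo with many small committed transactions, and only then fetch. A sketch in SQL*Plus terms:

```sql
-- Sketch only: assumes a tiny undo tablespace, so the cursor's
-- required undo blocks are recycled before the delayed fetch.
variable rc refcursor

begin
  open :rc for select * from t1 where fget1 = 1;  -- consistent as of now
end;
/

-- Meanwhile, in another session: overwrite the undo with many
-- small committed transactions.
begin
  for i in 1 .. 100000 loop
    update t1 set c1 = c1 where rownum = 1;
    commit;
  end loop;
end;
/

print rc   -- the delayed fetch now hits ORA-01555
```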

2. Case B – Transaction table overwritten

I made another simple test case of ORA-01555, this time caused by the transaction table being overwritten.

select /*+ index(t1) */ * 
from
 t1 where fget1(c1) = 1 and c1 > 0
;

ERROR:
ORA-01555: snapshot too old: rollback segment number 44 with name "_SYSSMU44$" too small



The tkprof report is as follows.

select /*+ index(t1) */ * 
from
 t1 where fget1(c1) = 1 and c1 > 0


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.04       0.04          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       34      0.21      68.05        701      42802          0          66
<-- 42,802 reads for 66 rows!
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       36      0.26      68.10        701      42802          0          66

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 61  

Rows     Row Source Operation
-------  ---------------------------------------------------
     66  TABLE ACCESS BY INDEX ROWID T1 (cr=102 pr=15 pw=0 time=66111732 us)
     67   INDEX RANGE SCAN T1_N1 (cr=67 pr=14 pw=0 time=67090640 us)(object id 82550)
   <-- But only 102 reads on the table and index!






Again, I had only 102 actual reads on the table and index, but the tkprof report says I read 42,802 blocks! Where does this number come from?

NAME                                             DIFF       VALUE1       VALUE2
---------------------------------------- ------------ ------------ ------------
...
session logical reads                          43,043        4,720       47,763
consistent gets from cache                     43,019        2,116       45,135
consistent gets                                43,019        2,116       45,135
consistent gets - examination                  42,808          799       43,607
consistent changes                             42,749        1,730       44,479
transaction tables consistent reads - un       42,729            0       42,729
do records applied
...



Out of 42,802 reads, 42,729 were spent rolling back the transaction table!

That is not the end of the story. The following is the latch activity during the transaction table rollback.

LATCH_NAME                         D_GETS   D_MISSES   D_SLEEPS  D_IM_GETS
------------------------------ ---------- ---------- ---------- ----------
row cache objects                 9406163          0          0          0
cache buffers chains              2288055         94          2       2761
undo global data                  1298330         85          0          0
enqueue hash chains                920170          1          0          0
...



9.4M row cache objects latch acquisitions! For what? I profiled the row cache objects latch.

       SID LADDR    CALLED                  NAME                       HITS
---------- -------- -------------------------------------------- ----------
       115 2D0B4010 kqreqd                  row cache objects           123
       115 2D0B4010 kqreqd: reget           row cache objects           849
       115 2D0B4010 kqrpre: find obj        row cache objects          3895
       115 2DF245DC kqreqd                  row cache objects            13
       115 2DF245DC kqreqd: reget           row cache objects            24
       115 2DF245DC kqrpre: find obj        row cache objects            80
       115 2DFA4644 kqreqd                  row cache objects           132
       115 2DFA4644 kqreqd: reget           row cache objects           227
       115 2DFA4644 kqrpre: find obj        row cache objects           185



Hmm… Looks like Oracle is very busy finding some row cache objects. Which objects?

PARAMETER                            D_GETS
-------------------------------- ----------
dc_rollback_segments                2963371
dc_tablespaces                       158642
dc_segments                           19719
dc_users                               1081
...



Makes sense. Oracle needs to read the dictionary to look up rollback segment information in order to roll back the transaction tables.

The most traditional way to analyze ORA-01555 is to look into the V$UNDOSTAT view. But the test cases above show the importance and power of investigating with statistics, latch activity (including profiling), and even row cache activity.
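For completeness, the traditional V$UNDOSTAT check looks like this; the SSOLDERRCNT column counts ORA-01555 occurrences per 10-minute interval:

```sql
-- Classic first look at ORA-01555: when did the errors occur,
-- how much undo was generated, and how long was the longest query?
select begin_time, end_time, undoblks,
       maxquerylen, ssolderrcnt, nospaceerrcnt
from   v$undostat
where  ssolderrcnt > 0
order  by begin_time;
```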

More is better, isn’t it?

Written by Dion Cho

April 4, 2009 at 2:49 pm

Oracle’s commit is fast – What about SQL Server?


1. Oracle updates 10 rows and commit

2. Oracle updates 1,000,000,000 rows and commit

Which commit is faster?

The answer is "both of them are fast enough". This is due to the fast commit mechanism.

Fast commit means that Oracle does not clean out all the dirty blocks modified by the transaction at commit time. The cleanout is delayed: the dirty blocks are cleaned out when they are accessed at a later time (possibly by another session). This is called delayed block cleanout.

Delayed block cleanout has its own problems (like the notorious ORA-01555 error), but by virtue of it, Oracle's commit is always fast irrespective of the amount of modified data.
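Delayed block cleanout is easy to observe from the session statistics of a later reader: the first SELECT that visits the still-dirty blocks performs the cleanout itself and, surprisingly for a query, generates redo. A sketch (statistic names as they appear in V$STATNAME):

```sql
-- Session #1: modify many blocks and commit. With a large enough
-- update, commit-time cleanout covers only part of the blocks.
update t1 set c1 = c1 + 1;
commit;

-- Session #2 (a later reader): the first full scan performs the
-- deferred cleanout and generates redo.
select count(*) from t1;

select n.name, s.value
from   v$mystat s, v$statname n
where  s.statistic# = n.statistic#
and    n.name in ('redo size',
                  'cleanouts only - consistent read gets',
                  'immediate (CR) block cleanout applications');
```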

Now my concern is “okay, what about SQL Server?”

1. SQL Server updates 10 rows and commit.

2. SQL Server updates 1,000,000,000 rows and commit.

Which commit is faster?

My first thought was this: "Unlike Oracle, SQL Server manages row-level locks on a per-row basis (Oracle's row-level locking is per-transaction). So the more rows you update, the heavier your commit would be."

But think one more time!

Lock escalation.

With 1,000,000,000 rows updated, SQL Server would have already escalated the row-level locks to a table-level lock. This means it does not need to release all the row locks; only the table-level lock is released, and that's all.

I believe the lock escalation threshold is around 3,000–5,000 rows (but I'm not sure).
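On the SQL Server side this is easy to verify: count the locks a transaction holds after updating far more rows than the threshold. A T-SQL sketch (the table t1 is hypothetical):

```sql
-- T-SQL sketch: with well over ~5,000 row locks acquired, the engine
-- escalates to a single table-level (OBJECT) lock.
BEGIN TRAN;
UPDATE t1 SET c1 = c1 + 1;   -- touches far more rows than the threshold

SELECT resource_type, request_mode, COUNT(*) AS locks
FROM   sys.dm_tran_locks
WHERE  request_session_id = @@SPID
GROUP  BY resource_type, request_mode;
-- Expect one OBJECT/X lock rather than millions of KEY/RID locks.
COMMIT;
```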

Yes, both Oracle and SQL Server always have fast commit performance, but the mechanisms are quite opposite!

Written by Dion Cho

February 17, 2009 at 1:28 pm
