Dion Cho – Oracle Performance Storyteller

We are natural born scientists

Troubleshooting PGA leak with PGA heap dump

with 12 comments

Jonathan Lewis shared interesting and practical experience on PGA memory leak here. Here he mentions on the memory leaks by forall batch insert bug.

This is actually a very interesting case where troubleshooting using PGA heap dump seems to be one of the feasible ways to analyze the problem(as Tanel responded to Jonathan’s request). I would just like to show and discuss how I would investigate this problem and find the solution using PGA heap dump.

This is the simple simulation made by Jonathan Lewis which makes memory leak on PGA.

define m_string_length = 20

drop table t1 purge;
create table t1(v1 varchar2( &m_string_length ));

create or replace procedure proc1 (
	i_rowcount	in number	default 1000000,
	i_bulk_pause	in number	default 0,
	i_forall_pause	in number	default 0,
	i_free_pause	in number	default 0
)
as
	type w_type is table of varchar2( &m_string_length );
	w_list 		w_type := w_type();
	w_free		w_type := w_type();
begin
	for i in 1..i_rowcount loop
		w_list.extend;
		w_list(i) := rpad('x', &m_string_length );
	end loop;

	dbms_lock.sleep(i_bulk_pause);

	forall i in 1..w_list.count
	  insert into t1 values(w_list(i));

	dbms_lock.sleep(i_forall_pause);
	commit;
	w_list := w_free;
	dbms_session.free_unused_user_memory;

	dbms_lock.sleep(i_free_pause);
end;
/



Let me monitor the session memory usage while making multiple calls on this procedure.

UKJA@ukja102> exec proc1;

PL/SQL procedure successfully completed.

SYS@ukja10> select
  2     name, value
  3  from
  4     v$sesstat ss,
  5     v$statname sn
  6  where
  7     sn.name like '%ga memory%'
  8  and        ss.statistic# = sn.statistic#
  9  and        ss.sid = 149
 10  ;

NAME                                      VALUE
------------------------------ ----------------
session uga memory                  498,942,840
session uga memory max              500,053,908
session pga memory                  599,907,924
session pga memory max              675,602,004

UKJA@ukja102> exec proc1;

PL/SQL procedure successfully completed.

NAME                                      VALUE
------------------------------ ----------------
session uga memory                  695,429,288
session uga memory max              695,429,288
session pga memory                  904,191,572
session pga memory max              904,257,108

UKJA@ukja102> exec proc1;

PL/SQL procedure successfully completed.

NAME                                      VALUE
------------------------------ ----------------
session uga memory                  840,565,088
session uga memory max              840,565,088
session pga memory                1,077,861,972
session pga memory max            1,077,861,972



Very serious memory leak! Isn’t it? 1) Oracle does not release memory even after the call has completed and 2) Oracle does not 100% reuse the memory.

The problem is that it’s almost impossible to identify on which objects we’re losing memory. As far as I know, Oracle does not expose the PGA memory structure by v$ views nor x$views. (Jonathan Lewis pointed me to the X$KSMPP view)

So what else can we do? PGA heap dump is the only thing left to us.

oradebug setospid 6760
oradebug dump heapdump 1



The heap dump file is quite easy to interpret, but it is sometimes too long to interpret with our eyes. So I built heap_analyze.sql to retrieve summarized report from raw dump file. Here is the result for the PGA heap dump.

UKJA@ukja102> @heap_analyze heap_dump_1.trc
UKJA@ukja102> set echo off

     TSIZE
----------
 598488868

Elapsed: 00:00:00.57

ATYPE                         CSIZE  HEAP_SIZE  RATIO
------------------------ ---------- ---------- ------
freeable                  498650144  598488868  83.32
perm                         181928  598488868    .03
recreate                      87180  598488868    .01
free                       99569616  598488868  16.64

Elapsed: 00:00:00.75

CTYPE                               CSIZE  HEAP_SIZE  RATIO
------------------------------ ---------- ---------- ------
Alloc environm                       4144  598488868    .00
Fixed Uga                           20572  598488868    .00
KFIO PGA struct                        72  598488868    .00
KFK PGA                               260  598488868    .00
KJZT context                           60  598488868    .00
KSFQ heap                            3928  598488868    .00
KSFQ heap descr                        92  598488868    .00
PLS cca hp desc                       212  598488868    .00
PLS non-lib hp                      18560  598488868    .00
callheap                             2144  598488868    .00
external name                          24  598488868    .00
joxp heap                            2000  598488868    .00
kews sqlstat st                      1292  598488868    .00
kgh stack                           17012  598488868    .00
kjztprq struct                       2068  598488868    .00
koh-kghu call h                      1328  598488868    .00
kopolal dvoid                        2524  598488868    .00
kpuinit env han                      1584  598488868    .00
kzsna:login nam                        24  598488868    .00
ldm context                         12712  598488868    .00
perm                               181928  598488868    .03
qmtmInit                            13980  598488868    .00
session heap                    498632732  598488868  83.32
free                             99569616  598488868  16.64

24 rows selected.

Elapsed: 00:00:00.65

DS                        CSIZE  HEAP_SIZE  RATIO
-------------------- ---------- ---------- ------
083BD9CC                  10320  598488868    .00
08563470                  12436  598488868    .00
085A7600              498567256  598488868  83.30
free                   99898856  598488868  16.69

Elapsed: 00:00:00.68



Hm… nothing special except the fact that the subheap(085A7600) occupies 83.3% of the PGA heap. So I do subheap dump for that heap address.

oradebug dump heapdump_addr 1 0xa067600

UKJA@ukja102> @heap_analyze heap_subdump_1.trc
UKJA@ukja102> set echo off

     TSIZE
----------
 498561776

Elapsed: 00:00:00.40

ATYPE                         CSIZE  HEAP_SIZE  RATIO
------------------------ ---------- ---------- ------
freeable                  488817960  498561776  98.05
perm                          54896  498561776    .01
recreate                      14852  498561776    .00
free                        9674068  498561776   1.94

Elapsed: 00:00:00.56

CTYPE                               CSIZE  HEAP_SIZE  RATIO
------------------------------ ---------- ---------- ------
PLS cca hp desc                       400  498561776    .00
PLS non-lib hp                  488740608  498561776  98.03
Session NCHAR l                       552  498561776    .00
kgict                                  40  498561776    .00
kgicttab                               44  498561776    .00
kgicu                                  92  498561776    .00
kgiob                                1928  498561776    .00
kgiobdtb                              192  498561776    .00
kgsc ht segs                         5720  498561776    .00
koddcal dvoid                          24  498561776    .00
kodmcon kodmc                          64  498561776    .00
kodpai image                         1036  498561776    .00
koh-kghu sessi                      15888  498561776    .00
koh-kghu sessio                     14252  498561776    .00
kokahin kgglk                         140  498561776    .00
kokl lob id has                      1036  498561776    .00
kokl transactio                       268  498561776    .00
koklug hlctx in                        24  498561776    .00
koklug hxctx in                        24  498561776    .00
kpuinit env han                      1584  498561776    .00
kqlpWrntoStr:st                       112  498561776    .00
ksulu : ksulueo                        40  498561776    .00
kwqaalag                               92  498561776    .00
kwqb context me                        92  498561776    .00
kwqica hash tab                        92  498561776    .00
kwqidwh memory                         92  498561776    .00
kwqmahal                               92  498561776    .00
kxs-heap-d                           1036  498561776    .00
kxs-heap-p                           4148  498561776    .00
kxs-krole                             780  498561776    .00
kxsFrame4kPage                      28840  498561776    .01
kxsc: kkspbd0                         968  498561776    .00
kxsc: kkspsc0                        7756  498561776    .00
kzctxhugi1                           4108  498561776    .00
kzsrcrdi                               60  498561776    .00
perm                                54896  498561776    .01
session languag                       552  498561776    .00
system trigger                         36  498561776    .00
free                              9674068  498561776   1.94

39 rows selected.

Elapsed: 00:00:00.51

DS                        CSIZE  HEAP_SIZE  RATIO
-------------------- ---------- ---------- ------
08B3608C                   2076  498561776    .00
08B3EAAC              488738512  498561776  98.03
08B578F4                   2080  498561776    .00
08B5A114                   2080  498561776    .00
0FD400B0                   2080  498561776    .00
free                    9814948  498561776   1.97

6 rows selected.

Elapsed: 00:00:00.54



Well, again, another subheap(08B3EAAC) does occupy 98% of the space.

oradebug dump heapdump_addr 1 0x0A08EAAC

UKJA@ukja102> @heap_analyze heap_subdump_2.trc
UKJA@ukja102> set echo off

     TSIZE
----------
 488662552

Elapsed: 00:00:02.50

ATYPE                         CSIZE  HEAP_SIZE  RATIO
------------------------ ---------- ---------- ------
freeable                    4750540  488662552    .97
perm                            144  488662552    .00
free                      483911868  488662552  99.03

Elapsed: 00:00:03.37

CTYPE                               CSIZE  HEAP_SIZE  RATIO
------------------------------ ---------- ---------- ------
DPAGE                             4750096  488662552    .97
peihstdep                             260  488662552    .00
perm                                  144  488662552    .00
pl_iot_alloc                           92  488662552    .00
pl_lut_alloc                           92  488662552    .00
free                            483911868  488662552  99.03

6 rows selected.

Elapsed: 00:00:03.20

no rows selected

Elapsed: 00:00:02.42



Okay, now I see the problem. The deepest subheap occupies 466M(488662552) but, the 99% is FREE! Oracle does not release the free space even after the call has completed and even denies to reuse it 100%.

This certainly looks like a bug and I search metalink using these keywords.

metalink search

And this links me to the bug 5866410 which is described as the free chunks problem with FORALL insertion.

Bulk insert in PLSQL can consume a large amount of PGA memory which can lead to ORA-4030 errors.

A heapdump will show lot of free memory in the free lists which is not used but instead fresh allocations are made.

Well, above process might seem natural and easy for who read this post, but following things make the process somewhat difficult to follow

  • The heap dump file is sometimes quite long and hard to analyze without proper analysis tool.
  • Heap has hierarchical structure as shown above. It sometimes is composed of multiple subheaps and subheap itself could be composed of multiple subheapds.
  • The object names are sometimes hard to decode.

But I believe that above test case proves the usefulness of the PGA heap dump analysis and it would turn out to be very successful under certain circumstances.

Written by Dion Cho

June 9, 2009 at 5:35 am

12 Responses

Subscribe to comments with RSS.

  1. Dion,

    Nice presentation.

    The problem is that it’s almost impossible to identify on which objects we’re losing memory. As far as I know, Oracle does not expose the PGA memory structure by v$ views nor x$views.

    If you want to play with the pga heap from SQL, you might want to take a look at x$ksmpp.

    Jonathan Lewis

    June 9, 2009 at 6:43 am

  2. A couple of other items you might want to look at:

    As far as I can tell, x$ksmup is the session heap; x$ksmsp is the sga heap (and x$ksmspr is the “shared pool reserved).

    x$ksmhp exposes the subheaps of the sga heap, but you can only address it by ksmchds, and x$ksmmem is the complete memory array. But x$ksmhp and x$ksmmem are only for the sga heap – I haven’t found their equivalents for the pga and session heaps.

    Jonathan Lewis

    June 9, 2009 at 7:20 am

  3. @Jonathan.

    Thank you for the infos.

    I would look into those views for further investigation.

    Dion Cho

    June 9, 2009 at 7:23 am

  4. The script heap_analyse must be refined. Spacing in trace file is platform dependent. On my Linux x86_64 the following line “select to_number(substr(column_value, 21, 9)) as csize,” return empty string and the to_number make it error. After investigation the heap size value in trace file was at 30 not 21.

    I suggest using regexp_replace to extract values
    ” select to_number(regexp_replace(column_value,’.*sz=[[:space:]]+([[:digit:]]+)[[:space:]].*’,’\1′)) as csize”

    tested it and it worked.

    B. Polarski

    June 9, 2009 at 12:16 pm

  5. humm… and the rest must also be transformed:

    select
    to_number(regexp_replace(column_value,’.*sz=[[:space:]]+([[:digit:]]+)[[:space:]].*’,’\1′)) csize,
    rtrim(regexp_replace(column_value,’.*sz=[[:space:]]+[[:digit:]]+[[:space:]]+([[:alnum:]]+)[[:space:]].*’,’\1′)) atype,
    rtrim(regexp_replace(column_value,’.*[“]+(.*)[“].*’,’\1′)) ctype,
    rtrim(regexp_replace(column_value,’.*[“]+.*[“](.*)’,’\1′)) ds
    from table(get_trace_file2(‘&1’))

    B. Polarski

    June 9, 2009 at 12:53 pm

  6. @Polarski

    Thanks for the work.

    That’s why I left the comment “this works only for local database”. It was just a dirty sample and you made it better!

    Dion Cho

    June 10, 2009 at 2:41 am

  7. […] 1-How to deal with memory leaks by using memory dumps. (PGA leak bug) Jonathan Lewis – PGA Leaks Dion Cho – Troubleshooting PGA leak with PGA heap dump […]

  8. Hello Don,

    With impdp we get
    ORA-04030: out of process memory when trying to allocate 2080 bytes (kxs-heap-c,kghsseg: qcstxsInit)
    As far as we know no real memory shortage.

    Could impdp use array bulk inserts and hit the same bug?

    Regards Hans-Peter

    Regards Hans-Peter

    Hans-Peter Sloot

    August 20, 2009 at 9:48 am

    • Hans.

      1. Have you checked your OS configuration?
      2. And tried analyzing PGA heap dump?
      3. And searched metalink for that?

      Dion Cho

      August 21, 2009 at 11:52 pm

      • Hi Dion,

        Sorry I should have added.
        The ulimit where checked again and again.
        The impdp shows various ORA-04030. Until now we have generated at least 4 different.
        I have no direct access to the instance.
        The dba’s managed to get a heap dump for (sort subheap,sort key) generated while impdp is creating indexes. The heapdump shows that the sort subheap has only 6k free whereas 64k is requested. The total pga is 358M with 6M free in various subheaps.
        Metalink is not of much help.
        Oracle Support isn’t either alas.
        The just tell us to exclude indexes + statistics + procedure etc.

        Regards Hans-Peter

        Hans-Peter Sloot

        August 24, 2009 at 8:32 am

        • @Hans.

          Thansk for the adding.

          Without proper data(including PGA heap dump + corresponding subheap dump) when 4030 took place, there would be not much thing to tell for me.

          You might like to post the question on the OTN forum and/or freelist.

          Regards.

          Dion Cho

          August 24, 2009 at 9:33 am

  9. Dion,

    This is really good stuff, and help me immensely this morning while troubleshooting what looked like a leak in a sessions PGA. Thanks for taking the time to post it.

    Thanks,

    Steve

    Steve Howard

    March 8, 2010 at 3:01 pm


Leave a comment