Batching NLJ optimization and ordering
You might be aware of the batching nested loops join(batching NLJ) optimization introduced at Oracle 11g. For who are not aware of this new feature, let me show you two different execution plans(Oracle 10g vs. 11g) of the same SQL.
-- Oracle 10g ------------------------------------------------ | Id | Operation | Name | ------------------------------------------------ | 0 | SELECT STATEMENT | | |* 1 | COUNT STOPKEY | | | 2 | TABLE ACCESS BY INDEX ROWID | T2 | <-- Here | 3 | NESTED LOOPS | | | 4 | TABLE ACCESS BY INDEX ROWID| T1 | |* 5 | INDEX RANGE SCAN | T1_N1 | |* 6 | INDEX RANGE SCAN | T2_N1 | ------------------------------------------------ -- Oracle 11g ------------------------------------------------ | Id | Operation | Name | ------------------------------------------------ | 0 | SELECT STATEMENT | | |* 1 | COUNT STOPKEY | | | 2 | NESTED LOOPS | | | 3 | NESTED LOOPS | | | 4 | TABLE ACCESS BY INDEX ROWID| T1 | |* 5 | INDEX RANGE SCAN | T1_N1 | |* 6 | INDEX RANGE SCAN | T2_N1 | | 7 | TABLE ACCESS BY INDEX ROWID | T2 | <-- And here ------------------------------------------------
You might note that Oracle 10g and 11g have very different location of the table lookup operation on the table T2. This is a symptom of the nested loops join optimization of Oracle 11g. By using batching NLJ, you might see a decreased logical reads and improved performance.
So far, so good, but a couple of days ago, I hit a somewhat strange ordering problem with the batching NLJ. Following is a replayable test case. Just note that the query is using index(t1_n1) to implement the pagination without the overhead of ORDER BY.
create table t1 as select 1 as c1, mod(level, 4) as c2, level as c3, level as c4, rpad('x',1000) as dummy from dual connect by level <= 1000; create table t2 as select 1001-level as c1, level as c2, rpad('x',1000) as dummy from dual connect by level <= 100; create index t1_n1 on t1(c1, c2, c3); create index t2_n1 on t2(c1); exec dbms_stats.gather_table_stats(user, 't1'); exec dbms_stats.gather_table_stats(user, 't2'); explain plan for select /*+ leading(t1 t2) use_nl(t2) index_asc(t1) index_asc(t2) */ rownum as rnum, t2.c1, t1.c4, t2.c2 from t1, t2 where t1.c3 = t2.c1 and t1.c1 = 1 and t1.c2 = 0 and rownum <= 20 ; select * from table(dbms_xplan.display); -- Read from the disk alter system flush buffer_cache; select * from ( select /*+ leading(t1 t2) use_nl(t2) index_asc(t1) index_asc(t2) */ rownum as rnum, t2.c1, t1.c4, t2.c2 from t1, t2 where t1.c3 = t2.c1 and t1.c1 = 1 and t1.c2 = 0 and rownum <= 20 ) where rnum >= 15 ; -- Read from the buffer cache select * from ( select /*+ leading(t1 t2) use_nl(t2) index_asc(t1) index_asc(t2) */ rownum as rnum, t2.c1, t1.c4, t2.c2 from t1, t2 where t1.c3 = t2.c1 and t1.c1 = 1 and t1.c2 = 0 and rownum <= 20 ) where rnum >= 15 ; -- Disable exceptions for buffer cache misses alter session set "_nlj_batching_misses_enabled" = 0; -- Read from the disk alter system flush buffer_cache; select * from ( select /*+ leading(t1 t2) use_nl(t2) index_asc(t1) index_asc(t2) */ rownum as rnum, t2.c1, t1.c4, t2.c2 from t1, t2 where t1.c3 = t2.c1 and t1.c1 = 1 and t1.c2 = 0 and rownum <= 20 ) where rnum >= 15 ; -- Read from the buffer cache select * from ( select /*+ leading(t1 t2) use_nl(t2) index_asc(t1) index_asc(t2) */ rownum as rnum, t2.c1, t1.c4, t2.c2 from t1, t2 where t1.c3 = t2.c1 and t1.c1 = 1 and t1.c2 = 0 and rownum <= 20 ) where rnum >= 15 ;
To save your time, let me show you the result of the above test case with some comments appended.
-- Case1 : batching NLJ enabled -- when the query reads from the disk RNUM C1 C4 C2 ---------- ---------- ---------- ---------- 15 960 960 41 16 964 964 37 17 980 980 21 <-- Why 980 here? 18 968 968 33 19 972 972 29 20 976 976 25 -- when the query reads from the buffer cache RNUM C1 C4 C2 ---------- ---------- ---------- ---------- 15 960 960 41 16 964 964 37 17 968 968 33 18 972 972 29 19 976 976 25 20 980 980 21 -- Case 2: batching NLJ disabled -- when the query reads from the disk RNUM C1 C4 C2 ---------- ---------- ---------- ---------- 15 960 960 41 16 964 964 37 17 968 968 33 18 972 972 29 19 976 976 25 20 980 980 21 -- when the query reads from the buffer cache RNUM C1 C4 C2 ---------- ---------- ---------- ---------- 15 960 960 41 16 964 964 37 17 968 968 33 18 972 972 29 19 976 976 25 20 980 980 21
The simple explanation is :
“The new optimization code of the nested loops join does not guarantee that the rows would be returned as an order of the outer table, especially when it reads the data from the disk.”
This could be a limitation when you want to get the ordered rows using index and NLJ in the pagination query, but I don’t believe that this could be classified as a bug. The only way to ensure the order of the returning rows is always to use ORDER BY clause.
But if you still want to use INDEX not ORDER BY in the above query(which was a natural selection in Oracle 10g), the workarounds is one of the followings.
- Set parameter – _nlj_batching_misses_enabled = 0;
- Set parameter – _nlj_batching_enabled = 0;
- Append hint – NO_NLJ_BATCHING(t2)
It seems that I need to do further research on the batching NLJ – it’s exact mechanism and the pitfalls.
Hi Dion,
Which version are you using ?
I cannot see the behavior on 11.2.0.1 on Linux
coskan
August 16, 2010 at 8:15 am
Hi Coskan.
I can reproduce it on my 11.1.0.6 and 11.2.0.1 database on Windows(32bit).
Could you change the statement like following? (broaden the range of the returning rows)
Dion Cho
August 17, 2010 at 12:42 am
Hi Dion,
Could you elaborate on the mechanics of the NLJ-batching?
How is the improvement reached?
regards Hans-Peter
Hans-Peter
August 17, 2010 at 9:57 am
Hi, Hans.
For now, I don’t have a clear picture on NLJ batching and prefetching. I would do deeper research sometime myself, but still expect that some gurus would unveil it in the near future.
Dion Cho
August 18, 2010 at 6:53 am
Most of optimizer feature in new version may cause wrong result,Oh…h, Why they aren’t disable by default….
maclean
September 9, 2010 at 2:56 pm