oracle中not exists对外层查询的影响详解
前言
最近同事发现了一个问题,在12c中跑的bufferget很高,但是在10g中跑的buffer很低。怀疑是不是12c的优化器有问题。
这个10g的环境和12c的环境,数据量大致一样,只是有很少部分的不同,但是就是这个很少部分不同,造成了notexists中的子查询返回不同的值,进而对外层查询产生不同的影响。
我们来用如下的代码模拟一下。
初始化数据:
--10g droptablet1; droptablet2; createtablet1(idnumber,namevarchar2(20),dep_idvarchar2(10)); createtablet2(idnumber,namevarchar2(20),dep_idvarchar2(10)); insertintot1selectrownum,'a','kk'fromdualconnectbylevel<=3000000; insertintot2selectrownum,'a','kk'fromdualconnectbylevel<=1000000; insertintot2selectrownum,'a','mm'fromdual; commit; --12c droptablet1; droptablet2; createtablet1(idnumber,namevarchar2(20),dep_idvarchar2(10)); createtablet2(idnumber,namevarchar2(20),dep_idvarchar2(10)); insertintot1selectrownum,'a','kk'fromdualconnectbylevel<=3000000; insertintot2selectrownum,'a','kk'fromdualconnectbylevel<=1000000; commit;
我们看到,12c的数据和10g只是有很少的差别,t1表12c和10g都一样,t2表在12c只是少了一行数据。
--10g SQL>selectdep_id,count(*)fromt1groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk3000000 SQL>selectdep_id,count(*)fromt2groupbydep_id; DEP_IDCOUNT(*) ------------------------------ mm1 kk1000000 SQL> --12c SQL>selectdep_id,count(*)fromt1groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk3000000 SQL>selectdep_id,count(*)fromt2groupbydep_id; DEP_IDCOUNT(*) ------------------------------ kk1000000 SQL>
我们将要执行的sql语句是:
selectcount(*) fromt1,t2 wheret1.id=t2.id andt1.dep_id='kk' andnotexists(select1 fromt1,t2 wheret1.id=t2.id andt2.dep_id='mm');
我们先来看执行情况的差距,10g的bufferget小,12c多:
--10g SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm'); COUNT(*) ---------- 0 SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST')); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_ID22t5mb43w55pr,childnumber0 ------------------------------------- select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnot exists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm') Planhashvalue:3404612428 ------------------------------------------------------------------------------------------------------------------ |Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem| ------------------------------------------------------------------------------------------------------------------ |0|SELECTSTATEMENT||1||1|00:00:00.02|2086|||| |1|SORTAGGREGATE||1|1|1|00:00:00.02|2086|||| |*2|FILTER||1||0|00:00:00.02|2086|||| |*3|HASHJOIN||0|901K|0|00:00:00.01|0|39M|5518K|| |4|TABLEACCESSFULL|T2|0|901K|0|00:00:00.01|0|||| |*5|TABLEACCESSFULL|T1|0|2555K|0|00:00:00.01|0|||| |*6|HASHJOIN||1|23|1|00:00:00.02|2086|1517K|1517K|612K(0)| |*7|TABLEACCESSFULL|T2|1|23|1|00:00:00.02|2082|||| |8|TABLEACCESSFULL|T1|1|2555K|1|00:00:00.01|4|||| ------------------------------------------------------------------------------------------------------------------ PredicateInformation(identifiedbyoperationid): --------------------------------------------------- 2-filter(ISNULL) 3-access("T1"."ID"="T2"."ID") 5-filter("T1"."DEP_ID"='kk') 6-access("T1"."ID"="T2"."ID") 7-filter("T2"."DEP_ID"='mm') Note ----- -dynamicsamplingusedforthisstatement 34rowsselected. SQL> --12c SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm'); COUNT(*) ---------- 1000000 SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID22t5mb43w55pr,childnumber0 ------------------------------------- select/*+gather_plan_statistics*/count(*)fromt1,t2where t1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2 wheret1.id=t2.idandt2.dep_id='mm') Planhashvalue:1692274438 -------------------------------------------------------------------------------------------------------------------- |Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem| -------------------------------------------------------------------------------------------------------------------- |0|SELECTSTATEMENT||1||1|00:00:00.79|10662|||| |1|SORTAGGREGATE||1|1|1|00:00:00.79|10662|||| |*2|FILTER||1||1000K|00:00:00.74|10662|||| |*3|HASHJOIN||1|1215K|1000K|00:00:00.52|8579|43M|6111K|42M(0)| |4|TABLEACCESSFULL|T2|1|1215K|1000K|00:00:00.01|2083|||| |*5|TABLEACCESSFULL|T1|1|2738K|3000K|00:00:00.07|6496|||| |*6|HASHJOINRIGHTSEMI||1|35|0|00:00:00.02|2083|1245K|1245K|461K(0)| |*7|TABLEACCESSFULL|T2|1|23|0|00:00:00.02|2083|||| |8|TABLEACCESSFULL|T1|0|2738K|0|00:00:00.01|0|||| -------------------------------------------------------------------------------------------------------------------- PredicateInformation(identifiedbyoperationid): --------------------------------------------------- 2-filter(ISNULL) 3-access("T1"."ID"="T2"."ID") 5-filter("T1"."DEP_ID"='kk') 6-access("T1"."ID"="T2"."ID") 7-filter("T2"."DEP_ID"='mm') Note ----- -dynamicstatisticsused:dynamicsampling(level=2) 35rowsselected. SQL> SQL>
可以看到第23,24行,在10g中运行时,buffers是0,而在12c中,即78,79行,buffer是2083+6496。
也就是说在10g中,外层查询不进行t1和t2的扫描,直接返回结果了,而在12c中,外层查询还要进行t1表和t2表层扫描才返回结果。
这其实不是10g和12c的差别,而是notexists的返回数据对外层的影响。子查询要返回0行记录,才满足notexist的条件,从而返回外层查询结果。
在10g中,子查询返回了一行记录
--10g SQL>select1fromt1,t2wheret1.id=t2.idandt2.dep_id='mm'; 1 ---------- 1 SQL>
不满足notexists(即0行才满足),所以,也就不用在外层继续查询了。直接返回记录0行。
在12c中,子查询返回0行记录,满足notexist的条件,所以还需要在外层查询中继续查询。
--12c SQL>selectcount(*)fromt1,t2wheret1.id=t2.idandt2.dep_id='kk'; COUNT(*) ---------- 1000000 SQL>setline1000 SQL>setpages1000 SQL>colPLAN_TABLE_OUTPUTfora250 SQL> SQL> SQL>select/*+gather_plan_statistics*/count(*)fromt1,t2wheret1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2wheret1.id=t2.idandt2.dep_id='kk'); COUNT(*) ---------- 0 SQL>select*fromtable(dbms_xplan.display_cursor(null,null,'ALLSTATSLAST')); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- SQL_IDc5hj2p2jt1fxf,childnumber0 ------------------------------------- select/*+gather_plan_statistics*/count(*)fromt1,t2where t1.id=t2.idandt1.dep_id='kk'andnotexists(select1fromt1,t2 wheret1.id=t2.idandt2.dep_id='kk') Planhashvalue:1692274438 -------------------------------------------------------------------------------------------------------------------- |Id|Operation|Name|Starts|E-Rows|A-Rows|A-Time|Buffers|OMem|1Mem|Used-Mem| -------------------------------------------------------------------------------------------------------------------- |0|SELECTSTATEMENT||1||1|00:00:00.28|2087|||| |1|SORTAGGREGATE||1|1|1|00:00:00.28|2087|||| |*2|FILTER||1||0|00:00:00.28|2087|||| |*3|HASHJOIN||0|1215K|0|00:00:00.01|0|69M|7428K|| |4|TABLEACCESSFULL|T2|0|1215K|0|00:00:00.01|0|||| |*5|TABLEACCESSFULL|T1|0|2738K|0|00:00:00.01|0|||| |*6|HASHJOINRIGHTSEMI||1|2738K|1|00:00:00.28|2087|43M|6111K|42M(0)| |*7|TABLEACCESSFULL|T2|1|1215K|1000K|00:00:00.12|2083|||| |8|TABLEACCESSFULL|T1|1|2738K|1|00:00:00.01|4|||| -------------------------------------------------------------------------------------------------------------------- PredicateInformation(identifiedbyoperationid): --------------------------------------------------- 2-filter(ISNULL) 3-access("T1"."ID"="T2"."ID") 5-filter("T1"."DEP_ID"='kk') 6-access("T1"."ID"="T2"."ID") 7-filter("T2"."DEP_ID"='kk') Note ----- -dynamicstatisticsused:dynamicsampling(level=2) 35rowsselected. SQL>
可以看到第38,39行的buffer为0.
总结
以上就是这篇文章的全部内容了,希望本文的内容对大家的学习或者工作能带来一定的帮助,如果有疑问大家可以留言交流。