double trouble
In the latest Quiz Night, I asked how you could make a query more efficient by changing a two table join into a three table join – with the clue that my third table was a repeat of the first table. Gary Myers, in comment 4, provided the type of answer I was looking for. Sometimes it is more efficient to get a small amount of data from a table on a first pass then go back and get the rest of the data on a second pass – especially if the first pass is an ‘index only’ operation.
I’ve created a little demonstration that gives you some idea of the approach:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
|
create table t1 as with generator as ( select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
) select rownum id,
mod(rownum,100) mod1,
trunc(dbms_random.value(0,10000)) random1,
lpad(rownum,10,'0') small_vc,
rpad('x',60) padding
from generator v1,
generator v2
where rownum <= 100000
; create table t2 as with generator as ( select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
) select rownum id,
mod(rownum,100) mod2,
trunc(dbms_random.value(0,10000)) random2,
lpad(rownum,10,'0') small_vc,
rpad('x',60) padding
from generator v1,
generator v2
where rownum <= 100000
; create index t1_i1 on t1(mod1, random1); create index t2_i1 on t2(mod2, random2); |
This creates two tables of 100,000 (fairly short) rows. Note the mod columns which return 1,000 rows per value, and the random columns which return approximately 10 rows per value. When I give Oracle the following query, it overestimates the final result set and chooses what I know to be a relatively resource-intensive execution plan:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
select t1.padding,
t2.padding
from t1, t2
where t1.mod1 = 50
and t2.random2 = t1.random1 and t2.mod2 = 50 ; ----------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ----------------------------------------------------------- | 0 | SELECT STATEMENT | | 1045 | 138K| 388 | |* 1 | HASH JOIN | | 1045 | 138K| 388 | |* 2 | TABLE ACCESS FULL| T1 | 1000 | 68000 | 193 | |* 3 | TABLE ACCESS FULL| T2 | 1000 | 68000 | 193 | ----------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("T2"."RANDOM2"="T1"."RANDOM1")
2 - filter("T1"."MOD1"=50)
3 - filter("T2"."MOD2"=50)
|
This plan is dictated largely by the fact that I have to collect quite a lot of data from both tables then eliminate a large fraction of the data I have collected. This pattern is the driver for what I am about to do: I know that I want a small volume of data eventually but if I have to go to the table at every step of the plan then I will have to do a lot of redundant work and carry a lot of redundant data at some point. Remember – it’s often the case that “visiting the table” is the expensive part of any query.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
|
select /*+
leading(t1 t2 t3)
use_nl(t3)
rowid(t3)
*/
t3.padding,
t2.padding
from t1,
t2,
t1 t3
where t1.mod1 = 50
and t2.random2 = t1.random1 and t2.mod2 = 50 and t3.rowid = t1.rowid ; --------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1045 | 163K| 1244 | | 1 | NESTED LOOPS | | 1045 | 163K| 1244 | |* 2 | HASH JOIN | | 1045 | 90915 | 199 | |* 3 | INDEX RANGE SCAN | T1_I1 | 1000 | 19000 | 4 | |* 4 | TABLE ACCESS FULL | T2 | 1000 | 68000 | 193 | | 5 | TABLE ACCESS BY USER ROWID| T1 | 1 | 73 | 1 | --------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$1
3 - SEL$1 / T1@SEL$1
4 - SEL$1 / T2@SEL$1
5 - SEL$1 / T3@SEL$1
Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T2"."RANDOM2"="T1"."RANDOM1")
3 - access("T1"."MOD1"=50)
4 - filter("T2"."MOD2"=50)
|
I don’t think the optimizer can generate a plan like this at present – but I may be wrong. I’ve reduced my workload by taking advantage of an existing index on table t1 to do a range scan that picks up only the columns that I need to join to t2. In this case the t2 access path was still a full tablescan – but even so I have reduced the workload against t1, and by the time I revisit it by rowid I will only be visiting the (relatively) small number of rows I really need.
(Left as an exercise to the reader: I could have written the query as a four part join, visiting both table segments by rowid for just those rows that I really needed; have a go, and check that you’ve got it right. Don’t forget that any references to the “non-index” columns that appear in the query have to be changed to reference the second occurrence of the table – note how I’ve changed t1.padding in my original query to t3.padding in the rewrite.)
Footnote:
If you think this type of path is a little odd take a look at the typical stucture of a nested loop join that appears under “nlj_batching” in 11g (this isnt the same t1 and t2 as above, by the way):
1
2
3
4
5
6
7
8
9
10
11
12
13
|
select /*+ ordered use_nl(t1) index(t1(n1)) */
t2.n1, t1.n2
from t2,t1 where t2.n2 = 45
and t1.n1 = t2.n1; ------------------------------------------------------ | Id | Operation | Name | Rows | ------------------------------------------------------ | 0 | SELECT STATEMENT | | 225 | | 1 | NESTED LOOPS | | | | 2 | NESTED LOOPS | | 225 | |* 3 | TABLE ACCESS FULL | T2 | 15 | |* 4 | INDEX RANGE SCAN | T1_I1 | 15 | | 5 | TABLE ACCESS BY INDEX ROWID| T1 | 15 | ------------------------------------------------------ |
Notice how Oracle can present a single join as two nested loops – one into the index and a second into the table. This is why I think there may be options within the optimizer to do my little trick automatically – if not now, then soon.
Update June 2012
I’ve just had an exchange of email with an Oak Table member who has pointed me to US patent 8103658 (dated November 2009) – which looks like a remarkably good description of this technique. So maybe the method will become an automatic option for the optimizer some time in the next couple of years.
Comment it's also very fantastic, please enjoy on the original post address, which you can find below
参考至:http://jonathanlewis.wordpress.com/2010/05/18/double-trouble/#comment-36301
如有错误,欢迎指正
邮箱:czmcj@163.com
相关推荐
14.6 Double Trouble 14.7 渐进的符号表示(Asymptotic Notation) 15 基数法则(Cardinality Rules) 15.1 由计算另一项计算该项(Counting One Thing by Counting Another) 15.2 计算序列(Counting Sequences) ...
在本项目"js_double-trouble"中,我们可以从提供的压缩包文件名推测,它可能是一个关于解决JavaScript中的常见难题或挑战的资源库。以下将详细介绍JavaScript中的几个关键知识点,以及它们可能如何成为开发者面临的...
"double-trouble:适用于您的 BIOE 双专业的网络实用程序" 是一个专为西雅图华盛顿大学生物工程专业学生设计的在线工具,旨在协助他们更好地理解并规划双学位的学习路径。这个工具可能包含了丰富的信息和资源,以帮助...
"Double Trouble"这个名字暗示了这个项目可能涉及到解决复杂或双重视角的问题,或者是通过合作来解决技术难题。 【描述】"双重麻烦" 提供了一个简洁的项目概述,但没有提供具体的技术细节。通常,这样的项目可能会...
特别是在处理二进制数据或者进行低级编程时,了解如何将十六进制(HEX)转换为浮点数(float)或双精度浮点数(double)至关重要。这个"HEX-Float转换工具"就是这样一个实用程序,它帮助用户方便快捷地完成这种转换...
System.out.println("Trouble"); } ``` - 运行结果:`b=3` - 解析:这段代码通过`Byte`和`Double`包装类实现了不同类型之间的转换。首先,通过`Byte`构造函数将字符串`"3"`转换为`byte`值3,然后转换为`double`...
测试 * FTP dialogs have trouble with very long directory path. * FTP Open account drop down list very small on Windows 2000. * Relative file paths no longer work with FTP. * Incorrect behavior of ...
Name: Saliency Filters Reference: Perazzi, Federico, et al. "Saliency filters: Contrast based ...3. if you have a trouble in mexGenerateSuperPixel.mexw64, please use another superpixel method instead.
42. **Foul trouble**:犯规困扰,球员接近被罚下场的情况。 43. **Free agent**:自由球员,合同到期或新人,可自由选择加入任何球队。 44. **Free throw**:罚球,因犯规而获得的无防守投篮机会。 45. **...
With them we may have less trouble dealing with problems in life and enjoy a better-off life. However, one point should be kept in mind that we should take sensible use of them, always being the ...
1. **听力理解**:这部分测试学生对英语口语的理解能力,包括听取并理解单个单词(如trouble, double, label等)、短语(如use up, get up, put up等)以及完整的句子。这要求学生具备良好的听力技巧,能够迅速捕捉...
3To conclude, …..are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. However, one point should be kept in mind that we ...
第三句模板常用于结尾段,总结全文并提出建议:“To conclude, ….are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. However,...
2. **语言知识**:题目涉及的词汇和表达反映了Unit 5的核心学习内容,如“trouble”、“double”、“related”等词汇,以及与科学、生活变化相关的句子,强调了日常会话和学术语言的结合,符合初中英语教学对语言...
结尾段落采用双刃剑的比喻,强调合理使用的重要性,“To conclude, …..are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. ...
3. 结束段:“To conclude, …..are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. However, one point should be kept in ...
例句3:“To conclude, smartphones are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. However, one point should be kept in...
give trouble if it was 64 chars. Thanks to Toms Baugis and Teus de Jong. 3) The popup menu would not close itself auto- matically if the StartMinimized property was true. Thanks to Ingo Krueger,...
- 总结观点:以"To conclude, …are just like a double-edged sword. With them we may have less trouble dealing with problems in life and enjoy a better-off life. However, one point should be kept in ...