As previously promised, I thought I might look at how the CBO goes about costing a Full Table Scan (FTS) with system statistics and the CPU costing model, so we can understand why the CBO may have chosen one option over the other.
WARNING: You might need to grab a calculator to help you along
To illustrate, I’m simply going to use the original BOWIE_STUFF table and index setup I created in my earlier Introduction to the CBO. I’ll however recreate the demo here again from scratch to refresh your memory:
I first create a table that has 100,000 rows, with an indexed “ID” column that has 100 distinct, evenly distributed values. For those mathematically challenged, this means each distinct value will return 1000 rows.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
SQL> CREATE TABLE bowie_stuff AS SELECT (mod(rownum, 100 )+ 1 )* 10 id, 'Ziggy Stardust' name FROM dual CONNECT BY LEVEL <= 100000 ;
Table created. SQL> CREATE INDEX bowie_stuff_i ON bowie_stuff(id); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF' , cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1' );
PL/SQL procedure successfully completed. SQL> select blocks from dba_tables where table_name= 'BOWIE_STUFF' ;
BLOCKS ------ 329
|
Note the table has 329 blocks. It’s a number I’ll refer to a number of times throughout.
1
2
3
4
5
6
7
8
9
10
|
SQL> SELECT index_name, blevel, leaf_blocks, clustering_factor FROM user_indexes WHERE index_name = 'BOWIE_STUFF_I' ;
INDEX_NAME BLEVEL ------------- ------ BOWIE_STUFF_I 1
LEAF_BLOCKS CLUSTERING_FACTOR ----------- ----------------- 207 32900
|
Note also that the index has a blevel of 1, 207 leaf blocks and a rather poor Clustering Factor (CF) of 32900, not close at all to the number of blocks in the table. As we’ll see, the CF is so bad that the CBO will choose a FTS over the index.
1
2
3
4
5
|
SQL> show parameter db_file_multi
NAME VALUE ----------------------------- ----- db_file_multiblock_read_count 16
|
Note the db_file_multiblock_read_count is manually set to 16. Relevant when calculating the cost of a FTS with the I/O costing model. Less so with CPU costing in use as we’ll see.
Finally, if we look at the system statistics in place:
1
2
3
4
5
6
7
8
|
SQL> select pname, pval 1 from sys.aux_stats$ where pname in ( 'SREADTIM' , 'MREADTIM' , 'MBRC' , 'CPUSPEED' );
PNAME PVAL 1
-------- ----- SREADTIM 5
MREADTIM 10
CPUSPEED 1745
MBRC 10
|
All of these values will be relevant when calculating the cost of a FTS with the CPU costing model.
OK, we now have all the information we need to determine how the CBO will treat both index and FTS activities on this table.
Let’s start by refreshing ourselves with how the I/O based CBO model will deal with such a scenario.
1
2
3
|
SQL> alter session set "_optimizer_cost_model" = io;
Session altered. |
OK, let’s just run a simple query that selects data for a specific ID. Remember, there are 100 evenly distributed distinct IDs so this query will return 1% of the data (1000 rows):
1
2
3
4
5
6
7
8
9
10
|
SQL> set autotrace traceonly SQL> SELECT * FROM bowie_stuff WHERE id = 420 ;
----------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ----------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000 | 18000 | |
|* 1 | TABLE ACCESS FULL| BOWIE_STUFF | 1000 | 18000 | |
----------------------------------------------------------------- |
Note: the CBO has decided to use a FTS to select the 1% of rows as it has the lowest associated cost.
As previously discussed, the cost of using the index is approximately:
index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
= 1 + (207 x 0.01) + (32900 x 0.01) = 1 + 3 + 329 = 333
Note: the 1 for the blevel can be dropped by the CBO bringing the cost down to 332, to be discussed another time .
As previously discussed, the FTS cost is approximately:
segment header I/O + ceil(table blocks/fudged mbrc value)
Note: for a db_file_multiblock_read_count of 16, the adjusted, “fudged” value used by the CBO is approximately 10.4.
Therefore, for the above example, the FTS cost is calculated as:
= 1 + ceil(329/10.4) = 1 + 32 = 33
33 is significantly less than 333 so the FTS easily wins out.
So how do things change when using System Statistics and the CPU costing model ? How does the CBO calculate the cost of the FTS with the above system statistics in place ?
As I’ve previously discussed, the significant change with the CPU costing model is not so much how it impacts the cost of index accesses but that of the FTS.
Let’s run the same SQL, but this time using the CPU costing model:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
SQL> alter session set "_optimizer_cost_model" = cpu;
Session altered. SQL> SELECT * FROM bowie_stuff WHERE id = 420 ;
1000 rows selected.
-------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000 | 18000 | 70 ( 5 ) | 00: 00: 01 |
|* 1 | TABLE ACCESS FULL| BOWIE_STUFF | 1000 | 18000 | 70 ( 5 ) | 00: 00: 01 |
-------------------------------------------------------------------------------- |
Note: CBO is still picking the FTS as the CF is truly awful. However the cost of the FTS has increased significantly from 33 to 70, although nowhere near the approximate 333 cost of using the index.
So why has the FTS cost increased and how is this new cost calculated ?
As previously discussed , the CPU costing formula is basically:
(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
If we first focus on the single block I/O portion of the formula, the only single block read considered by the CBO during a FTS is the one associated with reading the segment header. Note that the average wait time for a single block read is the SREADTIM system statistic.
If there’s just the one single block I/O, the single block read portion of the formula effectively equates to (1 x sreadtim) / sreadtim, which just equals 1. So 1 is basically added to the cost with regard to reading the segment header as it is with the I/O costing model.
OK, lets next look at the portion of the formula with regard to multiblock I/Os.
The sum of all the multiblock I/Os is calculated in a similar manner as it was with the I/O costing model. It’s simply the number of blocks in the table below the HWM (329 in our example) but this time divided by the MBRC system statistic. Note however the MBRC statistic isn’t some fudged, somewhat arbitrarily set figure based on the db_file_multiblock_read_count parameter, but the actual average size of multiblock I/Os in the specific database environment. Note also that the average wait time for a multiblock read is theMREADTIM system statistic.
So the total wait time for all multiblock reads in the above example is:
sum of all the multiblock I/Os x average wait time for a multiblock I/O =(BLOCKS/MBRC) x MREADTIM = ceil(329/10) x 10 = 330.
This value is then divided by the average wait time for a single block read (theSREADTIM system statistic) to give the overall cost of multiblock reads, but expressed in units of single block I/Os.
The total cost for multiblock I/Os is therefore:
((BLOCKS/MBRC) x MREADTIM)/ SREADTIM = 330/5 = 66.
So the total costs associated for all I/Os is the 1 for reading the segment header plus 66 for all the multiblock reads = 67.
However, the cost of the FTS is 70, not 67. Where does the additional cost of 3 come from ?
Well, that’s the CPU portion of the formula. The CBO has determined that the FTS operation will require ‘x’ number of CPU cycles and this value is then divided by the CPUSPEED to determine how long this CPU activity will take.
This CPU elapsed figure is then again divided by the average wait of a single block read (SREADTIM) to also put the CPU costs in units of single block reads. In this example, the total CPU related costs amount to 3.
Oracle gives us an indication of what the CPU component is in the overall cost within the execution plan via the %CPU value (which is 5 in the above execution plan). The (%CPU) value is the ceil of the overall percentage of CPU costs as calculated by the following formula:
%CPU = ceil(CPU related costs/overall costs)
So in our example, %CPU = ceil(3/70 x 100) = ceil(4.29) = 5% (as indeed displayed in the above execution plan).
Again, all the costs associated with a FTS with the CPU costing model can be derived and make some kinda sense. Providing all the necessary inputs are all actually correct and valid, the CBO will indeed correctly decide to use a FTS over an index when it’s the less expensive option.
I’ll next expand these points and why understanding how these costs are derived can be extremely useful.
You can now put your calculators away
参考至:http://richardfoote.wordpress.com/2009/11/25/the-cbo-cpu-costing-model-indexes-vs-full-table-scans/
如有错误,欢迎指正
邮箱:czmcj@163.com
相关推荐
- **集射间击穿电压 (V(BR)CBO)**: 当集电极电流为-100微安且发射极电流为0时,最小值为-40伏特。 - **集射间击穿电压 (V(BR)CEO)**: 当集电极电流为-0.1毫安且基极电流为0时,最小值为-25伏特。 - **发射基间击穿...
第6课:CBO优化程序简介.pdf 第7课:解释执行计划.pdf 第8课:优化程序访问路径.pdf 第9课:优化程序联接.pdf 第10课:优化程序其他操作.pdf 第11课:优化程序统计信息.pdf 第12课:使用绑定变量.pdf 第13课...
- CBO的成本估算模型:CBO基于统计信息和数据库配置来估算每个操作的成本,如I/O、CPU等。 - 优化器模式:固定规则(Rule-Based Optimizer, RBO)与成本基础(CBO)的区别,以及何时应该使用CBO。 - 参数调优:学习...
本主题将深入探讨如何在C#中实现基于ComboBox(cbo)和DataGridView(dgv)的输入筛选功能,这在数据展示和交互中是非常常见的需求。 1. **ComboBox(cbo)基础** ComboBox是Windows Forms中的一个控件,它结合了...
10. **成本基础优化**:`hive.cbo.enable=true`开启后,Hive会根据数据统计信息进行成本估算,选择最优执行计划。 11. **使用ORC文件格式**:ORC文件格式提供了更好的压缩和列式存储,对查询性能有显著提升。 12. ...
在分析不同的访问路径(如全表扫描Full Table Scan和索引扫描Index Range Scan)时,CPU资源的消耗会因操作不同而变化。全表扫描涉及读取和处理数据块,而索引扫描则涉及查找索引项和回表操作。通过对比这两个访问...
在缺省情况下,ORACLE采用CHOOSE优化器, 为了避免那些不必要的全表扫描(full table scan) , 你必须尽量避免使用CHOOSE优化器,而直接采用基于规则或者基于成本的优化器. 2. 访问Table的方式 ORACLE 采用两种...
标题 "UFIDA.U9.CBO.ItemMasterBE.zip" 指的是用友U9系统中的一个插件开发相关的代码示例。用友U9是一款面向大型企业的全面企业管理软件,其核心设计思想是“实时企业,全球商务”。CBO(Client Business Object)是...
2. 查看索引的统计信息:select ttt.index_name, ttt.num_rows, ttt.distinct_keys, ttt.avg_leaf_blocks_per_key, ttt.clustering_factor from dba_indexes ttt where ttt.owner='SCOTT'; 人工进行统计可以使用...
U9V25组件化实施方案实例指导_CBO-210财务相关档案.pptx
9. **优化器改进**:查询优化器在5.7版本中有了显著改进,包括新的统计信息收集方法和CBO(成本基优化器)的使用,使得优化器能够做出更准确的选择,提高查询效率。 10. **操作与管理工具**:MySQL Workbench是官方...
- 符号:V(BR)CBO - 测试条件:IC=100μA, IE=0 - 值:40V 2. **集电极-发射极击穿电压**: - 符号:V(BR)CEO - 测试条件:IC=0.1mA, IB=0 - 值:25V 3. **发射极-基极击穿电压**: - 符号:V(BR)EBO - ...
在缺省情况下,ORACLE采用CHOOSE优化器, 为了避免那些不必要的全表扫描(full table scan) , 你必须尽量避免使用CHOOSE优化器,而直接采用基于规则或者基于成本的优化器. 2. 访问Table的方式 ORACLE 采用两种...
- `ANALYZE TABLE` 语句用于收集表和索引的统计信息,帮助CBO做出更好的决策。但分析表可能会改变执行计划,因此应谨慎使用并备份执行计划。 5. **使用HINT注意事项**: - 不应过度依赖HINT,因为它们可能导致...
新的统计信息和查询优化策略,如CBO(成本基优化器),使得复杂查询的性能显著提升。 3. **JSON支持**: 在5.7版本中,MySQL添加了对JSON数据类型的原生支持。这使得数据库能够直接处理非结构化数据,无需将它们...
### 如何理解CBO在Oracle数据库中的工作原理 本文旨在深入探讨成本基础优化器(Cost-Based Optimizer,简称CBO)在Oracle数据库中的工作方式及其重要性。通过一个具体的示例,我们将逐步分析CBO如何根据表统计信息...
五,数据库列表主数据库: data/data_fzybbs.mdb星空钱庄:data/bank_db.asp社区农场:data/Sheep.asa广告公司:data/X_AdvCom.mdb音乐播放器: data/cbo_music.asp证券市场: data/stock.mdb版主评定: data/disadmin_...