In OPTIMIZER_INDEX_COST_ADJ Part I, I discussed how this parameter can significantly impact the overall cost of an index related execution path by “adjusting” the overall costs by the percentage denoted in the parameter. An OPTIMIZER_INDEX_COST_ADJ value of 25 for example will result in the costs of an index access path being reduced to just 25% of the calculated total.
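As a purely illustrative sketch (the cost figure of 2000 below is made up for the example), the parameter scales index access path costs as follows:

SQL> alter session set optimizer_index_cost_adj = 25;

With the default value of 100, an index access path costed at 2000 stays at 2000; with a value of 25, the CBO costs that same path at 2000 x 25/100 = 500, while the cost of the competing FTS is left untouched.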
Reducing the effective costs of using an index will obviously increase the likelihood of an index being chosen by the CBO over a Full Table Scan (FTS). This may be a good thing when an appropriate index is not being used by the CBO. However, this may also be a very bad thing if it results in an inappropriate index being chosen by the CBO over a more efficient FTS …
There are typically 3 different methods by which this parameter is set.
Method One: Set it to a really low value such that indexes are typically preferred over a FTS.
This is such a common piece of advice. I wish I were given a dollar every time I read someone suggest setting (or further reducing) the OPTIMIZER_INDEX_COST_ADJ to a really low value in order for a specific piece of SQL to start using an index. A classic example is the suggestion to set this parameter to 12 in order to get the CBO to use an index.
Why 12?
Perhaps because they once had a SQL statement that used a FTS instead of an index, and setting the OPTIMIZER_INDEX_COST_ADJ to 12 made the CBO use the index, so setting it to 12 obviously works, right? Perhaps because they were born on the 12th and 12 has always been a lucky number? Who knows?
However, simply setting the OPTIMIZER_INDEX_COST_ADJ to a low value for no specific reason is generally a very dangerous and ill advised thing to do.
Remember this key point. The purpose of the OPTIMIZER_INDEX_COST_ADJ is not to ensure indexes are favoured over a FTS. As I discussed in Part I, the purpose is to more accurately reflect any discrepancies between the actual costs of the single block I/Os associated with an index access path and the actual costs of the multiblock I/Os associated with a FTS. Therefore, the OPTIMIZER_INDEX_COST_ADJ should reflect any such discrepancies as accurately as possible, so that the comparative costs are true and reflective of the real corresponding costs.
The parameter should attempt to set an even playing field between the use of an index and a FTS, not simply adjust things in favour of indexes generally.
Setting this parameter to an arbitrarily low number may indeed make a specific SQL statement suddenly use an index when previously it used a FTS. Perhaps this SQL statement might indeed benefit from now using the index.
However, this parameter is a global parameter in the sense that it impacts all SQL statements for the entire system or session. Although it might indeed make the CBO use indexes appropriately in places, if set incorrectly it might unfortunately also make the CBO start to use indexes inappropriately. Perhaps a FTS is actually the way to go and is the most efficient and least costly option for other SQL statements; by making this parameter so low as to favour indexes so heavily, it might suddenly make the CBO choose indexes that are less efficient and much more costly than the alternative FTS.
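To be clear on the scope involved, the parameter can be changed for the entire instance or for just the current session (the value of 50 below is arbitrary and purely for illustration):

SQL> alter system set optimizer_index_cost_adj = 50;

SQL> alter session set optimizer_index_cost_adj = 50;

The alter system form impacts every SQL statement in every session, which is precisely why an arbitrarily chosen value carries so much risk.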
Because there are of course many, many occasions when a FTS is actually the desired access method, as it's simply the cheaper option. In fact, an index that's selected inappropriately, an index that's used instead of a cheaper FTS, can actually be far more damaging to overall database performance than the potential damage caused by an inappropriate FTS.
Consider this. In the worst case scenario, the maximum number of logical (or physical) I/Os that a FTS will generate is the number of blocks in the table. The number of actual I/O calls is likely to be far fewer than the number of blocks in the table, as a FTS will generally perform multiblock I/Os and read the table blocks in larger “chunks” at a time. However, even with a poorly configured db_file_multiblock_read_count set to effectively 1, the other great advantage of a FTS is that Oracle only needs to access a specific block once. So reading all the table blocks once, one at a time, is effectively as bad as a FTS can get.
In a 10,000,000 row table housed in say 100,000 table blocks, the maximum number of LIOs is effectively 100,000 for a FTS.
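As a rough sketch, the multiblock read size in use can be checked as follows (the value of 16 used in the arithmetic below is simply an assumption for illustration and will differ per system):

SQL> show parameter db_file_multiblock_read_count

With 100,000 table blocks and multiblock reads of (say) 16 blocks at a time, a FTS only needs in the order of 100,000 / 16 = 6,250 read requests, each table block being visited just the once.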
However, the worst case scenario for an index range scan can potentially be much, much worse. In a poorly clustered index, it's actually possible to have to read not only each and every leaf block in the index structure (plus a few branch blocks to get to the first index leaf block of interest), but in addition perform as many logical (or physical) I/Os as there are rows in the table.
In a 10,000,000 row table housed in 100,000 table blocks with say 25,000 leaf blocks in a corresponding index, the maximum number of LIOs could potentially be as high as 10,025,003 for an index range scan, in the order of 100 times worse than the corresponding worst case scenario with the FTS. Remember of course that a FTS is likely to actually perform far fewer I/O operations than the worst case scenario, assuming the db_file_multiblock_read_count is set to something reasonable. So it could quite conceivably be in the order of 1,000 times worse than the corresponding FTS.
An index scan, you see, may need to visit a specific table block many times, because the indexed data may be randomly distributed throughout the table (the index has a very bad clustering factor in other words), and so the scan only accesses individual rows within a given table block at differing times during the index range scan. As each index entry is read, the corresponding rowids keep referencing different table blocks from those recently accessed. If a specific table block contains say 100 rows of interest, an index range scan may need to access this same block as many as 100 different times during the index range scan operation.
A very simple example to illustrate this point.
Here, I create a table with 10,000,000 rows, with an ID column whose values are randomly distributed throughout the table:
SQL> create table bowie (id number, text varchar2(80));
Table created.
SQL> insert into bowie select ceil(dbms_random.value(0, 10000)), 'This is just a piece of text designed to make a row a reasonable size' from dual connect by level <= 10000000;
10000000 rows created.
SQL> commit;
Commit complete.
I now create an index on this ID column and collect 100% accurate statistics:
SQL> create index bowie_id on bowie(id);
Index created.
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', estimate_percent=>null, cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.
I now run a select statement that selects just 10% of the rows in the table. Many would expect the index to be used by the CBO, as 90% of all rows are of no interest:
SQL> select * from bowie where id between 1 and 1000;
1000873 rows selected.
Elapsed: 00:01:49.75
Execution Plan
———————————————————-
Plan hash value: 1845943507
—————————————————————————
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————
| 0 | SELECT STATEMENT | | 1000K| 69M| 22068 (2)| 00:01:51 |
|* 1 | TABLE ACCESS FULL| BOWIE | 1000K| 69M| 22068 (2)| 00:01:51 |
—————————————————————————
Predicate Information (identified by operation id):
—————————————————
1 - filter("ID"<=1000 AND "ID">=1)
Statistics
———————————————————-
1 recursive calls
0 db block gets
108986 consistent gets
108672 physical reads
0 redo size
8920558 bytes sent via SQL*Net to client
2596 bytes received via SQL*Net from client
202 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000873 rows processed
However, we note the CBO has actually selected a FTS and the query is a little slow at nearly 1 minute and 50 seconds.
Hummmm, I know, let’s change the OPTIMIZER_INDEX_COST_ADJ parameter to a really low figure and see if the CBO now decides “to do the right thing” and use the index.
I'm going to pick the value of, uuuummmmm let me think, oh I know, let's use a value of 2 because I scored 2 goals yesterday in football and 2 has always been somewhat lucky for me. That should do the trick. Let's make the index appear to be only 2% of its original cost and really give this index a good chance of being selected by the CBO.
SQL> alter session set optimizer_index_cost_adj = 2;
Session altered.
SQL> select * from bowie where id between 1 and 1000;
1000873 rows selected.
Elapsed: 02:16:44.42
Execution Plan
———————————————————-
Plan hash value: 4117205494
—————————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
—————————————————————————————-
| 0 | SELECT STATEMENT | | 1000K| 69M| 19970 (1)| 00:01:40 |
| 1 | TABLE ACCESS BY INDEX ROWID| BOWIE | 1000K| 69M| 19970 (1)| 00:01:40 |
|* 2 | INDEX RANGE SCAN | BOWIE_ID | 1000K| | 42 (0)| 00:00:01 |
—————————————————————————————-
Predicate Information (identified by operation id):
—————————————————
2 - access("ID">=1 AND "ID"<=1000)
Statistics
———————————————————-
1 recursive calls
0 db block gets
998652 consistent gets
916134 physical reads
0 redo size
5034208 bytes sent via SQL*Net to client
2596 bytes received via SQL*Net from client
202 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1000873 rows processed
Well the good news is that the CBO is now using the index.
However, the bad news is that the elapsed time has gone from 1 min 50 seconds to a massive 2 hours 16 minutes and 44 seconds.
Changing the OPTIMIZER_INDEX_COST_ADJ to an arbitrarily “low” value has suddenly caused some critical SQL statements to now perform appallingly.
And remember, this is for a query that was only returning 10% of the data …
Why are things so bad now?
Because the ID values are randomly distributed throughout the table, the index on the ID column has a terrible clustering factor as a result. Therefore, as the index range scan is performed, Oracle is forced to keep visiting different table blocks with each new index key, potentially returning to the same table block time and time again. The number of logical I/Os is massive and has increased significantly from that of the FTS, but the CBO has effectively costed only 2% of all these I/Os in its calculations because of the OPTIMIZER_INDEX_COST_ADJ value.
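As a quick sanity check (a sketch against the demo table created earlier; the exact figures will vary between runs), the clustering factor of the BOWIE_ID index can be compared with the number of table blocks and rows via the data dictionary:

SQL> select i.index_name, i.clustering_factor, t.blocks, t.num_rows from user_indexes i, user_tables t where i.table_name = t.table_name and i.index_name = 'BOWIE_ID';

A clustering factor up near the number of rows (10,000,000) rather than down near the number of table blocks (roughly 100,000) confirms the index is very poorly clustered.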
The parameter was giving the index an unfair advantage and severely underestimating the true costs associated with using the index vs. the FTS.
Worse still, because an index is being used, this massive number of block visits is likely to cause additional performance issues in the buffer cache, due to the more favourable manner in which blocks accessed via an index are cached compared to blocks accessed via a FTS.
I might also add that in this specific example, the CBO was using the CPU costing model (which I'll discuss later). So the OPTIMIZER_INDEX_COST_ADJ parameter still has an impact even when the CBO in theory should already have a good idea of the comparative costs of a single versus a multiblock I/O.
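As an aside, whether system statistics for the CPU costing model have been gathered can be confirmed via the data dictionary (a sketch; querying sys.aux_stats$ requires appropriate privileges):

SQL> select pname, pval1 from sys.aux_stats$ where sname = 'SYSSTATS_MAIN';

The SREADTIM (average single block read time) and MREADTIM (average multiblock read time) values listed there are what the CPU costing model uses to compare single block and multiblock I/O costs, yet the OPTIMIZER_INDEX_COST_ADJ adjustment is still applied on top.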
So simply setting the OPTIMIZER_INDEX_COST_ADJ parameter to an arbitrarily low value is a very dangerous tuning technique due to the risk of inappropriate index access paths being selected by the CBO.
I’ll next discuss two better techniques in how to set the OPTIMIZER_INDEX_COST_ADJ parameter to a more appropriate value.
Reference: http://richardfoote.wordpress.com/2009/07/22/the-cbo-and-indexes-optimizer_index_cost_adj-part-ii/
Corrections are welcome if you spot any errors.
Email: czmcj@163.com