Oracle 分区(partition)技术

axes

浏览: 12984 次
性别:
来自: 北京

最近访客更多访客>>

xgyule

ya19890605

ptkr

xiaojianhx

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

数据库

分区表的好处:
一：提高数据的可用性，分区表逻辑上是一个表，实际各分区的数据是独立存放的，一个分区可以离线的
同时，其他分区可以正常操作.
二：减轻管理负担，对一个20g的对象多备份，移动，收缩等操作，显然要比在20个1g的对象上执行同
样的操作要更有挑战性，分区采用分而治之的方法，而且分区实际上独立存放，从而可以用这些小对
象上的操作来代替大表上操作
三：提高查询效率，在olap系统中，存在诸多非常大的对象，可能存放5年，10年的历史数据，决策报表
需要数据量也非常多，分区技术在此环境下，充分利用分区消除，可以大幅度的提高查询效率，但对
于oltp系统，应用的不同将会导致几乎感受不到这种好处。

各种类型分区使用注意点:

范围(range)分区:
一::对于分区表,如果where条件种没有分区列,那么oracle会扫描所有的分区,然后做PARTITION RANGE
ALL 操作,这样成本将比未分区的全表扫描稍微高点,因为需要合并各个分区.
二:范围分区可以用values less than (maxvalue)增加一个默认分区,maxvalue 常量表示该分区用来存放所有其
他分区无法存放的记录,
三:范围分区可以对各种谓词做分区消除,包括=,>,<,<>等比hash,和list分区要灵活

散列(hash)分区
一:oracle根据分区列的hash函数计算值, hash分区数来自动决定某一条记录放在哪一个分区(你无法决定).
二:分区数应为2的一个幂,如2,4,8,16……如若不然,记录的散列将会不均匀.
三:分区列应该有很好的选择性,如果在10000条记录中,分区列只有5个不同的值,那么很可能所有的记录都集中在
少数几个分区中.无法把10000条记录均匀的分散到这5个分区中.
四:hash分区对于非严格=的谓词,很难做分区消除,没有range分区灵活.
五:如果hash分区的分区数有增加或减少,数据会在所有分区中重新再分布

列值(list)分区
一:对于既无法使用范围分区,同时若列的选择不很好,又无法使用hash分区的时候,可以采用list分区,如区域
代号,部门代号等字段.
二:分区对于非严格=的谓词,很难做分区消除,没有range分区灵活.
三:oracle9i 以后才支持list分区.

复合分区
一:主分区必须是范围分区,子分区可以是hash分区或者列表分区
二:如果where条件中有主分区的分区列,则支持范围分区消除,如果where条件中再加上子分区的分区列,则
会在前面分区消除结果集中再次做分区消除,如果where条件中只有子分区的分区列,则会扫描每一个主
分区.在每一个主分区中做子分区列的分区消除.这种情况下,成本可能会比未分区的成本还要高一些.

下面是一些试验例子:

创建范围分区表
SQL> Create Table t(owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
2 last_ddl_time, timestamp, status, temporary, generated, secondary)
3 Partition By Range(object_id)
4 (Partition p_3000 Values Less Than(3000) Tablespace users,
5 Partition p_6000 Values Less than(6000) Tablespace users,
6 Partition p_9000 Values Less Than(9000) Tablespace users,
7 Partition p_12000 Values Less Than(12000) Tablespace users,
8 Partition p_15000 Values Less Than(15000) Tablespace users,
9 Partition p_18000 Values Less Than(18000) Tablespace users,
10 Partition p_21000 Values Less Than(21000) Tablespace users,
11 Partition p_24000 Values Less Than(24000) Tablespace users,
12 Partition p_27000 Values Less Than(27000) Tablespace users,
13 Partition p_others Values Less Than(Maxvalue) Tablespace users
14 )
15 As
16 Select owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
17 last_ddl_time, timestamp, status, temporary, generated, secondary
18From dba_objects;
再建立一个非分区表,后面用来做对比
SQL> Create Table t1(owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
2 last_ddl_time, timestamp, status, temporary, generated, secondary)
3 As
4 Select owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
5 last_ddl_time, timestamp, status, temporary, generated, secondary
6From dba_objects;

SQL> explain plan for select count(*) from t where object_id>4000 and object_id<5000;
已解释。
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 6 | | |
| 1 | SORT AGGREGATE | | 1 | 3 | | | |
|* 2 | TABLE ACCESS FULL | T | 985 | 2955 | 6 | 2 | 2 |

pstart,pstop 表示开始分区和结束分区,本例中只对第二个分区做全表扫描

SQL> explain plan for select count(*) from t1 where object_id>4000 and object_id<5000;
已解释。
SQL> select * from table(dbms_xplan.display);
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 4 | 41 |
| 1 | SORT AGGREGATE | | 1 | 4 | |
|* 2 | TABLE ACCESS FULL | T1 | 962 | 3848 | 41 |
非分区表无法做分区消除,对整个表做全表扫描,成本比分区表要高很多

创建hash 分区表
oracle根据hash分区数,以及分区列的hash函数计算值,来自动决定某一条记录放在拿一个分区(你无法决定),
这样可以很均匀的把数据分散到每一个分区中;
SQL> Create Table t(owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
2 last_ddl_time, timestamp, status, temporary, generated, secondary)
3 Partition By Hash(object_id)
4 (Partition p_1 Tablespace users,
5 Partition p_2 Tablespace users,
6 Partition p_3 Tablespace users,
7 Partition p_4 Tablespace users,
8 Partition p_5 Tablespace users,
9 Partition p_6 Tablespace users,
10 Partition p_7 Tablespace users,
11 Partition p_8 Tablespace users
12 )
13 As
14 Select owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
15 last_ddl_time, timestamp, status, temporary, generated, secondary
16From dba_objects;

SQL> explain plan for select * from t where object_id=1000;
已解释。
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 46 | 8142 | 7 | | |
|* 1 | TABLE ACCESS FULL | T | 46 | 8142 | 7 | 1 | 1 |

对于非=谓词,hash分区很难做分区消除
SQL> explain plan for select * from t where object_id<=1000 and object_id>=999;
已解释。
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 93 | 16461 | 46 | | |
| 1 | PARTITION HASH ALL | | | | | 1 | 8 |
|* 2 | TABLE ACCESS FULL | T | 93| 16461 | 46 | 1 | 8 |
上面语句扫描了所有8个分区.

创建list 分区表
SQL> Create Table t(owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
2 last_ddl_time, timestamp, status, temporary, generated, secondary)
3 Partition By list(OWNER)
4 (Partition p_1 Values ('HR','SCOTT') Tablespace users,
5 Partition p_2 Values ('MDSYS') Tablespace users,
6 Partition p_3 Values ('SH','SYS') Tablespace users,
7 Partition p_4 Values ('OE','OLAPSYS','SYSTEM') Tablespace users,
8 Partition p_5 Values ('ODM','ODM_MTR') Tablespace users,
9 Partition p_6 Values ('QS','QS_CS','QS_ES','QS_OS','QS_WS','WKSYS','WMSYS') Tablespace users,
10 Partition p_7 Values ('PM','PUBLIC') Tablespace users,
11 Partition p_8 Values (DEFAULT) Tablespace users
12 )
13 As
14 Select owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
15 last_ddl_time, timestamp, status, temporary, generated, secondary
16From dba_objects;

SQL> explain plan for select * from t where WNER='SYS';
已解释。
SQL> select * from table(dbms_xplan.display);
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 162 | 28674 | 21 | | |
|* 1 | TABLE ACCESS FULL | T | 162 | 28674 | 21 | 3 | 3 |

创建复合分区表

SQL> Create Table t(owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
2 last_ddl_time, timestamp, status, temporary, generated, secondary)
3 Partition By Range(object_id)
4 Subpartition By list(owner)
5 (Partition p_6000 Values Less Than(6000) Tablespace users
6 (subpartition p_6_1 values ('HR','SCOTT', 'SH','SYS'),
7 subPartition p_6_2 values ('OE','OLAPSYS','SYSTEM'),
8 subPartition p_6_3 values (default)
9 ),
10 Partition p_12000 Values Less than(12000) Tablespace users
11 (subpartition p_12_1 values ('HR','SCOTT', 'SH','SYS'),
12 subPartition p_12_2 values ('OE','OLAPSYS','SYSTEM'),
13 subPartition p_12_3 values (default)
14 ),
15 Partition p_18000 Values Less Than(18000) Tablespace users
16 (subpartition p_18_1 values ('HR','SCOTT', 'SH','SYS'),
17 subPartition p_18_2 values ('OE','OLAPSYS','SYSTEM'),
18 subPartition p_18_3 values (default)
19 ),
20 Partition p_24000 Values Less Than(24000) Tablespace users
21 (subpartition p_24_1 values ('HR','SCOTT', 'SH','SYS'),
22 subPartition p_24_2 values ('OE','OLAPSYS','SYSTEM'),
23 subPartition p_24_3 values (default)
24 ),
25 Partition p_others Values Less Than(Maxvalue) Tablespace users
26 (subpartition p_oth_1 values ('HR','SCOTT', 'SH','SYS'),
27 subPartition p_oth_2 values ('OE','OLAPSYS','SYSTEM'),
28 subPartition p_oth_3 values (default)
29 )
30 )
31 As
32 Select owner, object_name, subobject_name, object_id, data_object_id, object_type, created,
33 last_ddl_time, timestamp, status, temporary, generated, secondary
34 From dba_objects
35 ;
Table created
SQL> explain plan for select * from t where OBJECT_ID>=5000 AND OBJECT_ID<16000;
已解释。
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10561 | 917K| 29 | | |
| 1 | PARTITION RANGE ITERATOR| | | | | 1 | 3 |
| 2 | PARTITION LIST ALL | | | | | 1 | 3 |
|* 3 | TABLE ACCESS FULL | T | 10561 | 917K| 29 | 1 | 9 |

首先做范围分区消除,oracle确定要扫描5000-16000之间的三个分区,对于每个范围分区下面的子分区,全部扫描,
然后做PARTITION LIST ALL 合并各个范围分区的子分区.如果where条件中有自分区列,oracle也会对自分区做分
区消除,如下面,pstart 和pend 为key
SQL> explain plan for select * from t where OBJECT_ID>=5000 AND OBJECT_ID<16000 and wner='SH';

已解释。

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 49484 | 29 | | |
| 1 | PARTITION RANGE ITERATOR | | | | | 1 | 3 |
|* 2 | TABLE ACCESS FULL | T | 556 | 49484 | 29 | KEY | KEY |

但如果where条件中只有子分区列,那么成本会比未分区表的扫描还要高,因为oracle需要对各个分区及子分区做合并动作,如下
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------

------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart | Pstop |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1564 | 135K| 47 | | |
| 1 | PARTITION RANGE ALL | | | | | 1 | 5 |
|* 2 | TABLE ACCESS FULL | T | 1564 | 135K| 47 | KEY | KEY |

分享到：