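Using analytic functions in Oracle
Analytic functions are mostly used in data warehouse (DW) databases, where they make many kinds of reports much easier to build.
Body:
An introduction to Oracle analytic functions
Analytic functions were introduced in Oracle 8.1.6 and give us a simple, efficient way to analyze data. Before they existed, we had to resort to self-joins, subqueries, inline views, or even complex stored procedures; now a single SQL statement often does the job, usually with noticeably better execution efficiency. Below I describe them in some detail.
Today I will mainly cover the following functions:
1. The automatic subtotal extensions rollup and cube (details: http://xsb.itpub.net/post/419/29159)
2. The rank functions rank, dense_rank and row_number, used as over (partition by fieldName1 order by fieldName2): the rows are partitioned by fieldName1, ordered by fieldName2 within each partition, and then ranked
3. lag and lead
4. Moving sums and moving averages with sum and avg
5. ratio_to_report, a reporting function
6. first and last, analytic functions for picking the first or last value
Base data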
06:34:23 SQL> select * from t;
BILL_MONTH AREA_CODE NET_TYPE LOCAL_FARE
--------------- ---------- ---------- --------------
200405 5761 G 7393344.04
200405 5761 J 5667089.85
200405 5762 G 6315075.96
200405 5762 J 6328716.15
200405 5763 G 8861742.59
200405 5763 J 7788036.32
200405 5764 G 6028670.45
200405 5764 J 6459121.49
200405 5765 G 13156065.77
200405 5765 J 11901671.70
200406 5761 G 7614587.96
200406 5761 J 5704343.05
200406 5762 G 6556992.60
200406 5762 J 6238068.05
200406 5763 G 9130055.46
200406 5763 J 7990460.25
200406 5764 G 6387706.01
200406 5764 J 6907481.66
200406 5765 G 13562968.81
200406 5765 J 12495492.50
200407 5761 G 7987050.65
200407 5761 J 5723215.28
200407 5762 G 6833096.68
200407 5762 J 6391201.44
200407 5763 G 9410815.91
200407 5763 J 8076677.41
200407 5764 G 6456433.23
200407 5764 J 6987660.53
200407 5765 G 14000101.20
200407 5765 J 12301780.20
200408 5761 G 8085170.84
200408 5761 J 6050611.37
200408 5762 G 6854584.22
200408 5762 J 6521884.50
200408 5763 G 9468707.65
200408 5763 J 8460049.43
200408 5764 G 6587559.23
200408 5764 J 7342135.86
200408 5765 G 14450586.63
200408 5765 J 12680052.38
40 rows selected.
Elapsed: 00:00:00.00
1.1 Introduction to rollup
Below is an example that computes the per-area totals, plus a grand total, with plain SQL:
set autot on

-- '合计' means "total"
select area_code, sum(local_fare) local_fare
from t
group by area_code
union all
select '合计' area_code, sum(local_fare) local_fare
from t;
AREA_CODE LOCAL_FARE
---------- --------------
5761 54225413.04
5762 52039619.60
5763 69186545.02
5764 53156768.46
5765 104548719.19
合计 333157065.31
6 rows selected.
Elapsed: 00:00:00.03
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=7 Card=1310 Bytes=24884)
1 0 UNION-ALL
2 1 SORT (GROUP BY) (Cost=5 Card=1309 Bytes=24871)
3 2 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=24871)
4 1 SORT (AGGREGATE)
5 4 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=17017)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
561 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
Below is the same summary produced with rollup, an extension of group by:
select nvl(area_code,'合计') area_code, sum(local_fare) local_fare
from t
group by rollup(nvl(area_code,'合计'));
AREA_CODE LOCAL_FARE
---------- --------------
5761 54225413.04
5762 52039619.60
5763 69186545.02
5764 53156768.46
5765 104548719.19
333157065.31
6 rows selected.
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=5 Card=1309 Bytes=24871)
1 0 SORT (GROUP BY ROLLUP) (Cost=5 Card=1309 Bytes=24871)
2 1 TABLE ACCESS (FULL) OF 'T' (Cost=2 Card=1309 Bytes=24871)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
4 consistent gets
0 physical reads
0 redo size
557 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
6 rows processed
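As the example shows, the rollup version is simpler and uses fewer resources: consistent gets drop from 6 to 4, because t is scanned once instead of twice. On a large base table the saving grows accordingly. Note that the total row above has an empty AREA_CODE rather than '合计': nvl is applied to the column before rollup generates the null for the total row, so it never sees that null. Grouping by rollup(area_code) and keeping nvl only in the select list would label the row.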
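To make the "one scan instead of two" point concrete, here is a small Python model of what rollup over a single column computes. It is only a sketch of the semantics, not of how Oracle implements it; the function name rollup_by_area and the sample rows are made up for illustration.

```python
from collections import defaultdict

# Illustrative (area_code, local_fare) rows, not the real table contents.
rows = [("5761", 10.0), ("5761", 5.0), ("5762", 7.0)]

def rollup_by_area(rows):
    """Single pass: per-area subtotals plus one grand-total row keyed by None."""
    subtotals = defaultdict(float)
    grand_total = 0.0
    for area, fare in rows:
        subtotals[area] += fare   # what "group by area_code" produces
        grand_total += fare       # what the extra "union all" branch produces
    result = sorted(subtotals.items())
    result.append((None, grand_total))  # rollup marks the total row with NULL
    return result

result = rollup_by_area(rows)
for area, total in result:
    print(area, total)
```

The None key in the last row plays the role of the null that rollup generates, which is why a label has to be applied after the rollup step rather than before it.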
1.2 Introduction to cube
To introduce cube, let's first look at another rollup example:
select area_code, bill_month, sum(local_fare) local_fare
from t
group by rollup(area_code, bill_month);
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060433.89
5761 200406 13318931.01
5761 200407 13710265.93
5761 200408 14135782.21
5761 54225413.04
5762 200405 12643792.11
5762 200406 12795060.65
5762 200407 13224298.12
5762 200408 13376468.72
5762 52039619.60
5763 200405 16649778.91
5763 200406 17120515.71
5763 200407 17487493.32
5763 200408 17928757.08
5763 69186545.02
5764 200405 12487791.94
5764 200406 13295187.67
5764 200407 13444093.76
5764 200408 13929695.09
5764 53156768.46
5765 200405 25057737.47
5765 200406 26058461.31
5765 200407 26301881.40
5765 200408 27130639.01
5765 104548719.19
333157065.31
26 rows selected.
Elapsed: 00:00:00.00
rollup(area_code, bill_month) produces a subtotal for each area_code and a grand total, but no subtotal per bill_month across all areas. cube is designed to fill that gap.
Let's look at the result of using cube:
select area_code, bill_month, sum(local_fare) local_fare
from t
group by cube(area_code, bill_month)
order by area_code, bill_month nulls last;
-- nulls last sorts rows whose column is null at the end.
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060.43
5761 200406 13318.93
5761 200407 13710.27
5761 200408 14135.78
5761 54225.41
5762 200405 12643.79
5762 200406 12795.06
5762 200407 13224.30
5762 200408 13376.47
5762 52039.62
5763 200405 16649.78
5763 200406 17120.52
5763 200407 17487.49
5763 200408 17928.76
5763 69186.54
5764 200405 12487.79
5764 200406 13295.19
5764 200407 13444.09
5764 200408 13929.69
5764 53156.77
5765 200405 25057.74
5765 200406 26058.46
5765 200407 26301.88
5765 200408 27130.64
5765 104548.72
200405 79899.53
200406 82588.15
200407 84168.03
200408 86501.34
333157.05
30 rows selected.
Elapsed: 00:00:00.01
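As you can see, the cube output contains a few more summary rows than the rollup output: the four rows with an empty AREA_CODE are the per-bill_month subtotals across all areas. (From here on the amounts are shown at roughly 1/1000 of the base-data values.)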
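The difference between the two row counts (26 versus 30) follows from which grouping sets each extension generates. The Python sketch below enumerates them; the helper names rollup_sets, cube_sets and row_count are my own, and the row count assumes every combination of area and month occurs in the data, which holds for this table.

```python
from itertools import combinations
from math import prod

def rollup_sets(cols):
    """ROLLUP(c1..cn): the n+1 prefixes, from all columns down to the empty set."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube_sets(cols):
    """CUBE(c1..cn): every subset of the columns, 2**n grouping sets."""
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]

def row_count(grouping_sets, distinct):
    """Result rows, assuming every combination of column values is present."""
    return sum(prod(distinct[c] for c in gs) for gs in grouping_sets)

cols = ["area_code", "bill_month"]
distinct = {"area_code": 5, "bill_month": 4}   # 5 areas, 4 months in table t

rollup_result = rollup_sets(cols)
cube_result = cube_sets(cols)
print(rollup_result, row_count(rollup_result, distinct))
print(cube_result, row_count(cube_result, distinct))
```

The only grouping set cube adds here is (bill_month), which accounts for the four extra rows.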
1.3 More on rollup and cube
In the results above, every summary row contains a null,
so how do we tell which column a given row was summarized over?
This is where Oracle's grouping function comes in.
grouping(col) returns 1 when the row is a summary row in which col has been aggregated away (its null was generated by rollup or cube), and 0 otherwise.
select decode(grouping(area_code),1,'all area',to_char(area_code)) area_code,
       decode(grouping(bill_month),1,'all month',bill_month) bill_month,
       sum(local_fare) local_fare
from t
group by cube(area_code, bill_month)
order by area_code, bill_month nulls last;
AREA_CODE BILL_MONTH LOCAL_FARE
---------- --------------- --------------
5761 200405 13060.43
5761 200406 13318.93
5761 200407 13710.27
5761 200408 14135.78
5761 all month 54225.41
5762 200405 12643.79
5762 200406 12795.06
5762 200407 13224.30
5762 200408 13376.47
5762 all month 52039.62
5763 200405 16649.78
5763 200406 17120.52
5763 200407 17487.49
5763 200408 17928.76
5763 all month 69186.54
5764 200405 12487.79
5764 200406 13295.19
5764 200407 13444.09
5764 200408 13929.69
5764 all month 53156.77
5765 200405 25057.74
5765 200406 26058.46
5765 200407 26301.88
5765 200408 27130.64
5765 all month 104548.72
all area 200405 79899.53
all area 200406 82588.15
all area 200407 84168.03
all area 200408 86501.34
all area all month 333157.05
30 rows selected.
Elapsed: 00:00:00.01
All the nulls are now clearly labeled by means of grouping. With rollup, cube and grouping together, summary statistics become much easier to produce.
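The labeling logic can be modeled in a few lines of Python. This is a sketch under my own assumptions: cube_with_grouping is a hypothetical helper, the rows are invented, and the two loop variables stand in for the values grouping(area_code) and grouping(bill_month) would return.

```python
from collections import defaultdict

rows = [  # (area_code, bill_month, local_fare); illustrative values only
    ("5761", "200405", 1.0),
    ("5761", "200406", 2.0),
    ("5762", "200405", 4.0),
]

def cube_with_grouping(rows):
    """Return {(area_label, month_label): total} for cube(area_code, bill_month)."""
    out = defaultdict(float)
    for area, month, fare in rows:
        # Every detail row feeds all four grouping sets of the cube.
        for g_area in (0, 1):        # 1: area_code is aggregated away
            for g_month in (0, 1):   # 1: bill_month is aggregated away
                key = ("all area" if g_area else area,
                       "all month" if g_month else month)
                out[key] += fare
    return dict(out)

totals = cube_with_grouping(rows)
for key in sorted(totals):
    print(key, totals[key])
```

Because the label is chosen from the flag and not from the column value, a genuine null stored in the data would not be mistaken for a subtotal row.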
2.1 Introduction to the rank functions
Having covered rollup and cube, let's look at the rank family of functions.
Problem 2: rank the areas by their total fare over these months.
To make the differences between rank, dense_rank and row_number visible, we first modify the base data so that area 5763 has the same values as area 5761.
update t t1 set local_fare = (
    select local_fare from t t2
    where t1.bill_month = t2.bill_month
    and t1.net_type = t2.net_type
    and t2.area_code = '5761'
) where area_code = '5763';
8 rows updated.
Elapsed: 00:00:00.01
We first use rank to compute each area's fare ranking.
select area_code, sum(local_fare) local_fare,
       rank() over (order by sum(local_fare) desc) fare_rank
from t
group by area_code;
AREA_CODE LOCAL_FARE FARE_RANK
---------- -------------- ----------
5765 104548.72 1
5761 54225.41 2
5763 54225.41 2
5764 53156.77 4
5762 52039.62 5
Elapsed: 00:00:00.01
Note the gap: 5761 and 5763 tie at rank 2, so rank 3 is skipped and 5764 gets rank 4.
Now let's look at the result with dense_rank.
select area_code, sum(local_fare) local_fare,
       dense_rank() over (order by sum(local_fare) desc) fare_rank
from t
group by area_code;
AREA_CODE LOCAL_FARE FARE_RANK
---------- -------------- ----------
5765 104548.72 1
5761 54225.41 2
5763 54225.41 2
5764 53156.77 3   <- rank 3 appears here
5762 52039.62 4
Elapsed: 00:00:00.00
In this example a rank 3 does appear, and that is the difference between rank and dense_rank:
when two rows tie, rank skips the following position, while dense_rank does not.
row_number differs even more: it assigns distinct numbers even to rows with identical values, which is very useful when we need exactly one row per condition.
select area_code, sum(local_fare) local_fare,
       row_number() over (order by sum(local_fare) desc) fare_rank
from t
group by area_code;
AREA_CODE LOCAL_FARE FARE_RANK
---------- -------------- ----------
5765 104548.72 1
5761 54225.41 2
5763 54225.41 3
5764 53156.77 4
5762 52039.62 5
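With row_number, even though the sum(local_fare) values are identical, the rows still get different numbers; we can use this to remove duplicate records from a table. Which of the tied rows comes first is not deterministic unless the order by includes a tie-breaking column.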
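The three numbering rules are easy to compare side by side in a short Python model. This is a sketch of the semantics only; rank_functions is a name I made up, and the totals are the per-area sums from the outputs above.

```python
def rank_functions(totals):
    """totals: list of (key, value).
    Returns (key, value, rank, dense_rank, row_number) ordered by value descending."""
    ordered = sorted(totals, key=lambda kv: kv[1], reverse=True)
    out = []
    rank = dense = 0
    prev = object()  # sentinel that never equals a real value
    for row_number, (key, value) in enumerate(ordered, start=1):
        if value != prev:
            rank = row_number   # rank jumps to the current position, leaving gaps
            dense += 1          # dense_rank advances by exactly one
            prev = value
        out.append((key, value, rank, dense, row_number))
    return out

totals = [("5761", 54225.41), ("5762", 52039.62), ("5763", 54225.41),
          ("5764", 53156.77), ("5765", 104548.72)]
ranked = rank_functions(totals)
for row in ranked:
    print(row)
```

In this model the order of the two tied rows comes from Python's stable sort; Oracle gives no such guarantee for ties.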
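The examples in this post illustrate the basic usage of the three functions. The next post covers some of their applications in more detail.
2.2 Applications of the rank functions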
a. Fetch the n users who joined most recently (with rank, ties on create_date can return more than n rows)
select user_id,tele_num,user_name,user_status,create_date
from (
select user_id,tele_num,user_name,user_status,create_date,
rank() over (order by create_date desc) add_rank
from user_info
)
where add_rank <= :n;
b. Delete duplicate records from a table (duplicates identified by obj#)
create table t1 as select obj#, name from sys.obj$;
Then run insert into t1 select * from t1 several times to create duplicates.
delete from t1 where rowid in (
    select row_id from (
        select rowid row_id,
               row_number() over (partition by obj# order by rowid) rn
        from t1
    ) where rn <> 1
);
c. Rank each area's fare income within each month.
select bill_month, area_code, sum(local_fare) local_fare,
       rank() over (partition by bill_month order by sum(local_fare) desc) area_rank
from t
group by bill_month, area_code;
BILL_MONTH AREA_CODE LOCAL_FARE AREA_RANK
--------------- --------------- -------------- ----------
200405 5765 25057.74 1
200405 5761 13060.43 2
200405 5763 13060.43 2
200405 5762 12643.79 4
200405 5764 12487.79 5
200406 5765 26058.46 1
200406 5761 13318.93 2
200406 5763 13318.93 2
200406 5764 13295.19 4
200406 5762 12795.06 5
200407 5765 26301.88 1
200407 5761 13710.27 2
200407 5763 13710.27 2
200407 5764 13444.09 4
200407 5762 13224.30 5
200408 5765 27130.64 1
200408 5761 14135.78 2
200408 5763 14135.78 2
200408 5764 13929.69 4
200408 5762 13376.47 5
20 rows selected.
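The duplicate-removal statement in item b relies on row_number restarting at 1 inside each partition. A minimal Python model of that idea follows; rows_to_delete and the rowid strings are hypothetical and only mimic the keep-the-first-row-per-key rule.

```python
def rows_to_delete(rows):
    """rows: list of (rowid, obj_no).
    Keep the smallest rowid per obj_no and return the rowids to delete."""
    seen = set()
    doomed = []
    for rowid, obj_no in sorted(rows):   # "order by rowid"
        if obj_no in seen:
            doomed.append(rowid)         # row_number > 1 in its partition
        else:
            seen.add(obj_no)             # row_number = 1: this row survives
    return doomed

rows = [("AAA3", 20), ("AAA1", 10), ("AAA2", 10), ("AAA4", 10), ("AAA5", 20)]
doomed = rows_to_delete(rows)
print(doomed)
```

Exactly one row per obj_no survives, whichever order the duplicates were inserted in.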
3. Introduction to lag and lead
For each month, fetch the total fare of the preceding and following months.
select area_code, bill_month, local_fare cur_local_fare,
       lag(local_fare,2,0)  over (partition by area_code order by bill_month) pre_local_fare,
       lag(local_fare,1,0)  over (partition by area_code order by bill_month) last_local_fare,
       lead(local_fare,1,0) over (partition by area_code order by bill_month) next_local_fare,
       lead(local_fare,2,0) over (partition by area_code order by bill_month) post_local_fare
from (
    select area_code, bill_month, sum(local_fare) local_fare
    from t
    group by area_code, bill_month
);
AREA_CODE BILL_MONTH CUR_LOCAL_FARE PRE_LOCAL_FARE LAST_LOCAL_FARE NEXT_LOCAL_FARE POST_LOCAL_FARE
--------- ---------- -------------- -------------- --------------- --------------- ---------------
5761 200405 13060.433 0 0 13318.93 13710.265
5761 200406 13318.93 0 13060.433 13710.265 14135.781
5761 200407 13710.265 13060.433 13318.93 14135.781 0
5761 200408 14135.781 13318.93 13710.265 0 0
5762 200405 12643.791 0 0 12795.06 13224.297
5762 200406 12795.06 0 12643.791 13224.297 13376.468
5762 200407 13224.297 12643.791 12795.06 13376.468 0
5762 200408 13376.468 12795.06 13224.297 0 0
5763 200405 13060.433 0 0 13318.93 13710.265
5763 200406 13318.93 0 13060.433 13710.265 14135.781
5763 200407 13710.265 13060.433 13318.93 14135.781 0
5763 200408 14135.781 13318.93 13710.265 0 0
5764 200405 12487.791 0 0 13295.187 13444.093
5764 200406 13295.187 0 12487.791 13444.093 13929.694
5764 200407 13444.093 12487.791 13295.187 13929.694 0
5764 200408 13929.694 13295.187 13444.093 0 0
5765 200405 25057.736 0 0 26058.46 26301.881
5765 200406 26058.46 0 25057.736 26301.881 27130.638
5765 200407 26301.881 25057.736 26058.46 27130.638 0
5765 200408 27130.638 26058.46 26301.881 0 0
20 rows selected.
With lag and lead we can show, on the same row, values taken from the n-th row before or after the current one.
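The three arguments (value, offset, default) behave like a bounds-checked index into the ordered partition. Here is a small Python sketch of that behavior, assuming non-negative offsets; the lag and lead helpers are my own, and the fares are the area 5761 values from the output above.

```python
def lag(values, i, offset=1, default=None):
    """Value `offset` rows before position i, or `default` if out of range."""
    j = i - offset
    return values[j] if j >= 0 else default

def lead(values, i, offset=1, default=None):
    """Value `offset` rows after position i, or `default` if out of range."""
    j = i + offset
    return values[j] if j < len(values) else default

# Area 5761, already ordered by bill_month (200405 .. 200408).
fares = [13060.433, 13318.93, 13710.265, 14135.781]

table = [
    (lag(fares, i, 2, 0), lag(fares, i, 1, 0),
     lead(fares, i, 1, 0), lead(fares, i, 2, 0))
    for i in range(len(fares))
]
for cur, row in zip(fares, table):
    print(cur, row)
```

The rows printed correspond to the PRE, LAST, NEXT and POST columns for area 5761.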
4. Moving calculations with sum, avg, max and min
Compute the moving average of fares over 3 consecutive months (the previous, current and next month).
-- For a longer span, widen the frame, e.g. range between 2 preceding and 2 following for 5 months.
-- Because the frame is a range over to_number(bill_month), it breaks across a year boundary
-- (200412 and 200501 differ by 89, not 1); rows between 1 preceding and 1 following counts rows
-- instead and is not affected, provided no month is missing.
select area_code, bill_month, local_fare,
       sum(local_fare)
           over ( partition by area_code
                  order by to_number(bill_month)
                  range between 1 preceding and 1 following ) "3month_sum",
       avg(local_fare)
           over ( partition by area_code
                  order by to_number(bill_month)
                  range between 1 preceding and 1 following ) "3month_avg",
       max(local_fare)
           over ( partition by area_code
                  order by to_number(bill_month)
                  range between 1 preceding and 1 following ) "3month_max",
       min(local_fare)
           over ( partition by area_code
                  order by to_number(bill_month)
                  range between 1 preceding and 1 following ) "3month_min"
from (
    select area_code, bill_month, sum(local_fare) local_fare
    from t
    group by area_code, bill_month
);
AREA_CODE BILL_MONTH LOCAL_FARE 3month_sum 3month_avg 3month_max 3month_min
--------- ---------- ---------------- ---------- ---------- ---------- ----------
5761 200405 13060.433 26379.363 13189.6815 13318.93 13060.433
5761 200406 13318.930 40089.628 13363.2093 13710.265 13060.433
5761 200407 13710.265 41164.976 13721.6587 14135.781 13318.93
-- For the 5761 / 200406 row:
-- 40089.628 = 13060.433 + 13318.930 + 13710.265
-- 13363.2093 = (13060.433 + 13318.930 + 13710.265) / 3
-- 13710.265 = max(13060.433, 13318.930, 13710.265)
-- 13060.433 = min(13060.433, 13318.930, 13710.265)
5761 200408 14135.781 27846.046 13923.023 14135.781 13710.265
5762 200405 12643.791 25438.851 12719.4255 12795.06 12643.791
5762 200406 12795.060 38663.148 12887.716 13224.297 12643.791
5762 200407 13224.297 39395.825 13131.9417 13376.468 12795.06
5762 200408 13376.468 26600.765 13300.3825 13376.468 13224.297
5763 200405 13060.433 26379.363 13189.6815 13318.93 13060.433
5763 200406 13318.930 40089.628 13363.2093 13710.265 13060.433
5763 200407 13710.265 41164.976 13721.6587 14135.781 13318.93
5763 200408 14135.781 27846.046 13923.023 14135.781 13710.265
5764 200405 12487.791 25782.978 12891.489 13295.187 12487.791
5764 200406 13295.187 39227.071 13075.6903 13444.093 12487.791
5764 200407 13444.093 40668.974 13556.3247 13929.694 13295.187
5764 200408 13929.694 27373.787 13686.8935 13929.694 13444.093
5765 200405 25057.736 51116.196 25558.098 26058.46 25057.736
5765 200406 26058.460 77418.077 25806.0257 26301.881 25057.736
5765 200407 26301.881 79490.979 26496.993 27130.638 26058.46
5765 200408 27130.638 53432.519 26716.2595 27130.638 26301.881
20 rows selected.
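A range frame selects rows by the value of the ordering key, while a rows frame selects them by position. The Python sketch below models both for a moving sum; range_window_sum and rows_window_sum are hypothetical helpers, and the second data set is invented to show what happens when the yyyymm key crosses a year boundary.

```python
def range_window_sum(points, i, preceding=1, following=1):
    """points: (order_key, value) pairs sorted by key.
    RANGE frame: rows whose key lies in [key_i - preceding, key_i + following]."""
    key = points[i][0]
    return sum(v for k, v in points if key - preceding <= k <= key + following)

def rows_window_sum(points, i, preceding=1, following=1):
    """ROWS frame: physical neighbours of row i, regardless of key values."""
    lo = max(0, i - preceding)
    return sum(v for _, v in points[lo:i + following + 1])

# Area 5761 from the output above: both frames agree within one year.
area_5761 = [(200405, 13060.433), (200406, 13318.93),
             (200407, 13710.265), (200408, 14135.781)]
same_year = range_window_sum(area_5761, 1)

# Invented data spanning a year boundary: 200412 and 200501 differ by 89.
cross_year = [(200411, 1), (200412, 2), (200501, 4), (200502, 8)]
by_range = range_window_sum(cross_year, 1)   # misses 200501
by_rows = rows_window_sum(cross_year, 1)     # includes 200501
print(same_year, by_range, by_rows)
```

Widening the window is a matter of changing the preceding and following arguments, which mirrors changing the bounds in the over clause.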
5. Introduction to ratio_to_report: computing each row's share directly
break on bill_month skip 1
compute sum of local_fare on bill_month
compute sum of area_pct on bill_month

select bill_month, area_code, sum(local_fare) local_fare,
       ratio_to_report(sum(local_fare)) over (partition by bill_month) area_pct
from t
group by bill_month, area_code;
BILL_MONTH AREA_CODE LOCAL_FARE AREA_PCT
---------- --------- ---------------- ----------
200405 5761 13060.433 .171149279
5762 12643.791 .165689431
5763 13060.433 .171149279
5764 12487.791 .163645143
5765 25057.736 .328366866
********** ---------------- ----------
sum 76310.184 1
200406 5761 13318.930 .169050772
5762 12795.060 .162401542
5763 13318.930 .169050772
5764 13295.187 .168749414
5765 26058.460 .330747499
********** ---------------- ----------
sum 78786.567 1
200407 5761 13710.265 .170545197
5762 13224.297 .164500127
5763 13710.265 .170545197
5764 13444.093 .167234221
5765 26301.881 .327175257
********** ---------------- ----------
sum 80390.801 1
200408 5761 14135.781 .170911147
5762 13376.468 .161730539
5763 14135.781 .170911147
5764 13929.694 .168419416
5765 27130.638 .328027751
********** ---------------- ----------
sum 82708.362 1
20 rows selected.
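What ratio_to_report computes is each value divided by the total of its partition. The following Python sketch reproduces the 200405 shares from the output above; the function is my own model and assumes every partition total is non-zero.

```python
from collections import defaultdict

def ratio_to_report(rows):
    """rows: list of (partition_key, item_key, value).
    Returns {(partition_key, item_key): value / partition_total}."""
    totals = defaultdict(float)
    for part, _, value in rows:
        totals[part] += value
    # Assumes non-zero totals; this sketch does not handle a zero sum.
    return {(part, item): value / totals[part] for part, item, value in rows}

rows = [
    ("200405", "5761", 13060.433),
    ("200405", "5762", 12643.791),
    ("200405", "5763", 13060.433),
    ("200405", "5764", 12487.791),
    ("200405", "5765", 25057.736),
]
pct = ratio_to_report(rows)
for key, share in pct.items():
    print(key, round(share, 9))
```

Within each partition the shares add up to 1, which is what the compute sum of area_pct line in the report confirms.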