- 浏览: 90372 次
- 性别:
- 来自: 深圳
文章分类
- 全部博客 (81)
- 读书笔记 (14)
- NetBeans学习 (1)
- JavaBeans and Bean Events (3)
- 《Pro Oracle SQL》Chapter 2 SQL Execution (13)
- 《Pro Oracle SQL》Chapter 3 Access and Join Methods (16)
- Pro Oracle SQL Chapter 5 (0)
- Pro Oracle SQL Chapter 6 (0)
- Pro Oracle SQL Chapter 7 (9)
- Pro Oracle SQL Chapter 8 (9)
- 《Pro Oracle SQL》Chapter 9 The Model Clause (11)
- 《Pro Oracle SQL》Chapter 10 Subquery Factoring (7)
最新评论
-
mojunbin:
这个不能不顶。
《Pro Oracle SQL》 Chapter2--2.1 Oracle Architecture Basics -
Branding:
谢谢,获益匪浅
《Pro Oracle SQL》--chapter 5--5.6 Building Logical Expressions -
Branding:
《Pro Oracle SQL》--Chapter 5--5.4 Questions about the Question -
Branding:
谢谢
《Pro Oracle SQL》 翻译序 -- 读书心得 -
jiaoshiguoke:
继续 加油
《Pro Oracle SQL》--Chapter 6--6.1 Explain Plans--之三
《Pro Oracle SQL》Chapter 8 -- 8.7 Other Analytic Functions 之二
- 博客分类:
- Pro Oracle SQL Chapter 8
The analytic function ratio_to_report calculates the ratio of a value to the sum of values in the data
partition. If the partitioning clause is not specified, this function calculates the ratio of a value to the sum
values in the whole result set. This analytic function is very useful in calculating ratios at various levels
without a need for self-joins.
Ratio_to_report is useful in computing the percentage of a value compared to the total value in a
report. For example, consider a sales report of a product in a retail chain. Each outlet in the retail chain
contributed to the total sum of sales computed for that product , and knowing what percentage of sales is generated from an outlet is quite useful for market trend analysis. Ratio_to_report allows you to
compute the percentage easily. Further, this ratio can be calculated at various levels such as district,
region, and country. Essentially, data can be diced and sliced at various ways for market trend analysis.
Ratio_to_report 在报告中计算某数值相对于总数的百分比是有用的。 例如,考虑一零售链上某产品的销售报告。在零售链中的每一分店某款产品的销售额汇总成了那款产品的总销售金额 ,知道某一分店的销售额百分比,对于市场趋势分析是非常有用的。Ratio_to_report使得你计算百分比非常容易。再者,这个比率能在各种层次上计算,诸如district(地区),region(区域), and country(国家). 本质上,数据可以用各种方式切割 以利于市场趋势分析。
product, country, region, and year level and the ratio sales_ratio_prod is computed at the product,
country, and region level. Ratio_to_report function returns a ratio and it is multipled by 100 to compute
percentage.
The ratio_to_report(sale) over( partition by product, country, region, year) clause
calculates the ratio of Sale column value to the sum of Sale column values in a data partition, partitioned
by the columns Product, Country, Region, and Year. The next clause ratio_to_report(sale) over(
partition by product, country, region) is different as the Year column is not included in the
partitioning columns. So, the ratio is calculated for all years.
The ratio_to_report function will return a null value if the expression or column specified in the
function returns null values. But other null values in the data partition will be handled as either zero
values or empty strings, similar to aggregation functions.
1 select year, week,sale,
2 trunc(100*
3 ratio_to_report(sale) over(partition by product, country, region ,year)
4 ,2) sales_yr,
5 trunc(100*
6 ratio_to_report(sale) over(partition by product, country, region)
7 ,2) sales_prod
8 from sales_fact
9 where country in ('Australia') and product ='Xtend Memory'
10 order by product, country,year, week
/
YEAR WEEK SALE SALES_YR SALES_PROD
---- ---- ---------- ---------- ----------
2000 1 46.70 1.31 .35
2000 3 93.41 2.63 .7
2000 4 46.54 1.31 .35
2000 5 46.70 1.31 .35
2000 7 70.80 2 .53
2000 8 46.54 1.31 .35
...
Percent_rank
The percent_rank function returns the rank of a value in a data partition, expressed as a fraction between 0 and 1. Percent_rank is calculated as (rank -1)/(N-1) whereas N is the number of elements in the data partition if the partitioning clause is specified or the total number of rows in the result set if the
partitioning clause is not specified. The percent_rank function is useful to compute the relative standing
of a value in a result set, as a percentile.
percent_rank 返回数据分区中的秩(rank)值,用一个0到1之间分数表示。percent_rank的计算公式是(rank -1)/(N-1),这里的N,如果分区子句指定了则是数据分区中的元素数量,若没有指定分区子句则是结果集中的所有行数。 percent_rank函数 在计算一个结果集中的值的相对位置时是有用的,用百分率表示。
This relative rank can be calculated relative to a partition or the whole result set. For example,
computing the sales percentile of a retail outlet in a district or region helps find the top performing
outlets or the worst performing outlet.
相对秩能相对于分区或整个结果集来计算。例如,计算在一地区或区域某零售商店的销售百分比有助于找出业绩最佳和最差的商店。
In Listing 8-15, you calculate the top fifty sale percentile by year using the percent_rank function.
The clause percent_rank() over(partition by product, country, region , year order by sale desc)
calculates the percent rank of the Sale column in a data partition defined by the partitioning columns
Product, Country, Region, and Year. Rows are ordered by the Sale column in descending order. Function
output is multiplied by 100 to compute percentile.
在列表8-15中,你使用percent_rank函数按年计算前50名销售额的百分比值。子句percent_rank() over(partition by product, country, region , year order by sale desc) 计算分区列所定义数据分区中Sale列的百分比秩。
Listing 8-15. percent_rank Function
1 select * from (
2 select year, week,sale,
3 100 * percent_rank() over(
4 partition by product, country, region, year
5 order by sale desc
6 ) pr
7 from sales_fact
8 where country in ('Australia') and product ='Xtend Memory'
9 ) where pr <50
10* order by year, sale desc
YEAR WEEK SALE PR
----- ---- ---------- -------
2001 16 278.44 .00
2001 4 256.70 2.27
2001 21 233.70 4.55
2001 48 182.96 6.82
...
Percentile_cont
The percentile_cont function is useful to compute the interpolated values, such as the median
household income per region or city. The percentile_cont function takes a probability value between 0
and 1 and returns an interpolated percentile value that would equal the percent_rank value with respect
to the sort specification. In fact, percentile_cont function performs inverse of percent_rank function
and it is easier to understand the percentile_cont function in conjunction with the output of
percent_rank function.
percentile_cont函数用于计算内插值,诸如每一区域或城市家庭收入的中位数。percentile_cont函数接收一个0到1的概率值, 返回一内插的百分比值,它等于percent_rank相对于排序规范的值。事实上,percentile_cont函数是 percent_rank函数的倒数,结合percent_rank函数的输出就容易理解percentile_cont函数了。
The percentile_cont function retrieves the column value matching (or interpolated) with the
percent_rank of the argument. For example, the clause percentile_cont(0.25) retrieves the value that
has percent_rank of 0.25, assuming matching sort order for these two functions. Another example is
computing the median household income in a city or region. The median value will have a percent_rank
of 0.5 by the definition of median value. The clause percentile_cont(0.5) will return the median value
as the percentile_cont function is calculating the value with a percent_rank of 0.5. In fact, median
function is a specific case of the percentile_cont function with a default value of 0.5.
percentile_cont函数检索匹配(或者差值替换)percent_rank参数的列值。例如,子句 percentile_cont(0.25) 检索percent_rank为0.25的值,假定这两个函数的排序顺序一致。另一例子是计算某一城市或区域家庭收入的中位数。依照中位数值的定义中位数对应的percent_rank值是0.5。 子句 percentile_cont(0.5) 将返回中位数,因为percentile_cont函数计算的是percent_rank为0.5的值。事实上,median函数是 percentile_cout 函数默认值为0.5的特例。
Nulls are ignored by the function. This function does not support windowing clauses either.
percentile_cont函数忽略Null。该函数也不支持开窗子句。
Syntax for the percentile_cont function is: percentile_cont 函数的语法是:
Percentile_cont(expr) within group (sort-clause)
over (partition-clause order-by-clause)
The syntax for the percentile_cont function is slightly different from the analytic functions
discussed so far. A new clause within group (order by sale desc) replaces the order-by clause, and it
is functionally same as specifying an order-by clause. In Listing 8-16, the clause percentile_cont (0.5)
within group (order by sale desc) over( partition by product, country, region , year) is calling
the percentile_cont function and passing a probability value of 0.5. Sort order is defined by the clause
within group (order by sale desc) . The partition-by clause over( partition by product, country,
region , year) is specifying the partitioning columns.
percentile_cont函数的句法不同于现在为止所讨论的那些分析函数。新子句group (order by sale desc)代替了order-by子句,且它的功能和order-by子句一样。 列表8-16中,子句percentile_cont (0.5) within group (order by sale desc) over( partition by product, country, region , year) 调用percentile_cont函数传入概率值0.5。排序顺序由子句group (order by sale desc) 定义。分区(partition-by)子句 over( partition by product, country, region , year) 指定分区列。
Listing 8-16 shows the output of percent_rank in a side-by-side comparison to that from
percentile_cont , with a similar partition-by clause and order-by clause. Notice that for the column
values year=2001 and week=5, the Sale column value is 93.44 and the percent_rank of that value is 0.5.
Essentially, value of 93.44 is occurring with a percent_rank of 0.5 in the descending order of Sale column
values in that data partition. In a nutshell, the value of 93.44 is a median value and thus the percent_rank is 0.5. Hence the percent_rank function with an argument of 0.5 returns a value of 93.44.
列表8-16展示了,带有相似分区(partition-by)子句和排序(order-by)子句的,percent_rank一对一 percentile_cont 的输出。注意列值year=2001 and week=5,Sale列值是93.44,percent_rank值是0.5 。实质上,值93.44出现在那个数据分区降序排列的Sale列值的percent_rank为0.5处。简而言之,值93.44是中位数且 percent_rank是0.5。因此percent_rank函数带有参数0.5返回的是值93.44 。
Further, note that the output row for the column values with year=2000. There is no Sale column
value with a percent_rank matching exactly to 0.5 in the data partition. If there is no value matching
exactly, then the percentile_cont function computes an interpolated value using the nearest values.
Note that there is a row in that data partition with a percent_rank of 0.48 for the Sale column value of
79.36 and the next row in that sort order has a percent_rank of 0.51 for the Sale column value of 78.82.
Since the specified percent_rank of 0.5 is between 0.48 and 0.51, the percentile_cont function
interpolated these two corresponding Sale column values 79.36 and 78.82 and calculated
percentile_cont (0.5) as 79.09, an average of those two Sale column values. Values are averaged as this
function assumes continuous distribution.
还有,注意列值为year=2000所在输出行。在数据分区中没有percent_rank准确匹配0.5的Sale列值。如果没有准确的匹配值,则 percentile_cout函数计算一个最接近的内插值。 注意在数据分区中有一行的percent_rank值是0.48和Sale列值为79.36 且顺序的下一行的percent_rank值是0.51和Sale列值是78.82。因此指定的percent_rank值0.5介于0.48和0.51 之间,percentile_cont函数内插于这两者对应的Sale列值79.36和78.82中且计算percentile_cont (0.5) 是79.09,这两个Sale列值的平均值。因为函数假定是连续分布所以取的平均值。
Notice that output rows are not sorted in the Listing 8-16. Reason is that, even though there is an
order-by clause specified in the analytic function specification (line 3 and line 7), there is no order-by
clause in the main body of the query. Should you need rows to be sorted, you need to specify sorting
order explicitly in the main body of the query also.
注意在列表8-16中输出行没有排序。因为,即使在分析函数规范中指定order-by子句(行3和行7),在查询的主体中没有order-by子句。如果你需要行有序,就需要在查询主体中明确的指定排序顺序。
Listing 8-16. The percentile_cont Function
1 select year, week,sale,
2 percentile_cont (0.5) within group
3 (order by sale desc)
4 over( partition by product, country, region , year ) pc,
5 percent_rank () over (
6 partition by product, country, region , year
7 order by sale desc ) pr
8 from sales_fact
9* where country in ('Australia') and product ='Xtend Memory'
YEAR WEEK SALE PC PR
---- ---- ---------- ---------- ----------
...
2001 27 94.48 93.44 .454545455
2001 46 93.58 93.44 .477272727
2001 5 93.44 93.44 .5
2001 37 93.16 93.44 .522727273
2001 9 92.67 93.44 .545454545
...
2000 40 89.56 79.09 .435897436
2000 28 88.96 79.09 .461538462
2000 38 79.36 79.09 .487179487
2000 35 78.82 79.09 .512820513
2000 7 70.80 79.09 .538461538
2000 15 70.47 79.09 .564102564
...
Percentile_disc
The percentile_disc function is functionally similar to percentile_cont
except that the percentile_cont function uses a continuous distribution
model and
the percentile_disc function assumes a discrete distribution
model.
As discussed in the earlier section, when there is no value
matching exactly with the specified percent_rank, then the
percentile_cont(0.5) computes an average of two nearest values. In
contrast, the percentile_disc function retrieves the value with a
percent_rank just greater than the passed argument, in the case of
ascending order. In the case of descending order, the percentile_cont
function retrieves the value that has a percent_rank just smaller than
the passed argument.
percentile_dise函数在功能上相似于percentile_cont函数,只不过percentile_cont函数使用连续分布模型而
percentile_disc函数假定离散分布模型。
正如前面一节讨论的,当没有值准确的匹配指定的percent_rank,则
percentile_cont(0.5)计算两个最接近的平均值。相比而言,若升序percentile_disc函数检索其percent_rank
大于传入参数的值。若降序,percentile_disc函数检索其percent_rank小于传入参数的值。
In Listing
8-17, the percentile_cont function is replaced by two calls to the
percentile_disc function.
The first call to the function starting in
line 2 specifies decending sort order, and the next call in line 4
specifies no sort order, so it defaults to ascending sort order. In both
calls to the percentile_disc function, and argument of 0.5 is passed.
as there is no row with a percent_rank of 0.5, the percentile_disc
function with the descending sort order specification returns a value of
79.36, as this value has a percent_rank of 0.48 just below the
specified argument 0.5. For the ascending order, this function return a
value of 78.82 as this value has a percent_rank of 0.51 ---- just above
0.5 .
在列表8-17中,percentile_cont
函数被两次percentile_disc函数的调用所取代。
第一次函数调用开始于第2行,指定为降序。下一次调用在第4行没有指定排序顺序,所以取默认
值升序排列。两次percentile_disc函数调用,传入的参数都是0.5.
因为没有任何行的percent_rank为0.5,指定降序排列的percentile_disc函数返回值79.36,因为这个值的
percent_rank是0.48低于指定的参数0.5。对于升序,函数返回值78.82该值的percent_rank是0.51 ----
正好高于0.5 。
Listing 8-17. The percentile_disc Function
1 select year, week, sale,
2 percentile_disc(0.5) within group( order by sale desc )
3 over( partition by product, country, region, year) pd_desc,
4 percentile_disc(0.5) within group( order by sale )
5 over( partition by product, country, region, year) pd_asc,
6 percent_rank() over(
7 partition by product, country, region, year
8 order by sale desc ) pr
9 from sales_fact
10* where country in ('Australia') and product = 'Xtend Memory'
YEAR WEEK SALE PD_DESC PD_ASC PR
---- ---- ---------- ---------- ---------- ----------
2000 3 93.41 79.36 78.82 .41025641
2000 40 89.56 79.36 78.82 .435897436
2000 28 88.96 79.36 78.82 .461538462
2000 38 79.36 79.36 78.82 .487179487
2000 35 78.82 79.36 78.82 .512820513
2000 7 70.80 79.36 78.82 .538461538
2000 15 70.47 79.36 78.82 .564102564
2000 45 67.62 79.36 78.82 .58974359
2000 52 67.45 79.36 78.82 .615384615
发表评论
-
《Pro Oracle SQL》Chapter8--8.9 Advanced topics
2012-03-05 21:41 1102Advanced topics 高级 ... -
《Pro Oracle SQL》Chapter8--8.8 Performance Tuning
2012-03-02 23:11 705Performance Tuning 性能调优 ... -
《Pro Oracle SQL》Chapter 8 -- 8.7 Other Analytic Functions 之三
2012-02-29 22:09 1020NTILE (page 263) ... -
《Pro Oracle SQL》Chapter 8 -- 8.7 Other Analytic Functions 之一
2012-02-26 14:35 1055Other Analytic Functions ... -
《Pro Oracle SQL》Chapter 8--8.6 First_ value & Last_value
2012-02-21 23:52 897First_value & Last_value ... -
《Pro Oracle SQL》Chapter 8 -- 8.5 Lead and Lag
2012-02-19 16:07 1040Lead and Lag (page 2 ... -
《Pro Oracle SQL》Chapter 8 -- 8.4Aggregation Functions
2012-02-17 01:47 1007Aggregation Functions 聚合 ... -
《Pro Oracle SQL》Chapter 8 -- 8.1-8.2 Anatomy of Analytic Functions -8.3
2012-02-11 11:16 1151Riyaj Shamsudeen (page 243) ...
相关推荐
Pro Oracle SQL unlocks the power of SQL in the Oracle Database—one of the most potent SQL implementations on the market today. To master it requires a three-pronged approach: learn the language ...
本课程"Advanced-SQL-MYSQL-for-Analytic-Business-Intelligence"深入探讨了如何利用高级SQL技术与MySQL的强大功能来提升数据分析能力,以支持更高效、更精准的商业决策。 首先,课程会介绍SQL的基本语法,包括数据...
根据给定的文件信息,我们可以深入探讨与Vertica数据库及其SQL参考手册相关的多个关键知识点。 ### SQL Reference Manual -- Vertica #### 1. **Vertica® Analytic Database 3.5 Preview** - **版本**: 此处提及...
二、Oracle SQL高级特性 1. 分区(Partitioning):Oracle支持基于不同方式(如范围、列表、哈希等)的表分区,以提高查询性能和管理大规模数据。 2. 索引(Indexes):索引能加快查询速度,Oracle支持B树、位图、...
### Oracle Hyperion Essbase - 存储分析数据 #### 一、Essbase 数据存储选项 在 Oracle Hyperion Essbase 中,提供了两种主要的数据存储选项:块存储(Block Storage)和聚合存储(Aggregate Storage)。这两种...
- **标准接口**:支持通过 ODBC、JDBC、ODP.NET、OCI 和 Pro*C/C++ 等多种方式访问 SQL 和 PL/SQL。 ### 关键优势 #### 实时性能 (Real-Time Performance) - **改变数据存储位置的假设**:与传统基于磁盘的数据库...
4. **窗口函数(Analytic Functions)**:Oracle的窗口函数允许你在结果集的特定“窗口”内执行计算,如RANK()、ROW_NUMBER()、LAG()、LEAD()等,它们在数据分析和排名问题中非常有用。 5. **集合操作(UNION、...
### ORACLE SQL性能调整 #### 1. 选用适合的ORACLE优化器 ORACLE提供了三种优化器选项:基于规则的优化器(RULE)、基于成本的优化器(COST)和选择性优化器(CHOOSE)。默认情况下,ORACLE使用CHOOSE优化器,其...
分析函数(Analytic Functions)是Oracle SQL中的高级特性,它们在数据集上执行计算,并返回基于分组或排序的数据结果。与聚合函数(如SUM, AVG, COUNT等)不同,分析函数可以在每个行级别上返回结果,而不只是返回...
首先需要获取AnalyticDB PostgreSQL版连接信息 获得AnalyticDB连接串. 可以加入下方钉钉群1元试用. 运行demo cd $PROJ_DIR pip install -r requirements.txt cp config/config_template config/config.yml # 将链接...
serving informational or analytic needs. ■■ The supporting technology for operational processing is fundamentally different from the technology used to support informational or analytical needs. ■...
大规模对数分析和 DRS-DPM 使用 Logstash 和 MongoDB 进行大规模日志分析第 1 部分:从数据中心的 VM 收集实时日志。 这些日志使用 Logstash 结构化、解析并转发到 MongoDB。 使用谷歌图表和 Tableau 可视化的日志。...
Oracle SQL优化是一个重要的数据库管理技能,对于提升数据库性能和系统响应速度至关重要。本文将深入探讨Oracle SQL优化的几个关键方面,以帮助初学者理解和实践。 首先,了解Oracle的优化器是优化过程的基础。...
Oracle JDK >= 1.7.x Apache Maven >= 3.0.5 克隆这个 repo 并作为一个现有的 Maven 项目导入到 Eclipse IDE 或 IntelliJ IDEA。 此应用程序使用使生活变得简单,同时使用集合和其他通用内容。 此应用程序还...
藏经阁-StreamSets and Spark_ Analytic.pdf
- **分析函数(Analytic Functions)**:如ROW_NUMBER()、RANK()等,用于处理分组数据或执行复杂的数据排序和窗口运算。 - **分组函数(Grouping Functions)**:如GROUP_CONCAT,用于组合多行数据成一个字符串。 - *...
程序员考试刷题 分析计算旧金山大学的 MSDS 501 秋季 2020 课程说明 本课程是该课程的一部分,专门设计用于促进那些还不是熟练程序员的人提高计算机编程技能。 本课程将侧重于与数据科学直接相关的应用计算机编程...