`
oxidy
  • 浏览: 166009 次
  • 性别: Icon_minigender_1
  • 来自: 吉林长春
社区版块
存档分类
最新评论

在数据库中如何分类、分组并总计SQL数据[builder.com.cn]

SQL 
阅读更多

您需要了解如何使用某些SQL子句和运算符来安排SQL数据,从而对它进行高效分析。下面这些建议告诉您如何建立语句,获得您希望的结果。

以有意义的方式安排数据可能是一种挑战。有时您只需进行简单分类。通常您必须进行更多处理——进行分组以利于分析与总计。可喜的是,SQL提供了大 量用于分类、分组和总计的子句及运算符。下面的建议将有助于您了解何时进行分类、何时分组、何时及如何进行总计。欲了解每个子句和运算符的详细信息,请查 看在线书籍

#1 :分类排序

通常,我们确实需要对所有数据进行排序。SQL的ORDER BY子句将数据按字母或数字顺序进行排列。因此,同类数据明显分类到各个组中。然而,这些组只是分类的结果,它们并不是真正的组。ORDER BY显示每一个记录,而一个组可能代表多个记录。

#2 :减少组中的相似数据

分类与分组的最大不同在于:分类数据显示(任何限定标准内的)所有记录,而分组数据不显示这些记录。GROUP BY子句减少一个记录中的相似数据。例如,GROUP BY能够从重复那些值的源文件中返回一个唯一的邮政编码列表:

SELECT ZIP

FROM Customers

GROUP BY ZIP

仅包括那些在GROUP BY和SELECT列列表中字义组的列。换句话说,SELECT列表必须与GROUP列表相匹配。只有一种情况例外:SELECT列表能够包含聚合函数。(而GROUP BY不支持聚合函数。)

记住,GROUP BY不会对作为结果产生的组分类。要对组按字母或数字顺序排序,增加一个ORDER BY子句(#1)。另外,在GROUP BY子句中您不能引用一个有别名的域。组列必须在根本数据中,但它们不必出现在结果中。

#3 :分组前限定数据

您可以增加一个WHERE子句限定由GROUP BY分组的数据。例如,下面的语句仅返回肯塔基地区顾客的邮政编码列表。

SELECT ZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ZIP

在GROUP BY子句求数据的值之前,WHERE对数据进行过滤,记住这一点很重要。

和GROUP BY一样,WHERE不支持聚合函数。

#4 :返回所有组

当您用WHERE过滤数据时,得到的组只显示那些您指定的记录。符合组定义但不满足子句条件的数据将不会出现在组中。不管WHERE条件如何,如果 您想包括所有数据,增加一个ALL子句。例如,在前面的语句中增加一个ALL子句会返回所有邮政编码组,而不仅仅是肯塔基地区的组。

SELECT ZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ALL ZIP

照这个样子,这两个子句会造成冲突,您可能不会以这种方式使用ALL子句。当您用聚合求一个列的值时,应用ALL子句很方便。例如,下面的语句计算每个肯塔基邮政编码的顾客数目,同时显示其它邮政编码值。

SELECT ZIP, Count(ZIP) AS KYCustomersByZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ALL ZIP

得到的组由根本数据中的所有邮政编码值构成。但是,聚合列(KYCustomerByZIP)显示为0,因为除肯塔基邮政编码组外没有别的组。

远程查询不支持GROUP BY ALL。

#5 :分组后限定数据

WHERE子句(#3)在 GROUP BY子句之前求数据的值。当您希望在分组以后限定数据时,使用HAVING。通常,不管您使用WHERE还是HAVING,得到的结果相同。但要记住,这 两个子句不能互换,这点很重要。如果您存在疑问,这里有一条应用指南:过滤记录时使用WHERE;过滤组时使用HAVING。

一般,您会用HAVING,利用聚合来求一个组的值。例如,下面的语句返回一个邮政编码列表,但这个表内可能不包含根本数据源中的每个邮政编码:

SELECT ZIP, Count(ZIP) AS CustomersByZIP

FROM Customers

GROUP BY ZIP

HAVING Count(ZIP) = 1

仅仅那些只有一名顾客的组出现在结果中。

#6 :详细了解WHERE 和HAVING

如果您仍然对WHERE和HAVING的用法感到迷惑,应用下面的指导方法:

  • WHERE出现在GROUP BY之前;SQL在它分组记录前求WHERE子句的值。
  • HAVING出现在GROUP BY之后;SQL在它分组记录后求HAVING子句的值。

#7 :用聚合总计分组值

分组数据有助于对数据进行分析,但有时您还需要组本身以外的其它信息。您可以增加一个聚合函数来总计分组数据。例如,下面的语句为每次排序显示一个小计:

SELECT OrderID, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY OrderID

与其它的组一样,SELECT和GROUP BY列表必须相匹配。在SELECT子句中包含一个聚合是这一规则的唯一例外。

#8 :总计聚合

您可以通过显示每个组的小计进一步总计数据。SQL的ROLLUP运算符为每个组显示一个额外的记录,一个小计。那个记录是用聚合函数在每个组中求所有记录的值的结果。下面的语句为每个组合计OrderTotal列。

SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH ROLLUP

一个包含20和25这两个OrderTotal值的组的ROLLUP行将显示OrderTotal值45。ROLLUP结果的第一个值是唯一的,因为它求所有组记录的值。那个值是整个记录集的总和。

ROLLUP不支持聚合函数中的DISTINCT或GROUP BY ALL子句。

#9 :总计每一列

CUBE运算符比ROLLUP更进一步,它返回每个组中每个值的总数。得到的结果与ROLLUP相似,但CUBE包括组中每一列的一个额外记录。下面的语句显示每个组的小计和每名顾客的一个额外总数。

SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH CUBE

用CUBE得到的总计最为复杂。不仅完成聚合与ROLLUP的工作,而且还求定义组的其它列的值。也就是说,CUBE总计每一个可能的列组合。

CUBE不支持GROUP BY ALL。

#10 :给总计排序

当CUBE的结果杂乱无章时(一般都是这样),可以增加一个GROUPING函数,如下所示:

SELECT GROUPING(Customer), OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH CUBE

其结果包括每一行的两个额外的值。

  • 值1表明左边的值是一个总计值——ROLLUP或CUBE的运算符的结果。
  • 值0表明左边的值是一个原始GROUP BY子句产生的详细记录。

英文原版

10 tips for sorting, grouping, and summarizing SQL data

Takeaway: Arranging SQL data so that you can effectively analyze it requires an understanding of how to use certain SQL clauses and operators. These tips will help you figure out how to build statements that will give you the results you want.

This article is also available as a PDF download .

Arranging data in a manner that's meaningful can be a challenge. Sometimes all you need is a simple sort. Often, you need more--you need groups you can analyze and summarize. Fortunately, SQL offers a number of clauses and operators for sorting, grouping, and summarizing. The following tips will help you discern when to sort, when to group, and when and how to summarize. For detailed information on each clause and operator, see Books Online .

#1: Bring order with a sort

More often than not, all your data really needs is a little order. SQL's ORDER BY clause organizes data in alphabetic or numeric order. Consequently, similar values sort together in what appear to be groups. However, the apparent groups are a result of the sort; they aren't true groups. ORDER BY displays each record whereas a group may represent multiple records.

#2: Reduce similar values into a group

The biggest difference between sorting and grouping is this: Sorted data displays all the records (within the confines of any limiting criteria) and grouped data doesn't. The GROUP BY clause reduces similar values into one record. For instance, a GROUP BY clause can return a unique list of ZIP codes from a source that repeats those values:

SELECT ZIP

FROM Customers

GROUP BY ZIP

Include only those columns that define the group in both the GROUP BY and SELECT column lists. In other words, the SELECT list must match the GROUP BY list, with one exception: The SELECT list can include aggregate functions. (GROUP BY doesn't allow aggregate functions.)

Keep in mind that GROUP BY won't sort the resulting groups. To arrange groups alphabetically or numerically, add an ORDER BY clause (# 1). In addition, you can't refer to an aliased field in the GROUP BY clause. Group columns must be in the underlying data, but they don't have to appear in the results.

#3: Limit data before it's grouped

You can limit the data that GROUP BY groups by adding a WHERE clause. For instance, the following statement returns a unique list of ZIP codes for just Kentucky customers:

SELECT ZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ZIP

It's important to remember that WHERE filters data before the GROUP BY clause evaluates it.

Like GROUP BY, WHERE doesn't support aggregate functions.

#4: Return all groups

When you use WHERE to filter data, the resulting groups display only those records you specify. Data that fits the group's definition but does not meet the clause's conditions won't make it to a group. Include ALL when you want to include all data, regardless of the WHERE condition. For instance, adding ALL to the previous statement returns all of the ZIP groups, not just those in Kentucky:

SELECT ZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ALL ZIP

As is, the two clauses are in conflict, and you probably wouldn't use ALL in this way. ALL comes in handy when you use an aggregate to evaluate a column. For example, the following statement counts the number of customers in each Kentucky ZIP, while also displaying other ZIP values:

SELECT ZIP, Count(ZIP) AS KYCustomersByZIP

FROM Customers

WHERE State = 'KY'

GROUP BY ALL ZIP

The resulting groups comprise all ZIP values in the underlying data. However, the aggregate column (KYCustomersByZIP) would display 0 for any group other than a Kentucky ZIP.

Remote queries don't support GROUP BY ALL.

#5: Limit data after it's grouped

The WHERE clause (# 3) evaluates data before the GROUP BY clause does. When you want to limit data after it's grouped, use HAVING. Often, the result will be the same whether you use WHERE or HAVING, but it's important to remember that the clauses are not interchangeable. Here's a good guideline to follow when you're in doubt: Use WHERE to filter records; use HAVING to filter groups.

Usually, you'll use HAVING to evaluate a group using an aggregate. For instance, the following statement returns a unique list of ZIP codes, but the list might not include every ZIP code in the underlying data source:

SELECT ZIP, Count(ZIP) AS CustomersByZIP

FROM Customers

GROUP BY ZIP

HAVING Count(ZIP) = 1

Only those groups with just one customer make it to the results.

#6: Get a closer look at WHERE and HAVING

If you're still confused about when to use WHERE and when to use HAVING, apply the following guidelines:

  • WHERE comes before GROUP BY; SQL evaluates the WHERE clause before it groups records.
  • HAVING comes after GROUP BY; SQL evaluates HAVING after it groups records.

#7: Summarize grouped values with aggregates

Grouping data can help you analyze your data, but sometimes you'll need a bit more information than just the groups themselves. You can add an aggregate function to summarize grouped data. For instance, the following statement displays a subtotal for each order:

SELECT OrderID, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY OrderID

As with any other group, the SELECT and GROUP BY lists must match. Including an aggregate in the SELECT clause is the only exception to this rule.

#8: Summarize the aggregate

You can further summarize data by displaying a subtotal for each group. SQL's ROLLUP operator displays an extra record, a subtotal, for each group. That record is the result of evaluating all the records within each group using an aggregate function. The following statement totals the OrderTotal column for each group:

SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH ROLLUP

The ROLLUP row for a group with two OrderTotal values of 20 and 25 would display an OrderTotal of 45. The first record in a ROLLUP result is unique because it evaluates all of the group records. That value is a grand total for the entire recordset.

ROLLUP doesn't support DISTINCT in aggregate functions or the GROUP BY ALL clause.

#9: Summarize each column

The CUBE operator goes a step further than ROLLUP by returning totals for each value in each group. The results are similar to ROLLUP, but CUBE includes an additional record for each column in the group. The following statement displays a subtotal for each group and an additional total for each customer:

SELECT Customer, OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH CUBE

CUBE gives the most comprehensive summarization. It not only does the work of both the aggregate and ROLLUP, but also evaluates the other columns that define the group. In other words, CUBE summarizes every possible column combination.

CUBE doesn't support GROUP BY ALL.

#10: Bring order to summaries

When the results of a CUBE are confusing (and they usually are), add the GROUPING function as follows:

SELECT GROUPING(Customer), OrderNumber, Sum(Cost * Quantity) AS OrderTotal

FROM Orders

GROUP BY Customer, OrderNumber

WITH CUBE

The results include two additional values for each row:

  • The value 1 indicates that the value to the left is a summary value--the result of the ROLLUP or CUBE operator.
  • The value 0 indicates that the value to the left is a detail record produced by the original GROUP BY clause.
分享到:
评论

相关推荐

    reportbuilder.pdf

    通过这项功能,用户可以在报表中嵌入地图,并在地图上显示地理位置相关的数据。此外,还可以与 Microsoft Virtual Earth 结合使用,实现更为生动的数据可视化。 ##### 5.2 创建地图 创建地图的过程涉及几个步骤:...

    PowerBuilder9.0数据窗口技术

    在PowerBuilder 9.0版本中,数据窗口仍然是核心组件,用于构建用户界面并处理与数据库的交互。下面将详细介绍数据窗口技术及其在PowerBuilder 9.0中的应用。 1. **数据窗口的概念** 数据窗口是PowerBuilder的一个...

    Sybase 公司的 datawindow.net 应用

    数据窗口(DataWindow)是PowerBuilder中的一个核心组件,由Sybase公司开发,它提供了一种强大而灵活的方式来处理数据库查询、展示数据以及与用户进行交互。DataWindow.NET是这个概念在.NET框架下的延伸,旨在为.NET...

    在powerbuilder中自定义报表,自由查询。

    PowerBuilder的数据窗口还提供了许多高级特性,如子数据窗口、嵌套查询、SQL脚本支持等,这使得在PowerBuilder中创建复杂报表和实现高级查询成为可能。 总的来说,在PowerBuilder中自定义报表和自由查询是一项关键...

    Rave 报表结合数据库的使用实例..rar

    通过实践这些步骤并参考提供的示例,你将能够熟练地在Delphi项目中运用Rave报表处理和展示数据库数据。在实际开发过程中,还可以探索更多高级特性,如交互式报表、自定义事件处理和自定义函数,以满足更复杂的需求。

    PowerBuilder 8.0实用程序设计100例

    - 学习如何在PowerBuilder中处理数据库事务,确保数据的一致性和完整性。 - 实现错误捕获和处理机制,提升程序的健壮性。 6. **报表与打印功能** - 利用数据窗口的报告模式创建复杂的报表,包括多列、分页、总计...

    delphi report builder小实例

    2. **数据源设置**: 在Report Builder中,你需要定义数据源,这可以是数据库(如SQL Server、Oracle、MySQL等)、内存表格或者其他数据提供者。数据源的配置涉及连接字符串、用户名、密码以及查询或表名。 3. **...

    PowerBuilder9数据窗口技术

    数据窗口包含多个对象,如Item(项)表示单个数据字段,Column(列)定义字段的显示属性,Filter(过滤器)用于筛选数据,Group(分组)用于数据分组,Summary(汇总)则用于计算总计或平均值等统计信息。...

    PowerBuilder9_0数据窗口技术

    在PowerBuilder 9.0版本中,数据窗口作为核心组件之一,提供了一种高效、灵活的方式来处理和展示数据库中的数据。数据窗口不仅能够简化数据的访问,还能进行复杂的查询、排序、过滤以及格式化,使得开发者可以更专注...

    SQL2008. reporting Services 示例

    1. **数据报表设计**:用户可以使用 Report Builder 或 SQL Server Data Tools (SSDT) 设计复杂的报表,支持多种数据源,如 SQL Server、Oracle、Excel 等。报表设计包括表格、图表、矩阵、图像、地图等多种元素,可...

    《PowerBuilder 9.0课程设计案例精编》源代码.

    这本书的核心在于通过一系列精心设计的案例,帮助读者深入理解PowerBuilder的工作原理,提高编程技能,并在实际项目中应用这些知识。源代码是学习过程中不可或缺的部分,它提供了直接操作和学习的机会,让理论与实践...

    powerbuilder实用技巧

    PowerBuilder是一款强大的数据库应用开发工具,尤其在20世纪90年代至21世纪初,它在企业级应用开发领域有着广泛的应用。本压缩包文件包含的“powerbuilder实用技巧”是一系列关于PB编程的经验分享,对于正在学习或...

    使用数据窗口控件.rar

    数据窗口控件是PowerBuilder中的核心组件之一,它在企业级应用开发中扮演着至关重要的角色。本章节将深入探讨如何有效地使用数据窗口控件,包括其基本概念、功能特性、设计方法以及实际操作技巧。 数据窗口控件是...

    Power builder 9.0精彩变成百例(源代码)

    1. **数据窗口(DataWindow)**:PowerBuilder的核心组件之一,用于显示、编辑、检索和管理数据库中的数据。通过DataWindow,开发者可以快速创建各种报表和界面,支持多种数据库访问方式,如SQL Select、存储过程等...

    powerbuilder源代码——自来水收费系统

    在IT行业中,PowerBuilder是一款强大的数据库应用程序开发工具,尤其在企业级应用开发中具有广泛的应用。本篇文章将深入探讨使用PowerBuilder开发自来水收费系统的相关知识点,旨在帮助开发者理解如何利用这一工具...

    report builder

    还会介绍如何组织和分组数据,以及添加计算字段和总计。 4. **查询和数据源**:详细阐述如何编写SQL查询来选择和过滤数据,以及如何连接多个数据源。可能还会涉及子查询、联接和聚合函数的使用。 5. **数据绑定**...

    Report Builder 3.0帮助文件

    - **分组和汇总**: 可以对数据进行分组,并进行总计、平均、最大值、最小值等统计计算。 2. **报表设计** - **样式和格式化**: 用户可以调整字体、颜色、边框等样式,以提升报表的视觉效果。还可以设置单元格的...

    PB9.0数据窗口技术

    - 编辑(Edit):用户可以直接在数据窗口中编辑数据,更改后的数据会自动提交到数据库。 - 排序(Sort):支持对数据进行升序或降序排列。 - 分组(Group):根据一个或多个字段对数据进行分组,便于聚合计算。 ...

    SQL2008_Report_Services基础教程

    为了能够顺利进行后续步骤,首先需要下载并安装SQL Server 2008 示例和示例数据库。 - **下载地址**: [http://www.CodePlex.com/MSFTDBProdSample](http://www.CodePlex.com/MSFTDBProdSample) - **安装说明**: ...

Global site tag (gtag.js) - Google Analytics