`
hot66hot
  • 浏览: 457262 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

How to select the first least max top N rows from each group in SQL

阅读更多
从分组中选择极值查询/前N项查询 是经常会遇到的问题 ,下面通过简单举例展示这种SQL的写法

举例表
type variety price
apple gala 2.79
apple fuji 0.24
apple limbertwig 2.87
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33


Selecting the one minimum row from each group

期望结果
type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55



方法一 通过分组子查询实现

select f.type, f.variety, f.price
from (
select type, min(price) as minprice
from fruits group by type
) as x inner join fruits as f on f.type = x.type and f.price = x.minprice;


结果:

type variety price
apple fuji 0.24
cherry bing 2.55
orange valencia 3.59
pear bartlett 2.14


方法二 通过关联子查询实现


select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type);


结果:

type variety price
apple fuji 0.24
orange valencia 3.59
pear bartlett 2.14
cherry bing 2.55


以上两个查询是等价的.

Select the top N rows from each group

每组前N个查询是比较痛苦的问题,因为聚合函数只返回一个值,所以通过聚集函数分组查询前几个数据是不可能的.

比方说,我要选择每个类型最便宜的两个水果。
可以通过变换SQL写法实现

方法一:

select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type)
or price = (select min(price) from fruits as f where f.type = fruits.type
and price > (select min(price) from fruits as f2 where f2.type = fruits.type));


结果:
type variety price
apple gala 2.79
apple fuji 0.24
orange valencia 3.59
orange navel 9.36
pear bradford 6.05
pear bartlett 2.14
cherry bing 2.55
cherry chelan 6.33

这种大量子查询方法性能很差,如果是前3个,前4个等等 这种查询变得不可实现


方法二 从每个品种的水果,品种不超过第二便宜的查询,通过关联子查询实现

select type, variety, price
from fruits
where (
 select count(1) from fruits as f
 where f.type = fruits.type and f.price < fruits.price
) <= 2;


第二种方法在fruits表很大时效果不佳

方法三

可以使用union all 实现 (union all 与 union 的区别是 前者不会通过排序消除重复)

(select * from fruits where type = 'apple' order by price limit 2)
union all
(select * from fruits where type = 'orange' order by price limit 2)
union all
(select * from fruits where type = 'pear' order by price limit 2)
union all
(select * from fruits where type = 'cherry' order by price limit 2)


如果分组(这里是水果种类)数量不大/分页的情况下,可以使用union把数据切成多段分开查询,在好的索引支持下效率很高.

(测试中发现如果去掉每段查询的圆括号则limit 限制整个结果集的返回行数 而不是每段.)

使用union all 联合查询是解决N+1 问题的利器(特别是1 - n 中 n方数据量特别大时),不过JPA 框架在处理union all 查询时有bug (悲催啊)

h3. 实际项目使用情况:

   CRM中商家页面 商家与分店是典型的N+1查询问题,由于有些商家分店数比较多,页面中只展示前3个分店,此时可以使用union all 查询优化 , 通过一条SQL 返回所有商家的前3家分店.

实例参考自Baron Schwartz的博客:http://www.xaprb.com/blog
分享到:
评论

相关推荐

    微软内部资料-SQL性能优化5

    In a nonclustered index, the leaf level contains each index key, plus a bookmark that tells SQL Server where to find the data row corresponding to the key in the index. A bookmark can take one of two ...

    Expert T-SQL Window Functions in SQL Server(Apress,2015)

    Expert T-SQL Window Functions in SQL Server takes you from any level of knowledge of windowing functions and turns you into an expert who can use these powerful functions to solve many T-SQL queries....

    微软内部资料-SQL性能优化3

    An intent lock indicates that SQL Server wants to acquire a shared (S) lock or exclusive (X) lock on some of the resources lower down in the hierarchy. For example, a shared intent lock placed at the ...

    SQL.and.Relational.Theory.How.to.Write.Accurate.SQL.Code.3rd.Edition

    Could you write an SQL query to find employees who have worked at least once in every programming department in the company? And be sure it’s correct? Why is proper column naming so important? Nulls ...

    SAS.9.2.SQL.Procedure.Users.Guide

    The SELECT statement is the core of any SQL query and is used to retrieve data from one or more tables. The basic structure of a SELECT statement includes specifying the columns to be retrieved and ...

    MMATrans.rar_Grid map_The First_z map_z-map

    These are the rows of Z values of the grid, organized in row order. Each row has a constant Y coordinate. Grid row 1 corresponds to ylo and the last grid row corresponds to yhi. Within each row, the ...

    Foundations for Analytics with Python O-Reilly-2016-Clinton W. Brownley

    The example also shows how to store information separately in both a list and a dictionary in order to create the header row and the data rows for the output file. This is a reminder that you can ...

    Beginning T-SQL with Microsoft SQL Server 2005 and 2008

    Transact-SQL, or T-SQL, is Microsoft Corporation’s powerful implementation of the ANSI standard SQL database query language, which was designed to retrieve, manipulate, and add data to relational ...

    Introducing SQL Server(Apress,2015)

    You'll learn how databases work and how to use the T-SQL language by practicing on one of the most widely-used and powerful database engines in the corporate world: Microsoft SQL Server. Do you ...

    sqlServer删除重复数据

    DELETE FROM tableName WHERE id NOT IN (SELECT TOP 1 id FROM tableName a WHERE name = tableName.name) ``` 或 ```sql DELETE FROM tableName WHERE id NOT IN (SELECT MAX(id) FROM tableName WHERE name = ...

    SSD7 选择题。Multiple-Choice

    (b) placing SQL keywords, such as select, under the column names they want to retrieve (c) typing a syntactically correct SQL query that uses column and table names similar to the correct column and...

    七种数据库中Select Top的用法

    SELECT * FROM table_name ORDER BY column_name DESC FETCH FIRST 10 ROWS ONLY; ``` #### 七、MS Access 中的 TOP 在 MS Access 中,同样支持 `TOP` 关键字。 ```sql SELECT TOP 10 * FROM table_name ORDER BY...

    The Triangle

    The first line contains one integer N: the number of rows in the triangle. The following N lines describe the data of the triangle. The number of rows in the triangle is &gt; 1 but &lt;= 100. The numbers ...

    v$sqlarea top sql 查询举例

    根据提供的文件信息,本文将详细解释如何通过 `v$sqlarea` 视图查询实例中的 Top SQL,以便快速定位性能瓶颈或资源消耗高的 SQL 语句。以下内容将深入解析每种查询方法及其背后的原理。 ### 知识点一:通过 Buffer ...

    用SQL删除重复记录的N种方法

    根据提供的标题、描述以及部分内容,本文将详细介绍使用SQL删除数据库表中的重复记录的不同方法。在实际操作中,去除重复数据是数据库管理中的常见需求之一,尤其是当数据来源于多个不同的源或者由于历史原因导致...

    Sybex - Mastering Visual Basic .NET Transact SQL (VBL).pdf

    ment, which allows you to calculate totals on groups of the rows retrieved from the database. This statement looks and feels very much like the straight SQL statements, yet it’s not part of standard ...

    oracle advanced sql 高级SQL教程 ORACLE官方教材

    Walking the Tree: From the Top Down 5-9 Ranking Rows with the LEVEL Pseudocolumn 5-10 Formatting Hierarchical Reports Using LEVEL and LPAD 5-11 Pruning Branches 5-13 Summary 5-14 Practice 5 Overview 5...

    删除表中重复数据sql语句

    SELECT * FROM #Tmp WHERE autoID IN (SELECT autoID FROM #tmp2); ``` 以上SQL语句提供了多种处理表中重复数据的方式,可以根据实际情况选择最合适的方案进行应用。在实际操作中,建议先在测试环境中尝试这些SQL...

    mysql SELECT FOUND_ROWS()与COUNT()用法区别1

    mysql&gt; SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id &gt; 100 LIMIT 10; mysql&gt; SELECT FOUND_ROWS(); ``` 在第一个查询中,`SQL_CALC_FOUND_ROWS`告诉MySQL计算所有符合条件的行,即使只返回10行。然后,第...

Global site tag (gtag.js) - Google Analytics