如何基于多行中的最大值选择单行 [英] How to select single row based on the max value in multiple rows

查看:95
本文介绍了如何基于多行中的最大值选择单行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

可能重复:
SQL:查找每个组的最大记录

Possible Duplicate:
SQL: Find the max record per group

我有一个包含四列的表格:

I have a table with four columns as such:

name   major    minor  revision
p1     0        4      3
p1     1        0      0
p1     1        1      4
p2     1        1      1
p2     2        5      0
p3     3        4      4

这基本上是ca表,其中包含程序每个版本的记录.我想进行选择以获取所有程序及其最新版本,因此结果将如下所示:

This is basically ca table containing records for each version of a program. I want to do a select to get all of the programs and their latest version so the results would look like this:

name   major    minor  revision
p1     1        1      4
p2     2        5      0
p3     3        4      4

我不能仅按名称分组并获得每一列的最大值,因为那样我最终只会得到每一列的最高编号,而不是具有最高版本的特定行.我该如何设置?

I can't just group by the name and get the max of each column because then i would just end up with the highest number from each column, but not the specific row with the highest version. How can I set this up?

推荐答案

我尝试解决SQL问题的方法是逐步解决问题.

The way I try to solve SQL problems is to take things step by step.

  • 您想要与每个产品的最大主版本相对应的最大次要版本的最大修订版本.

每种产品的最大主要编号为:

The maximum major number for each product is given by:

SELECT Name, MAX(major) AS Major FROM CA GROUP BY Name;

与每个产品的最大主要号码相对应的最大次要号码由下式给出:

The maximum minor number corresponding to the maximum major number for each product is therefore given by:

SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
  FROM CA
  JOIN (SELECT Name, MAX(Major) AS Major
          FROM CA
         GROUP BY Name
       ) AS CB
    ON CA.Name = CB.Name AND CA.Major = CB.Major
 GROUP BY CA.Name, CA.Major;

因此,最大修订版(与每个产品的最大主要版本相对应的最大次要版本号)由下式给出:

And the maximum revision (for the maximum minor version number corresponding to the maximum major number for each product), therefore, is given by:

SELECT CA.Name, CA.Major, CA.Minor, MAX(CA.Revision) AS Revision
  FROM CA
  JOIN (SELECT CA.Name, CA.Major, MAX(CA.Minor) AS Minor
          FROM CA
          JOIN (SELECT Name, MAX(Major) AS Major
                  FROM CA
                 GROUP BY Name
               ) AS CB
            ON CA.Name = CB.Name AND CA.Major = CB.Major
         GROUP BY CA.Name, CA.Major
       ) AS CC
    ON CA.Name = CC.Name AND CA.Major = CC.Major AND CA.Minor = CC.Minor
 GROUP BY CA.Name, CA.Major, CA.Minor;

经过测试-它的工作原理和产生的答案与 Andomar

Tested - it works and produces the same answer as Andomar's query does.

我创建了大量数据(11616行数据),并对Andomar针对我的查询进行了基准测试-目标DBMS是在MacOS X 10.7.2上运行的IBM Informix Dynamic Server(IDS)版本11.70.FC2.我使用了Andomar的两个查询中的第一个,因为IDS在第二个查询中不支持比较表示法.我加载了数据,更新了统计数据,并依次运行了我的查询和Andomar的查询,并依次运行了Andomar的查询和我的查询.我还记录了IDS优化器报告的基本费用.这两个查询的结果数据是相同的(因此,查询都是准确的-或同样不准确).

I created a bigger volume of data (11616 rows of data), and ran a benchmark timing of Andomar's query against mine - target DBMS was IBM Informix Dynamic Server (IDS) version 11.70.FC2 running on MacOS X 10.7.2. I used the first of Andomar's two queries since IDS does not support the comparison notation in the second one. I loaded the data, updated statistics, and ran the queries both with mine followed by Andomar's and with Andomar's followed by mine. I also recorded the basic costs reported by the IDS optimizer. The result data from both queries were the same (so the queries are both accurate - or equally inaccurate).

表未编制索引:

Andomar's query                           Jonathan's query
Time: 22.074129                           Time: 0.085803
Estimated Cost: 2468070                   Estimated Cost: 22673
Estimated # of Rows Returned: 5808        Estimated # of Rows Returned: 132
Temporary Files Required For: Order By    Temporary Files Required For: Group By

在(名称,主要,次要,修订)上具有唯一索引的表:

Table with unique index on (name, major, minor, revision):

Andomar's query                           Jonathan's query
Time: 0.768309                            Time: 0.060380
Estimated Cost: 31754                     Estimated Cost: 2329
Estimated # of Rows Returned: 5808        Estimated # of Rows Returned: 139
                                          Temporary Files Required For: Group By

如您所见,该索引极大地提高了Andomar查询的性能,但在该系统上它似乎仍然比我的查询贵.该索引为我的查询节省了25%的时间.我很好奇,看到有和没有索引的可比较数据量下,两个版本的Andomar查询的可比数据. (如果需要,可以提供我的测试数据;共有132个产品-问题中列出的3个产品和129个新产品;每个新产品都有(相同)90个版本条目.)

As you can seen, the index dramatically improves the performance of Andomar's query, but it still seems to be more expensive on this system than my query. The index gives a 25% time saving for my query. I'd be curious to see comparable figures for the two versions of Andomar's query on comparable volumes of data, with and without the index. (My test data can be supplied if you need it; there were 132 products - the 3 listed in the question and 129 new ones; each new product had (the same) 90 version entries.)

之所以出现差异,是因为Andomar的查询中的子查询是一个关联的子查询,这是一个相对昂贵的过程(在缺少索引的情况下,这是非常昂贵的).

The reason for the discrepancy is that the sub-query in Andomar's query is a correlated sub-query, which is a relatively expensive process (dramatically so when the index is missing).

这篇关于如何基于多行中的最大值选择单行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆