Delete all rows but one with the greatest value per group

Question

I recently asked a question, "Update using a subquery with aggregates and groupby in Postgres",
and it turns out I was approaching my problem with flawed logic.

In the same scenario as the question above, instead of updating all the rows to have the max quantity, I'd like to delete the rows that don't have the max quantity (as well as any duplicates of the max quantity).

Essentially, I just need to convert the statement below into a DELETE statement that preserves only the row with the largest quantity per item_name. I'm guessing I'll need NOT EXISTS here, but I'm not sure how to do that with aggregate functions.

UPDATE transaction t
SET    quantity = sub.max_quantity
FROM  (
     SELECT item_name, max(quantity) AS max_quantity
     FROM   transaction
     GROUP  BY 1
) sub
WHERE t.item_name = sub.item_name
AND   t.quantity IS DISTINCT FROM sub.max_quantity;


Recommended answer

Since there can be peers sharing the same maximum quantity, the safe route is a subquery with the window function row_number(), which numbers the rows within each item_name so that exactly one row per group gets rn = 1:

DELETE FROM transaction t
USING (
   SELECT some_unique_id, row_number() OVER (PARTITION BY item_name
                                             ORDER BY quantity DESC) AS rn
   FROM   transaction
   GROUP  BY 1
   ) sub
WHERE t.some_unique_id = sub.some_unique_id
AND   sub.rn > 1;

Where some_unique_id can be any unique column or combination of columns (mirrored in the GROUP BY clause).
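
To make the tie handling concrete, here is a minimal sketch on made-up data. The column id (standing in for some_unique_id) and the sample rows are assumptions for illustration only. The sketch also adds id as a second sort key so the choice among tied maxima is deterministic, and it leaves out the GROUP BY, since row_number() already ranks individual rows and no aggregate is involved; both are simplifications that are not part of the answer above.

```sql
-- Hypothetical sample data; "id" plays the role of some_unique_id.
CREATE TEMP TABLE transaction (
   id        int PRIMARY KEY,
   item_name text,
   quantity  int
);

INSERT INTO transaction (id, item_name, quantity) VALUES
   (1, 'apple', 5),
   (2, 'apple', 9),
   (3, 'apple', 9),   -- ties with id 2 for the maximum
   (4, 'pear',  2),
   (5, 'pear',  7);

-- Rank rows within each item_name, largest quantity first.
-- The extra "id" sort key breaks ties: the lowest id among the maxima wins.
DELETE FROM transaction t
USING (
   SELECT id, row_number() OVER (PARTITION BY item_name
                                 ORDER BY quantity DESC, id) AS rn
   FROM   transaction
   ) sub
WHERE t.id = sub.id
AND   sub.rn > 1;

-- Rows 2 (apple, 9) and 5 (pear, 7) remain.
SELECT * FROM transaction ORDER BY id;
```

If quantity can be NULL, note that ORDER BY quantity DESC sorts NULL values first in Postgres, so a NULL row would be the one that survives; ORDER BY quantity DESC NULLS LAST avoids that.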

This ends up being very similar to this question from today:
Delete rows with duplicates on two fields

If your table is big and you are going to delete large parts of it, consider the advanced advice here:
How to delete duplicate entries?
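
When most of the table is going away, one commonly used pattern is to save the survivors, empty the table, and refill it instead of deleting row by row. The sketch below is an illustration under assumptions, not necessarily what the linked answer recommends: the name transaction_keep is hypothetical, no foreign keys are assumed to reference transaction, and no other session is assumed to write to it meanwhile (TRUNCATE takes an exclusive lock on the table).

```sql
BEGIN;

-- Hypothetical helper table: keep one row per item_name,
-- the one with the largest quantity (DISTINCT ON is Postgres-specific).
CREATE TEMP TABLE transaction_keep AS
SELECT DISTINCT ON (item_name) *
FROM   transaction
ORDER  BY item_name, quantity DESC;

-- Empty the original table and refill it from the survivors.
TRUNCATE transaction;
INSERT INTO transaction SELECT * FROM transaction_keep;

DROP TABLE transaction_keep;

COMMIT;
```

Because TRUNCATE is transactional in Postgres, a failure before COMMIT rolls everything back and leaves the original rows in place.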
