Get most common value for each value of another column in SQL
Problem description
I have a table like this:
Column | Type | Modifiers
---------+------+-----------
country | text |
food_id | int |
eaten | date |
For each country, I want to get the food that is eaten most often. The best I can think of (I'm using Postgres) is:
CREATE TEMP TABLE counts AS
SELECT country, food_id, count(*) as count FROM munch GROUP BY country, food_id;
CREATE TEMP TABLE max_counts AS
SELECT country, max(count) as max_count FROM counts GROUP BY country;
SELECT country, max(food_id) FROM counts
WHERE (country, count) IN (SELECT * from max_counts) GROUP BY country;
In that last statement, the GROUP BY and max() are needed to break ties, where two different foods have the same count within a country.
This seems like a lot of work for something conceptually simple. Is there a more straightforward way to do it?
Recommended answer
PostgreSQL added support for window functions in version 8.4, released the year after this question was asked. Today the problem can be solved as follows:
SELECT country, food_id
FROM (SELECT country, food_id,
             ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
      FROM (-- count how often each food was eaten in each country
            SELECT country, food_id, COUNT(*) AS freq
            FROM munch  -- table name taken from the question
            GROUP BY 1, 2) food_freq) ranked_food_req
WHERE rn = 1;
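The above breaks ties: it returns exactly one food per country, and which of the tied foods is returned is arbitrary unless you add a secondary sort key (for example food_id) to the ORDER BY. If you don't want to break ties and would rather get every tied food, you could use DENSE_RANK() instead.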
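To see the difference between the two variants on concrete data, here is a minimal runnable sketch. It assumes Python's built-in sqlite3 module (with SQLite 3.25 or later, which supports window functions) as a stand-in for PostgreSQL. The sample rows are made up, the table name munch comes from the question, and the helper most_common is purely illustrative.

```python
import sqlite3

# Hypothetical sample data: in "fr", foods 1 and 2 tie with two rows each.
ROWS = [
    ("us", 1, "2009-01-01"),
    ("us", 1, "2009-01-02"),
    ("us", 2, "2009-01-03"),
    ("fr", 1, "2009-01-01"),
    ("fr", 1, "2009-01-02"),
    ("fr", 2, "2009-01-03"),
    ("fr", 2, "2009-01-04"),
    ("fr", 3, "2009-01-05"),
]

# {rank_fn} and {tie_break} are filled in by most_common() below.
QUERY = """
SELECT country, food_id
FROM (SELECT country, food_id,
             {rank_fn}() OVER (PARTITION BY country
                               ORDER BY freq DESC{tie_break}) AS rn
      FROM (SELECT country, food_id, COUNT(*) AS freq
            FROM munch
            GROUP BY country, food_id) AS food_freq) AS ranked_food_freq
WHERE rn = 1
ORDER BY country, food_id;
"""


def most_common(conn, rank_fn, tie_break=""):
    """Run the ranking query with the given window function (illustrative helper)."""
    sql = QUERY.format(rank_fn=rank_fn, tie_break=tie_break)
    return conn.execute(sql).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE munch (country TEXT, food_id INT, eaten DATE)")
conn.executemany("INSERT INTO munch VALUES (?, ?, ?)", ROWS)

# ROW_NUMBER with food_id DESC as a secondary key: one row per country,
# ties resolved toward the larger food_id (like max(food_id) in the question).
one_per_country = most_common(conn, "ROW_NUMBER", ", food_id DESC")

# DENSE_RANK: every food tied for first place in its country is kept.
with_ties = most_common(conn, "DENSE_RANK")

print(one_per_country)  # [('fr', 2), ('us', 1)]
print(with_ties)        # [('fr', 1), ('fr', 2), ('us', 1)]
```

The only difference between the two calls is the window function and the optional secondary sort key, so the same SQL should carry over to PostgreSQL 8.4 or later unchanged.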