如何选择最常出现的值? [英] how to select the most frequently appearing values?

查看:68
本文介绍了如何选择最常出现的值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我看过一些示例,其中查询按计数排序并排在第一行,但是在这种情况下,可以有多个最频繁"的值,所以我可能想返回的不仅仅是一个结果.

I've seen examples where the query orders by count and takes the top row, but in this case there can be multiple "most frequent" values, so I might want to return more than just a single result.

在这种情况下,我想在用户表中找到最常出现的姓氏,这是我到目前为止所拥有的:

In this case I want to find the most frequently appearing last names in a users table, here's what I have so far:

select last_name from users group by last_name having max(count(*));

不幸的是,在此查询中,我收到了一个错误,提示我的max函数嵌套得太深了.

Unfortunately with this query I get an error that my max function is nested too deeply.

推荐答案

select
  x.last_name,
  x.name_count
from
  (select
    u.last_name,
    count(*) as name_count,
    rank() over (order by count(*) desc) as rank
  from
    users u
  group by
    u.last_name) x
where
  x.rank = 1

使用分析功能rank.它将基于count(*) desc的顺序分配编号.如果两个名称具有相同的计数,它们将具有相同的排名,并且下一个数字将被跳过(因此您可能会获得行列为1、1、3的行). dense_rank是一种替代方案,如果两行具有相同的排名,则不会跳过下一个数字(因此,您将获得1、1、2),但是如果只希望排名为1的行,则不会有所不同.

Use the analytical function rank. It will assign a numbering based on the order of count(*) desc. If two names got the same count, they get the same rank, and the next number is skipped (so you might get rows having ranks 1, 1 and 3). dense_rank is an alternative which doesn't skip the next number if two rows got the same rank, (so you'd get 1, 1, 2), but if you want only the rows with rank 1, there is not much of a difference.

如果只需要一行,则希望每一行都有不同的编号.在这种情况下,请使用row_number.除了这一微小但重要的区别之外,这些功能是相似的,并且可以以相同的方式使用.

If you want only one row, you'd want each row to have a different number. In that case, use row_number. Apart from this small-but-important difference, these functions are similar and can be used in the same way.

这篇关于如何选择最常出现的值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆