In Redshift/Postgres, how to count rows that meet a condition?

Question

I'm trying to write a query that counts only the rows that meet a condition.

For example, in MySQL I would write it like this:

SELECT
    COUNT(IF(grade < 70, 1, NULL))
FROM
    grades
ORDER BY
    id DESC;

However, when I attempt to do that on Redshift, it returns the following error:

ERROR: function if(boolean, integer, "unknown") does not exist

Hint: No function matches the given name and argument types. You may need to add explicit type casts.

I checked the documentation for conditional statements, and I found

NULLIF(value1, value2)

but it only compares value1 and value2, and returns null if they are equal.

I couldn't find a simple IF statement, and at first glance I couldn't find a way to do what I want to do.

I tried to use the CASE expression, but I'm not getting the results I want:

SELECT 
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
   grades

This is the way I want to count things:

  • failed (grade < 70)
  • average (70 <= grade < 80)
  • good (80 <= grade < 90)
  • excellent (90 <= grade <= 100)

This is how I expect to see the results:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|   4    |    2    |  1   |     4     |
+========+=========+======+===========+

But I'm getting this:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|  11    |   11    |  11  |    11     |
+========+=========+======+===========+

I hope someone can point me in the right direction!

If it helps, here is some sample data:

CREATE TABLE grades(
  grade integer DEFAULT 0
);

INSERT INTO grades(grade)
VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94);

Recommended answer

First, the issue you're having here is that what you're saying is "If the grade is less than 70, the value of this case expression is count(rank). Otherwise, the value of this expression is count(rank)." So, in either case, you're always getting the same value.

SELECT 
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
   grades

count() only counts non-null values, so the pattern you'll typically see for what you're trying to do is this:

SELECT 
    count(CASE WHEN grade < 70 THEN 1 END) as grade_less_than_70,
    count(CASE WHEN grade >= 70 and grade < 80 THEN 1 END) as grade_between_70_and_80
FROM
   grades

That way the case expression will only evaluate to 1 when the test expression is true and will be null otherwise. Then the count() will only count the non-null instances, i.e. when the test expression is true, which should give you what you need.
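
Extending the same pattern to all four buckets from the question might look like the sketch below. The column aliases (failed, average, good, excellent) are taken from the expected output in the question, and the query assumes the grades table holds an integer grade column as in the sample data.

```sql
-- One conditional count per bucket; each CASE yields 1 for rows in the
-- bucket and NULL otherwise, and count() ignores the NULLs.
SELECT
    count(CASE WHEN grade < 70 THEN 1 END)                   AS failed,
    count(CASE WHEN grade >= 70 AND grade < 80 THEN 1 END)   AS average,
    count(CASE WHEN grade >= 80 AND grade < 90 THEN 1 END)   AS good,
    count(CASE WHEN grade >= 90 AND grade <= 100 THEN 1 END) AS excellent
FROM
    grades;
```

With the eleven sample grades from the question, this should return a single row with 4, 2, 1 and 4, which matches the expected result.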

As a side note, notice that this is exactly the same as what you originally wrote using count(if(test, true-value, false-value)), only rewritten as count(case when test then true-value end). Null is the stand-in false-value, since no else was supplied to the case.
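
To make that correspondence concrete, here is a small sketch with the implicit ELSE written out. It behaves the same as the shorter form, because a CASE without an ELSE branch already yields NULL.

```sql
-- MySQL form from the question:
--   COUNT(IF(grade < 70, 1, NULL))
-- Equivalent CASE form, with the normally implicit ELSE NULL spelled out:
SELECT
    count(CASE WHEN grade < 70 THEN 1 ELSE NULL END) AS grade_less_than_70
FROM
    grades;
```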

Postgres 9.4 was released a few months after this original exchange. That version introduced aggregate filters, which can make scenarios like this look a little nicer and clearer. This answer still gets occasional upvotes, so if you've stumbled upon it and are using a newer Postgres (i.e. 9.4+), you might want to consider this equivalent version:

SELECT
    count(*) filter (where grade < 70) as grade_less_than_70,
    count(*) filter (where grade >= 70 and grade < 80) as grade_between_70_and_80
FROM
   grades
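
For completeness, the four buckets from the question could be written with aggregate filters as sketched below. The aliases again come from the question's expected output. This assumes Postgres 9.4 or later. As far as I know, Redshift does not accept the FILTER clause, so the CASE form is the safer choice there.

```sql
-- Postgres 9.4+ only: each count(*) sees just the rows that pass its filter.
SELECT
    count(*) FILTER (WHERE grade < 70)                   AS failed,
    count(*) FILTER (WHERE grade >= 70 AND grade < 80)   AS average,
    count(*) FILTER (WHERE grade >= 80 AND grade < 90)   AS good,
    count(*) FILTER (WHERE grade >= 90 AND grade <= 100) AS excellent
FROM
    grades;
```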
