In Redshift/Postgres, how to count rows that meet a condition?

Problem description

I'm trying to write a query that counts only the rows that meet a condition.

For example, in MySQL I would write it like this:

SELECT
    COUNT(IF(grade < 70, 1, NULL))
FROM
    grades
ORDER BY
    id DESC;

However, when I attempt to do that on Redshift, it returns the following error:

ERROR: function if(boolean, integer, "unknown") does not exist

Hint: No function matches the given name and argument types. You may need to add explicit type casts.

I checked the documentation for conditional expressions and found NULLIF(value1, value2), but it only compares value1 and value2 and returns null if they are equal.

I couldn't find a simple IF statement, and at first glance I couldn't find a way to do what I want.

I tried to use a CASE expression, but I'm not getting the results I want:

SELECT 
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
   grades

This is how I want to count:

  • failed (grade < 70)
  • average (70 <= grade < 80)
  • good (80 <= grade < 90)
  • excellent (90 <= grade <= 100)

And this is how I expect to see the results:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|   4    |    2    |  1   |     4     |
+========+=========+======+===========+

But this is what I get:

+========+=========+======+===========+
| failed | average | good | excellent |
+========+=========+======+===========+
|  11    |   11    |  11  |    11     |
+========+=========+======+===========+

I hope someone can point me in the right direction!

If it helps, here's some sample data:

CREATE TABLE grades(
  grade integer DEFAULT 0
);

-- One row per grade; each value needs its own parenthesized tuple.
INSERT INTO grades(grade) VALUES (69), (50), (55), (60), (75), (70), (87), (100), (100), (98), (94);


Recommended answer

First, the issue you're having here is that what you're saying is "If the grade is less than 70, the value of this case expression is count(rank). Otherwise, the value of this expression is count(rank)." So, in either case, you always get the same value.

SELECT 
    CASE
        WHEN grade < 70 THEN COUNT(rank)
        ELSE COUNT(rank)
    END
FROM
   grades

count() only counts non-null values, so the pattern you'll typically see for what you're trying to do is this:

SELECT 
    count(CASE WHEN grade < 70 THEN 1 END) as grade_less_than_70,
    count(CASE WHEN grade >= 70 and grade < 80 THEN 1 END) as grade_between_70_and_80
FROM
   grades

That way the case expression evaluates to 1 only when the test expression is true and to null otherwise. count() then counts only the non-null instances, i.e. the rows where the test expression is true, which should give you what you need.

Edit: As a side note, notice that this is exactly the same as how you had originally written it using count(if(test, true-value, false-value)), only rewritten as count(case when test then true-value end) (null is the stand-in false-value, since no else was supplied to the case).
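
As a sketch of how the pattern extends to all four ranges listed in the question, the query below uses one count(CASE ...) column per range. It assumes the grades table and sample data from the question; the column aliases are taken from the expected result table and are otherwise arbitrary.

```sql
-- Assumes the grades table from the question, with an integer column "grade".
-- Each CASE yields 1 for rows in its range and null otherwise,
-- so each count() tallies only the rows in that range.
SELECT
    count(CASE WHEN grade < 70 THEN 1 END) as failed,
    count(CASE WHEN grade >= 70 and grade < 80 THEN 1 END) as average,
    count(CASE WHEN grade >= 80 and grade < 90 THEN 1 END) as good,
    count(CASE WHEN grade >= 90 and grade <= 100 THEN 1 END) as excellent
FROM
   grades;
```

With the eleven sample grades from the question, this should return 4, 2, 1 and 4, matching the expected table. Rows with a null grade fall into none of the ranges.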

Edit: postgres 9.4 was released a few months after this original exchange. That version introduced aggregate filters, which can make scenarios like this look a little nicer and clearer. This answer still gets occasional upvotes, so if you've stumbled upon it and are using a newer postgres (i.e. 9.4+), you might want to consider this equivalent version:

SELECT
    count(*) filter (where grade < 70) as grade_less_than_70,
    count(*) filter (where grade >= 70 and grade < 80) as grade_between_70_and_80
FROM
   grades
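
The same four ranges from the question could be written with aggregate filters as well. This is only a sketch under the same assumptions (the grades table and sample data from the question, aliases borrowed from the expected result table). The answer vouches for this syntax on postgres 9.4+ only, so on Redshift the CASE form is the safer choice.

```sql
-- Assumes postgres 9.4+ and the grades table from the question.
-- Each filter clause restricts which rows its count(*) sees.
SELECT
    count(*) filter (where grade < 70) as failed,
    count(*) filter (where grade >= 70 and grade < 80) as average,
    count(*) filter (where grade >= 80 and grade < 90) as good,
    count(*) filter (where grade >= 90 and grade <= 100) as excellent
FROM
   grades;
```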
