COUNT(*) vs. COUNT(1) vs. COUNT(pk):哪个更好? [英] COUNT(*) vs. COUNT(1) vs. COUNT(pk): which is better?

查看:39
本文介绍了COUNT(*) vs. COUNT(1) vs. COUNT(pk):哪个更好?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我经常发现这三种变体:

I often find these three variants:

SELECT COUNT(*) FROM Foo;
SELECT COUNT(1) FROM Foo;
SELECT COUNT(PrimaryKey) FROM Foo;

据我所知,它们都做同样的事情,我发现自己在我的代码库中使用了这三个.但是,我不喜欢以不同的方式做同样的事情.我应该坚持哪一个?他们中的任何一个都比其他两个更好吗?

As far as I can see, they all do the same thing, and I find myself using the three in my codebase. However, I don't like to do the same thing different ways. To which one should I stick? Is any one of them better than the two others?

推荐答案

底线

使用 COUNT(field)COUNT(*),并坚持使用它,如果您的数据库允许 COUNT(tableHere)> 或 COUNT(tableHere.*),使用它.

Bottom Line

Use either COUNT(field) or COUNT(*), and stick with it consistently, and if your database allows COUNT(tableHere) or COUNT(tableHere.*), use that.

简而言之,不要将 COUNT(1) 用于任何事情.这是一个一招式的小马,它很少做你想做的事,在那些罕见的情况下相当于 count(*)

In short, don't use COUNT(1) for anything. It's a one-trick pony, which rarely does what you want, and in those rare cases is equivalent to count(*)

使用 * 用于需要计算所有内容的所有查询,即使对于连接,也使用 *

Use * for all your queries that need to count everything, even for joins, use *

SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

但是不要将 COUNT(*) 用于 LEFT 连接,因为即使从属表与父表中的任何内容都不匹配,它也会返回 1

But don't use COUNT(*) for LEFT joins, as that will return 1 even if the subordinate table doesn't match anything from parent table

SELECT boss.boss_id, COUNT(*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

不要被那些建议在 COUNT 中使用 * 时,它从您的表中获取整行的人所愚弄,说 * 很慢.SELECT COUNT(*)SELECT * 上的 * 没有任何关系,它们是完全不同的东西,它们只是共享一个共同点令牌,即*.

Don't be fooled by those advising that when using * in COUNT, it fetches entire row from your table, saying that * is slow. The * on SELECT COUNT(*) and SELECT * has no bearing to each other, they are entirely different thing, they just share a common token, i.e. *.

事实上,如果不允许将字段命名为与其表名相同的名称,RDBMS 语言设计者可以赋予 COUNT(tableNameHere)COUNT(*)<相同的语义/代码>.示例:

In fact, if it is not permitted to name a field as same as its table name, RDBMS language designer could give COUNT(tableNameHere) the same semantics as COUNT(*). Example:

为了计算行数,我们可以这样:

For counting rows we could have this:

SELECT COUNT(emp) FROM emp

他们可以让事情变得更简单:

And they could make it simpler:

SELECT COUNT() FROM emp

对于 LEFT JOIN,我们可以这样:

And for LEFT JOINs, we could have this:

SELECT boss.boss_id, COUNT(subordinate)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

但他们不能这样做 (COUNT(tableNameHere)) 因为 SQL 标准允许命名一个与表名相同的字段:

But they cannot do that (COUNT(tableNameHere)) since SQL standard permits naming a field with the same name as its table name:

CREATE TABLE fruit -- ORM-friendly name
(
fruit_id int NOT NULL,
fruit varchar(50), /* same name as table name, 
                and let's say, someone forgot to put NOT NULL */
shape varchar(50) NOT NULL,
color varchar(50) NOT NULL
)

以空计数

此外,如果字段名称与表名称匹配,则将字段设为可空也不是一个好习惯.假设您在 fruit 字段上有值 'Banana'、'Apple'、NULL、'Pears'.这不会计算所有行,它只会产生 3,而不是 4

Counting with null

And also, it is not a good practice to make a field nullable if its name matches the table name. Say you have values 'Banana', 'Apple', NULL, 'Pears' on fruit field. This will not count all rows, it will only yield 3, not 4

SELECT count(fruit) FROM fruit

虽然一些 RDBMS 执行这种原则(为了计算表的行数,它接受表名作为 COUNT 的参数),这将在 Postgresql 中工作(如果在任何下面两张表,即只要字段名和表名没有名字冲突):

Though some RDBMS do that sort of principle (for counting the table's rows, it accepts table name as COUNT's parameter), this will work in Postgresql (if there is no subordinate field in any of the two tables below, i.e. as long as there is no name conflict between field name and table name):

SELECT boss.boss_id, COUNT(subordinate)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

但是如果我们在表中添加一个 subordinate 字段,这可能会在以后引起混乱,因为它将计算字段(可以为空),而不是表行.

But that could cause confusion later if we will add a subordinate field in the table, as it will count the field(which could be nullable), not the table rows.

为了安全起见,请使用:

So to be on the safe side, use:

SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

count(1):一招小马

特别是对于COUNT(1),它是一个一招式的小马,它只适用于一个表查询:

count(1): The one-trick pony

In particular to COUNT(1), it is a one-trick pony, it works well only on one table query:

SELECT COUNT(1) FROM tbl

但是当你使用连接时,这个技巧在没有语义被混淆的情况下不适用于多表查询,特别是你不能写:

But when you use joins, that trick won't work on multi-table queries without its semantics being confused, and in particular you cannot write:

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.1)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

那么这里的 COUNT(1) 是什么意思?

So what's the meaning of COUNT(1) here?

SELECT boss.boss_id, COUNT(1)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

这是……?

-- counting all the subordinates only
SELECT boss.boss_id, COUNT(subordinate.boss_id)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

或者这个……?

-- or is that COUNT(1) will also count 1 for boss regardless if boss has a subordinate
SELECT boss.boss_id, COUNT(*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

仔细想想,你可以推断 COUNT(1)COUNT(*) 是一样的,不管连接的类型如何.但是对于 LEFT JOINs 结果,我们不能将 COUNT(1) 塑造为:COUNT(subordinate.boss_id), COUNT(subordinate.*)>

By careful thought, you can infer that COUNT(1) is the same as COUNT(*), regardless of type of join. But for LEFT JOINs result, we cannot mold COUNT(1) to work as: COUNT(subordinate.boss_id), COUNT(subordinate.*)

所以只需使用以下任一方法:

So just use either of the following:

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.boss_id)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

适用于Postgresql,很明显你要计算集合的基数

Works on Postgresql, it's clear that you want to count the cardinality of the set

-- count the subordinates that belongs to boss
SELECT boss.boss_id, COUNT(subordinate.*)
FROM boss
LEFT JOIN subordinate on subordinate.boss_id = boss.boss_id
GROUP BY boss.id

另一种计算集合基数的方法,非常像英文(只是不要让列的名称与其表名相同):http://www.sqlfiddle.com/#!1/98515/7

Another way to count the cardinality of the set, very English-like (just don't make a column with a name same as its table name) : http://www.sqlfiddle.com/#!1/98515/7

select boss.boss_name, count(subordinate)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

您不能这样做:http://www.sqlfiddle.com/#!1/98515/8

select boss.boss_name, count(subordinate.1)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

你可以这样做,但这会产生错误的结果:http://www.sqlfiddle.com/#!1/98515/9

You can do this, but this produces wrong result: http://www.sqlfiddle.com/#!1/98515/9

select boss.boss_name, count(1)
from boss
left join subordinate on subordinate.boss_code = boss.boss_code
group by boss.boss_name

这篇关于COUNT(*) vs. COUNT(1) vs. COUNT(pk):哪个更好?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆