Postgres LEFT JOIN with SUM, missing records
Question
I am trying to get the count of certain types of records in a related table. I am using a left join.
So I have one query that isn't quite right and another that returns the correct results. The correct query has a higher execution cost. I'd like to use the first approach, if I can correct its results. (See http://sqlfiddle.com/#!15/7c20b/5/2)
CREATE TABLE people(
id SERIAL,
name varchar not null
);
CREATE TABLE pets(
id SERIAL,
name varchar not null,
kind varchar not null,
alive boolean not null default false,
person_id integer not null
);
INSERT INTO people(name) VALUES
('Chad'),
('Buck'); --can't keep pets alive
INSERT INTO pets(name, alive, kind, person_id) VALUES
('doggio', true, 'dog', 1),
('dog master flash', true, 'dog', 1),
('catio', true, 'cat', 1),
('lucky', false, 'cat', 2);
My goal is to get a table back with ALL of the people and the counts of the KINDS of pets they have alive:
| ID | ALIVE_DOGS_COUNT | ALIVE_CATS_COUNT |
|----|------------------|------------------|
| 1 | 2 | 1 |
| 2 | 0 | 0 |
I made the example more trivial. In our production app (not really pets) there would be about 100,000 dead dogs and cats per person. Pretty screwed up, I know, but this example is simpler to relay ;) I was hoping to filter all the 'dead' stuff out before the count. I have the slower query in production now (from the sqlfiddle above), but would love to get the LEFT JOIN version working.
Recommended answer
Typically fastest if you fetch all or most rows:
SELECT pp.id
, COALESCE(pt.a_dog_ct, 0) AS alive_dogs_count
, COALESCE(pt.a_cat_ct, 0) AS alive_cats_count
FROM people pp
LEFT JOIN (
SELECT person_id
, count(kind = 'dog' OR NULL) AS a_dog_ct
, count(kind = 'cat' OR NULL) AS a_cat_ct
FROM pets
WHERE alive
GROUP BY 1
) pt ON pt.person_id = pp.id;
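The expression `kind = 'dog' OR NULL` evaluates to TRUE for dogs and to NULL otherwise, and `count()` only counts non-null values, so each aggregate counts just the matching rows. As a sketch of an alternative spelling, assuming PostgreSQL 9.4 or later, the same counts can be written with the aggregate `FILTER` clause. The `ORDER BY` is added here only to make the output order predictable:

```sql
-- Assumes PostgreSQL 9.4+ (aggregate FILTER clause).
-- Same logic as the query above: aggregate living pets first, then join.
SELECT pp.id
     , COALESCE(pt.a_dog_ct, 0) AS alive_dogs_count
     , COALESCE(pt.a_cat_ct, 0) AS alive_cats_count
FROM   people pp
LEFT   JOIN (
   SELECT person_id
        , count(*) FILTER (WHERE kind = 'dog') AS a_dog_ct
        , count(*) FILTER (WHERE kind = 'cat') AS a_cat_ct
   FROM   pets
   WHERE  alive              -- dead pets are discarded before counting
   GROUP  BY person_id
   ) pt ON pt.person_id = pp.id
ORDER  BY pp.id;
```

People without any living pets have no row in the subquery, so the `LEFT JOIN` yields NULL for both counts and `COALESCE` turns that into 0. With the sample data from the question this gives 2 dogs and 1 cat for person 1, and 0 and 0 for person 2.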
Indexes are irrelevant here; full table scans will be fastest. The exception is when living pets are rare, in which case a partial index should help. Like:
CREATE INDEX pets_alive_idx ON pets (person_id, kind) WHERE alive;
I included all columns needed for the query (person_id, kind) to allow index-only scans.
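Whether the planner actually chooses the partial index depends on the data distribution, so it is worth checking the plan. A minimal sketch of how one might do that for the aggregating subquery, assuming the `pets` table and the partial index above exist:

```sql
-- Refresh planner statistics and the visibility map first;
-- index-only scans rely on the visibility map being reasonably current.
VACUUM ANALYZE pets;

-- Show the chosen plan together with actual run time and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT person_id
     , count(kind = 'dog' OR NULL) AS a_dog_ct
     , count(kind = 'cat' OR NULL) AS a_cat_ct
FROM   pets
WHERE  alive
GROUP  BY person_id;
```

On a tiny table like the one in the question, a sequential scan is the expected outcome regardless of indexes. The index should only become attractive when living pets are a small fraction of a large table.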
Typically fastest for a small subset or a single row:
SELECT pp.id
, count(kind = 'dog' OR NULL) AS alive_dogs_count
, count(kind = 'cat' OR NULL) AS alive_cats_count
FROM people pp
LEFT JOIN pets pt ON pt.person_id = pp.id
AND pt.alive
WHERE <some condition to retrieve a small subset>
GROUP BY 1;
You should at least have an index on pets.person_id for this (or the partial index from above), and possibly more, depending on the WHERE condition.
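As an illustration only, here is the second query with a concrete condition filled in. The index name `pets_person_id_idx` and the filter `pp.id = 2` are assumptions made for this example, not part of the original answer:

```sql
-- Hypothetical index name; a plain index on the join column.
CREATE INDEX pets_person_id_idx ON pets (person_id);

SELECT pp.id
     , count(pt.kind = 'dog' OR NULL) AS alive_dogs_count
     , count(pt.kind = 'cat' OR NULL) AS alive_cats_count
FROM   people pp
LEFT   JOIN pets pt ON pt.person_id = pp.id
                   AND pt.alive   -- filter in the join condition, not in WHERE
WHERE  pp.id = 2                  -- assumed example condition
GROUP  BY pp.id;
```

Keeping `pt.alive` in the join condition is what preserves people without living pets. Moving it into the `WHERE` clause would discard the NULL-extended rows produced by the `LEFT JOIN`, which is a common cause of missing records in queries of this shape. For person 2 in the sample data the query returns one row with both counts at 0.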
Related answers:
- Query with LEFT JOIN not returning rows for count of 0
- GROUP or DISTINCT after JOIN returns duplicates
- Get count of foreign key from multiple tables