聚合多个联接表上的函数 [英] Aggregate functions on multiple joined tables

查看:98
本文介绍了聚合多个联接表上的函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有三个表:

CREATE TABLE foo (
    id bigint PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE foo_bar (
    id bigint PRIMARY KEY,
    foo_id bigint NOT NULL
);

CREATE TABLE tag (
    name text NOT NULL,
    target_id bigint NOT NULL,
    PRIMARY KEY (name, target_id)
);

我试图创建一个视图,以便获得表foo的所有字段,foo_barfoo.id = foo_bar.foo_id的项数以及所有标签的文本数组,其中foo.id = tag.target_id.如果我们有:

I'm trying to create a view such that I get all of the fields of table foo, the count of items in foo_bar where foo.id = foo_bar.foo_id, and a text array of all tags where foo.id = tag.target_id. If we have:

INSERT INTO foo VALUES (1, 'one');
INSERT INTO foo VALUES (2, 'two');
INSERT INTO foo_bar VALUES (1, 1);
INSERT INTO foo_bar VALUES (2, 1);
INSERT INTO foo_bar VALUES (3, 2);
INSERT INTO foo_bar VALUES (4, 1);
INSERT INTO foo_bar VALUES (5, 2);
INSERT INTO tag VALUES ('a', 1);
INSERT INTO tag VALUES ('b', 1);
INSERT INTO tag VALUES ('c', 2);

结果应返回:

foo.id    | foo.name     | count       | array_agg
--------------------------------------------------
1         | one          | 3           | {a, b}
2         | two          | 2           | {c}

这是我到目前为止所拥有的:

This is what I have so far:

SELECT DISTINCT f.id, f.name, COUNT(b.id), array_agg(t.name)
FROM foo AS f, foo_bar AS b, tag AS t
WHERE f.id = t.target_id AND f.id = b.foo_id
GROUP BY f.id, b.id;

这些是我得到的结果(请注意count是不正确的):

These are the results I'm getting (note the count is incorrect):

foo.id    | foo.name     | count       | array_agg
--------------------------------------------------
1         | one          | 2           | {a, b}
2         | two          | 1           | {c}

count始终是标记的计数,而不是不同的foo_bar值的计数.我试过重新排序/修改GROUP BYSELECT子句,它们返回不同的结果,但不是我想要的结果.我想我在使用array_agg()函数时遇到了麻烦,但是我不确定是否是这种情况或如何解决.

The count is always the count of tags instead of the count of distinct foo_bar values. I've tried reordering/modifying the GROUP BY and the SELECT clauses which returns different results but not the ones that I'm looking for. I think I'm having trouble with the array_agg() function, but I'm not sure if that's the case or how to resolve it.

推荐答案

SELECT f.id, f.name, b.fb_ct, t.tag_names
FROM   foo f
LEFT JOIN  (
    SELECT foo_id AS id, count(*) AS fb_ct
    FROM   foo_bar
    GROUP  BY 1
    ) b USING (id)
LEFT JOIN  (
    SELECT target_id AS id, array_agg(name) AS tag_names
    FROM   tag
    GROUP  BY 1
    ) t USING (id)
ORDER  BY f.id;

产生所需的结果.

  • 使用显式的JOIN语法重写.使其更易于阅读和理解(和调试).

  • Rewrite with explicit JOIN syntax. Makes it so much easier to read and understand (and debug).

通过连接到多个与1:n相关的表,行将彼此相乘,从而产生笛卡尔产品-这是非常昂贵的废话.这是代理服务器意外的CROSS JOIN.相关:

By joining to multiple 1:n related tables, rows would multiply each other producing a Cartesian product - which is very expensive nonsense. It's an unintended CROSS JOIN by proxy. Related:

为避免这种情况,请在汇总(GROUP BY)之前将最多一个 n-表与1-表连接.您可以进行两次汇总,但是将n -tables分别 之前(将它们加入1 -table)进行汇总会更干净,更快捷.

To avoid this, join at most one n-table to the 1-table before you aggregate (GROUP BY). You could aggregate two times, but it's cleaner and faster to aggregate n-tables separately before joining them to the 1-table.

与原始版本相反(带有隐式INNER JOIN).我使用 LEFT JOIN 来避免丢失foo_bartag中没有匹配行的foo行.

As opposed to your original (with implicit INNER JOIN). I use LEFT JOIN to avoid losing rows from foo that have no matching row in foo_bar or tag.

一旦从查询中删除了意外的CROSS JOIN,则无需再添加DISTINCT-假设foo.id是唯一的.

Once the unintended CROSS JOIN is removed from the query, there is no need for adding DISTINCT any more - assuming that foo.id is unique.

这篇关于聚合多个联接表上的函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆