用聚合联接两个表 [英] Joining two tables with aggregates

查看:131
本文介绍了用聚合联接两个表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下两个表格:

CREATE TABLE categories
(
  id integer NOT NULL,
  category integer NOT NULL,
  name text,
  CONSTRAINT kjhfskfew PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);

CREATE TABLE products_
(
  id integer NOT NULL,
  date date,
  id_employee integer,
  CONSTRAINT grh PRIMARY KEY (id)
)
WITH (
  OIDS=FALSE
);

现在我必须做报告,其中需要以下信息: category.category,category.name(所有这些都可以,所以string_agg可以)-可以将多个分配给一个category_product_.id_employee->,但不能像上面那样用逗号作为类别名,但是将其分配最新日期(在这里)是我的问题);

Now I have to do report in which I need following information: categories.category, categories.name (all of them, so string_agg is ok) - could be many assigned to one category and products_.id_employee -> but not with comma as above with category name but the one with newest date assigned (and here is my problem);

我已经尝试过以下构造:

I've tried already constructions as:

SELECT
  DISTINCT ON (category ) category,
  string_agg(name, ','),
  (SELECT
     id_employee
   FROM products_
   WHERE date = (SELECT
                   max(date)
                 FROM products_
                 WHERE id IN (SELECT
                                id
                              FROM categories
                              WHERE id = c.id)))
FROM categories c
ORDER BY category;

但是PostgreSQL说子查询返回了很多行... 请帮忙!

But PostgreSQL says that subquery is returning to many rows... Please help!

示例插入:

INSERT INTO categories(
            id, category, name)
    VALUES (1,22,'car'),(2,22,'bike'),(3,22,'boat'),(4,33,'soap'),(5,44,'chicken');

INSERT INTO products_(
            id, date, id_employee)
    VALUES (1,'2009-11-09',11),(2,'2010-09-09',2),(3,'2013-01-01',4),(5,'2014-09-01',90);

好的,我已经解决了这个问题. 这个很好用:

OK, I've solved this problem. This one works just fine:

WITH max_date AS (
    SELECT
      category,
      max(date)             AS date,
      string_agg(name, ',') AS names
    FROM test.products_
      JOIN test.categories c
      USING (id)
    GROUP BY c.category
)
SELECT
  max(id_employee) AS id_employee,
  md.category,
  names
FROM test.products_ p
  LEFT JOIN max_date md
  USING (date)
  LEFT JOIN test.categories
  USING (category)
WHERE p.date = md.date AND p.id IN (SELECT
                                      id
                                    FROM test.categories
                                    WHERE category = md.category)
GROUP BY category, names;

推荐答案

似乎正在使用id来连接两个表,这对我来说很奇怪.

It seems that id is being used to join the two tables, which seems strange to me.

在任何情况下,类别名称的基本查询为:

In any case, the base query for the category names is:

SELECT c.category, string_agg(c.name, ','),
FROM categories c
group by c.category;

问题是:如何获得最近的名字?这种方法使用row_number()函数:

The question is: how to get the most recent name? This approach uses the row_number() function:

SELECT c.category, string_agg(c.name, ','), cp.id_employee
FROM categories c left outer join
     (select c.category, c.name, p.id_employee,
             row_number() over (partition by c.category order by date desc) as seqnum
      from categories c left outer join
           products_ p
           on c.id = p.id
     ) cp
     on cp.category = c.category and
        cp.seqnum = 1
group by c.category, cp.id_employee;

这篇关于用聚合联接两个表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆