使用 id 和树状表的后代连接两个表 [英] Join two tables using id and descendants from tree like table

查看:64
本文介绍了使用 id 和树状表的后代连接两个表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有以下表格:

CREATE TABLE element (
  element_id serial PRIMARY KEY,
  local_id integer,
  name varchar,
  CONSTRAINT fk_element_local_id FOREIGN KEY (local_id)
      REFERENCES local (local_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

CREATE TABLE local (
  local_id serial PRIMARY KEY,
  parent_id integer,
  name varchar,
  CONSTRAINT fk_local_parent_id_local_id FOREIGN KEY (parent_id)
      REFERENCES local (local_id) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE SET NULL
);

CREATE TABLE category (
  category_id serial PRIMARY KEY,
  name varchar
);

CREATE TABLE action (
  action_id serial PRIMARY KEY,
  local_id integer,
  category_id integer,
  CONSTRAINT fk_action_local_id FOREIGN KEY (local_id)
      REFERENCES local (local_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT fk_action_element_id FOREIGN KEY (element_id)
      REFERENCES element (element_id) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
);

我想从一个动作中选择所有元素.如果元素的局部是动作局部的后代,它也应该出现.
例子:

I want to select all elements from an action. If the local from the element is descendant from the local of the action it should appear as well.
example:

local:

|local_id | parent_id | name |
|---------+-----------+------|
|1        |NULL       |A     |
|2        |1          |B     |
|3        |1          |C     |
|4        |3          |D     |
|5        |NULL       |E     |
|6        |5          |F     |
|_________|___________|______|

表格category:

| category_id | name |
|-------------+------|
|1            |A     |
|2            |B     |
|2            |C     |
|_____________|______|

表格元素:

|element_id | local_id | name | category_id |
|-----------+----------+------+-------------|
|1          |1         |A     | 1           |
|2          |2         |B     | 2           |
|3          |2         |C     | 1           |
|4          |4         |D     | 2           |
|5          |5         |E     | 2           |
|6          |6         |F     | 1           |
|7          |6         |G     | 1           |
|___________|__________|______|_____________|

表格action:

|action_id | local_id | category_id |
|----------+----------+-------------|
| 1        | 1        | 2           |
| 2        | 3        | 1           |
| 3        | 5        | 1           |
| 4        | 6        | 1           |
|__________|__________|_____________|

我想要的查询结果:

CASE: action_id = 1
return: element_id: 2,4

CASE: action_id = 2
return: element_id: null

CASE: action_id = 3
return: element_id: 6,7

我已经创建了一个函数,该函数返回包括实际节点在内的所有后代,但由于调用该函数数千次时的性能,我遇到了困难.我的函数如下所示:

I've made a function that returns all the descendants including the actual node but I'm having a hard time because of the performance when calling the function thousands of times. My function looks like this:

CREATE OR REPLACE FUNCTION fn_local_get_childs(_parent_id integer)
  RETURNS SETOF integer AS
$BODY$
DECLARE
   r integer;
BEGIN
   FOR r IN SELECT local_id FROM local WHERE local_id IN ( 
      (WITH RECURSIVE parent AS
      (
         SELECT local_id , parent_id  from local WHERE local_id = _parent_id
         UNION ALL 
         SELECT t.local_id , t.parent_id FROM parent
         INNER JOIN local t ON parent.local_id =  t.parent_id
      )
      SELECT local_id FROM  parent
      ) 
   )
   LOOP
      RETURN NEXT r;
   END LOOP;
   RETURN;        
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100
  ROWS 1000;

我的超慢查询如下所示:

And my ultra slow query looks like these:

select e.element_id, a.action_id
from action a
join element e on (
                   e.local_id=any(select fn_local_get_childs(a.local_id)) AND 
                   e.category_id=a.category_id)

有没有办法在单个查询中组合函数中使用的递归?

Is there a way of combining the recursion used in the function in a single query?

推荐答案

集成查询

改进多处逻辑,您可以将整个操作集成在一个查询中.包装成 SQL 函数是可选的:

Integrate query

Improving the logic in several places, you can integrate the whole operation in a single query. Wrapping into an SQL function is optional:

CREATE OR REPLACE FUNCTION f_elems(_action_id integer)
  RETURNS SETOF integer AS
$func$
   WITH RECURSIVE l AS (
      SELECT a.category_id, l.local_id
      FROM   action a
      JOIN   local  l USING (local_id)
      WHERE  a.action_id = $1

      UNION ALL 
      SELECT l.category_id, c.local_id
      FROM   l
      JOIN   local c ON c.parent_id = l.local_id  -- c for "child"
      )
   SELECT e.element_id
   FROM   l
   JOIN   element e USING (category_id, local_id);
$func$  LANGUAGE sql STABLE;

检索给定 action_id 的相同和子本地的所有 element_id.

Retrieves all element_id for same and child-locals of a given action_id.

调用:

SELECT * FROM f_elem(3);

element_id
-----------
6
7

db<>fiddle 这里
OLD sqlfiddle

出于多种原因,这应该大大更快.最明显的是:

This should be substantially faster already for several reasons. The most obvious ones being:

  • 用纯 SQL 替换 plpgsql 中的慢循环.
  • 缩小递归查询的起始集.
  • 删除不必要且众所周知的缓慢IN 构造.

我使用 SELECT * FROM ... 而不是 SELECT 调用,即使该行只有一列,以获取 SELECT 的列名code>OUT 参数 (element_id) 我在函数头中声明.

I am calling with SELECT * FROM ... instead of just SELECT, even though the row has only a single column, to get the column name of the OUT parameter (element_id) I declared in the function header.

action.action_id 上的索引由主键提供.

An index on action.action_id is provided by the primary key.

但您可能错过了 local.parent_id 上的索引.同时,将覆盖多列索引(Postgres 9.2+)与 parent_id 作为第一个元素,local_id 作为第二个元素.如果表 local 很大,这应该会有很大帮助.对于一张小桌子来说,没有那么多或根本没有:

But you may have missed the index on local.parent_id. While being at it, make that a covering multi-column index (Postgres 9.2+) with parent_id as first element and local_id as second. This should help a lot if the table local is big. Not so much or not at all for a small table:

CREATE INDEX l_mult_idx ON local(parent_id, local_id);

为什么?见:

最后,表上的多列索引element 应该有更多帮助:

Finally, a multi-column index on table element should help some more:

CREATE INDEX e_mult_idx ON element (category_id, local_id, element_id);

第三列element_id 仅用于使其成为覆盖索引.如果您的查询从表 element 中检索更多列,您可能需要向索引添加更多列或删除 element_id.要么会加快速度.

The third column element_id is only useful to make it a covering index. If your query retrieves more columns from table element, you may want to add more columns to the index or drop element_id. Either will make it faster.

如果您的表收到很少或没有更新,则提供共享相同类别的所有对 (action_id, element_id) 的预计算集的物化视图将使这快如闪电.将 (action_id, element_id)(按此顺序)设为主键.

If your tables receive few or no updates, a materialized view providing the pre-computed set of all pairs (action_id, element_id) sharing the same category would make this lightening-fast. Make (action_id, element_id) (in that order) the primary key.

这篇关于使用 id 和树状表的后代连接两个表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆