使用 id 和树状表的后代连接两个表 [英] Join two tables using id and descendants from tree like table
问题描述
我有以下表格:
CREATE TABLE element (
element_id serial PRIMARY KEY,
local_id integer,
name varchar,
CONSTRAINT fk_element_local_id FOREIGN KEY (local_id)
REFERENCES local (local_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
CREATE TABLE local (
local_id serial PRIMARY KEY,
parent_id integer,
name varchar,
CONSTRAINT fk_local_parent_id_local_id FOREIGN KEY (parent_id)
REFERENCES local (local_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE SET NULL
);
CREATE TABLE category (
category_id serial PRIMARY KEY,
name varchar
);
CREATE TABLE action (
action_id serial PRIMARY KEY,
local_id integer,
category_id integer,
CONSTRAINT fk_action_local_id FOREIGN KEY (local_id)
REFERENCES local (local_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT fk_action_element_id FOREIGN KEY (element_id)
REFERENCES element (element_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
我想从一个动作中选择所有元素.如果元素的局部是动作局部的后代,它也应该出现.
例子:
I want to select all elements from an action. If the local from the element is descendant from the local of the action it should appear as well.
example:
表local
:
|local_id | parent_id | name |
|---------+-----------+------|
|1 |NULL |A |
|2 |1 |B |
|3 |1 |C |
|4 |3 |D |
|5 |NULL |E |
|6 |5 |F |
|_________|___________|______|
表格category
:
| category_id | name |
|-------------+------|
|1 |A |
|2 |B |
|2 |C |
|_____________|______|
表格元素
:
|element_id | local_id | name | category_id |
|-----------+----------+------+-------------|
|1 |1 |A | 1 |
|2 |2 |B | 2 |
|3 |2 |C | 1 |
|4 |4 |D | 2 |
|5 |5 |E | 2 |
|6 |6 |F | 1 |
|7 |6 |G | 1 |
|___________|__________|______|_____________|
表格action
:
|action_id | local_id | category_id |
|----------+----------+-------------|
| 1 | 1 | 2 |
| 2 | 3 | 1 |
| 3 | 5 | 1 |
| 4 | 6 | 1 |
|__________|__________|_____________|
我想要的查询结果:
CASE: action_id = 1
return: element_id: 2,4
CASE: action_id = 2
return: element_id: null
CASE: action_id = 3
return: element_id: 6,7
我已经创建了一个函数,该函数返回包括实际节点在内的所有后代,但由于调用该函数数千次时的性能,我遇到了困难.我的函数如下所示:
I've made a function that returns all the descendants including the actual node but I'm having a hard time because of the performance when calling the function thousands of times. My function looks like this:
CREATE OR REPLACE FUNCTION fn_local_get_childs(_parent_id integer)
RETURNS SETOF integer AS
$BODY$
DECLARE
r integer;
BEGIN
FOR r IN SELECT local_id FROM local WHERE local_id IN (
(WITH RECURSIVE parent AS
(
SELECT local_id , parent_id from local WHERE local_id = _parent_id
UNION ALL
SELECT t.local_id , t.parent_id FROM parent
INNER JOIN local t ON parent.local_id = t.parent_id
)
SELECT local_id FROM parent
)
)
LOOP
RETURN NEXT r;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
我的超慢查询如下所示:
And my ultra slow query looks like these:
select e.element_id, a.action_id
from action a
join element e on (
e.local_id=any(select fn_local_get_childs(a.local_id)) AND
e.category_id=a.category_id)
有没有办法在单个查询中组合函数中使用的递归?
Is there a way of combining the recursion used in the function in a single query?
推荐答案
集成查询
改进多处逻辑,您可以将整个操作集成在一个查询中.包装成 SQL 函数是可选的:
Integrate query
Improving the logic in several places, you can integrate the whole operation in a single query. Wrapping into an SQL function is optional:
CREATE OR REPLACE FUNCTION f_elems(_action_id integer)
RETURNS SETOF integer AS
$func$
WITH RECURSIVE l AS (
SELECT a.category_id, l.local_id
FROM action a
JOIN local l USING (local_id)
WHERE a.action_id = $1
UNION ALL
SELECT l.category_id, c.local_id
FROM l
JOIN local c ON c.parent_id = l.local_id -- c for "child"
)
SELECT e.element_id
FROM l
JOIN element e USING (category_id, local_id);
$func$ LANGUAGE sql STABLE;
检索给定 action_id
的相同和子本地的所有 element_id
.
Retrieves all element_id
for same and child-locals of a given action_id
.
调用:
SELECT * FROM f_elem(3);
element_id
-----------
6
7
出于多种原因,这应该大大更快.最明显的是:
This should be substantially faster already for several reasons. The most obvious ones being:
- 用纯 SQL 替换 plpgsql 中的慢循环.
- 缩小递归查询的起始集.
- 删除不必要且众所周知的缓慢
IN
构造.
我使用 SELECT * FROM ...
而不是 SELECT
调用,即使该行只有一列,以获取 SELECT
的列名code>OUT 参数 (element_id
) 我在函数头中声明.
I am calling with SELECT * FROM ...
instead of just SELECT
, even though the row has only a single column, to get the column name of the OUT
parameter (element_id
) I declared in the function header.
action.action_id
上的索引由主键提供.
An index on action.action_id
is provided by the primary key.
但您可能错过了 local.parent_id
上的索引.同时,将覆盖多列索引(Postgres 9.2+)与 parent_id
作为第一个元素,local_id
作为第二个元素.如果表 local
很大,这应该会有很大帮助.对于一张小桌子来说,没有那么多或根本没有:
But you may have missed the index on local.parent_id
. While being at it, make that a covering multi-column index (Postgres 9.2+) with parent_id
as first element and local_id
as second. This should help a lot if the table local
is big. Not so much or not at all for a small table:
CREATE INDEX l_mult_idx ON local(parent_id, local_id);
为什么?见:
最后,表上的多列索引element
应该有更多帮助:
Finally, a multi-column index on table element
should help some more:
CREATE INDEX e_mult_idx ON element (category_id, local_id, element_id);
第三列element_id
仅用于使其成为覆盖索引.如果您的查询从表 element
中检索更多列,您可能需要向索引添加更多列或删除 element_id
.要么会加快速度.
The third column element_id
is only useful to make it a covering index. If your query retrieves more columns from table element
, you may want to add more columns to the index or drop element_id
. Either will make it faster.
如果您的表收到很少或没有更新,则提供共享相同类别的所有对 (action_id, element_id)
的预计算集的物化视图将使这快如闪电.将 (action_id, element_id)
(按此顺序)设为主键.
If your tables receive few or no updates, a materialized view providing the pre-computed set of all pairs (action_id, element_id)
sharing the same category would make this lightening-fast. Make (action_id, element_id)
(in that order) the primary key.
这篇关于使用 id 和树状表的后代连接两个表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!