How to efficiently set subtract a join table in PostgreSQL?
I have the following tables:
- work_units: self-explanatory
- workers: self-explanatory
- skills: every work unit requires a number of skills if you want to work on it. Every worker is proficient in a number of skills.
- work_units_skills: join table
- workers_skills: join table
A worker can request the next appropriate free highest priority (whatever that means) unit of work to be assigned to her.
Currently I have:
SELECT work_units.*
FROM work_units
-- some joins
WHERE NOT EXISTS (
    SELECT skill_id
    FROM work_units_skills
    WHERE work_unit_id = work_units.id
    EXCEPT
    SELECT skill_id
    FROM workers_skills
    WHERE worker_id = 1 -- the worker id that made the request
)
-- AND a bunch of other conditions
-- ORDER BY something complex
LIMIT 1
FOR UPDATE SKIP LOCKED;
This condition makes the query 8-10 times slower though.
Is there a better way to express that a work unit's skills should be a subset of the worker's skills, or some other way to improve the current query?
Some more context:
- The skills table is fairly small.
- Both work_units and workers tend to have very few associated skills.
- work_units_skills has an index on work_unit_id.
- I tried moving the query on workers_skills into a CTE. This gave a slight improvement (10-15%), but it's still too slow.
- A work unit with no skill can be picked up by any user. Aka an empty set is a subset of every set.
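To make the setup above concrete, here is a minimal schema sketch. Only the table names and the columns id, work_unit_id, worker_id and skill_id come from the question; the column types and key definitions are assumptions made for illustration.

```sql
-- Hypothetical schema: types and constraints are assumed, not taken from the question.
CREATE TABLE work_units (id bigint PRIMARY KEY);
CREATE TABLE workers    (id bigint PRIMARY KEY);
CREATE TABLE skills     (id bigint PRIMARY KEY);

-- Join table: which skills a work unit requires.
-- The composite primary key rules out duplicate pairs and provides
-- an index whose leading column is work_unit_id.
CREATE TABLE work_units_skills (
    work_unit_id bigint NOT NULL REFERENCES work_units (id),
    skill_id     bigint NOT NULL REFERENCES skills (id),
    PRIMARY KEY (work_unit_id, skill_id)
);

-- Join table: which skills a worker has.
CREATE TABLE workers_skills (
    worker_id bigint NOT NULL REFERENCES workers (id),
    skill_id  bigint NOT NULL REFERENCES skills (id),
    PRIMARY KEY (worker_id, skill_id)
);
```

With this layout a work unit that has no rows in work_units_skills simply requires no skills, which matches the empty-set case described in the last bullet.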
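One simple speed-up would be to use EXCEPT ALL instead of EXCEPT. The latter removes duplicates, which is unnecessary here and can be slow. This assumes work_units_skills holds no duplicate (work_unit_id, skill_id) rows, which a join table normally guarantees.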
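As a sketch, the query from the question with only that one keyword changed might look like this. The elided joins, extra conditions and ORDER BY are left out here, so this is a reduced form rather than the full production query.

```sql
SELECT work_units.*
FROM work_units
WHERE NOT EXISTS (
    -- Skills the work unit requires ...
    SELECT skill_id
    FROM work_units_skills
    WHERE work_unit_id = work_units.id
    -- ... minus the skills the worker has, without the duplicate-removal step.
    EXCEPT ALL
    SELECT skill_id
    FROM workers_skills
    WHERE worker_id = 1 -- the worker id that made the request
)
LIMIT 1
FOR UPDATE SKIP LOCKED;
```

Whether this helps noticeably depends on the plan PostgreSQL chooses, so it is worth comparing both versions with EXPLAIN ANALYZE on real data.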
An alternative that would probably be faster is to use a further NOT EXISTS instead of the EXCEPT:
...
WHERE NOT EXISTS (
    SELECT skill_id
    FROM work_units_skills wus
    WHERE work_unit_id = work_units.id
    AND NOT EXISTS (
        SELECT skill_id
        FROM workers_skills ws
        WHERE worker_id = 1 -- the worker id that made the request
        AND ws.skill_id = wus.skill_id
    )
)
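The nested NOT EXISTS reads as "there is no required skill that the worker lacks". A small self-contained check of that logic, using made-up sample rows in CTEs instead of real tables (the ids and skill numbers are purely illustrative):

```sql
WITH work_units (id) AS (
         VALUES (1), (2), (3)                -- unit 3 requires no skills
     ),
     work_units_skills (work_unit_id, skill_id) AS (
         VALUES (1, 10), (1, 20), (2, 30)
     ),
     workers_skills (worker_id, skill_id) AS (
         VALUES (1, 10), (1, 20)             -- worker 1 has skills 10 and 20
     )
SELECT work_units.id
FROM work_units
WHERE NOT EXISTS (
    -- a skill required by this work unit ...
    SELECT 1
    FROM work_units_skills wus
    WHERE wus.work_unit_id = work_units.id
      AND NOT EXISTS (
          -- ... that worker 1 does not have
          SELECT 1
          FROM workers_skills ws
          WHERE ws.worker_id = 1
            AND ws.skill_id = wus.skill_id
      )
)
ORDER BY work_units.id;
-- Returns ids 1 and 3: unit 1 needs only skills the worker has,
-- unit 3 needs nothing, and unit 2 is excluded because skill 30 is missing.
```

The unit without any required skills is returned as well, so the empty-set rule from the question still holds with this formulation.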
Demo
http://rextester.com/AGEIS52439 - with the LIMIT removed for testing