How to efficiently set subtract a join table in PostgreSQL?


Problem description

I have the following tables:

  • work_units - self explanatory
  • workers - self explanatory
  • skills - every work unit requires a number of skills if you want to work on it. Every worker is proficient in a number of skills.
  • work_units_skills - join table
  • workers_skills - join table

A worker can request that the next suitable, free, highest-priority (whatever that means) unit of work be assigned to her.


Currently I have:

SELECT work_units.*
FROM work_units
-- some joins
WHERE NOT EXISTS (
        SELECT skill_id
        FROM work_units_skills
        WHERE work_unit_id = work_units.id

        EXCEPT

        SELECT skill_id
        FROM workers_skills
        WHERE worker_id = 1 -- the worker id that made the request
      )
-- AND a bunch of other conditions
-- ORDER BY something complex
LIMIT 1
FOR UPDATE SKIP LOCKED;

This condition makes the query 8-10 times slower though.

Is there a better way to express that a work unit's skills should be a subset of the worker's skills, or something else that would improve the current query?


Some more context:

  • The skills table is fairly small.
  • Both work_units and workers tend to have very few associated skills.
  • work_units_skills has an index on work_unit_id.
  • I tried moving the query on workers_skills into a CTE. This gave a slight improvement (10-15%), but it's still too slow.
  • A work unit with no skills can be picked up by any user. That is, the empty set is a subset of every set.

Solution

One simple speed-up would be to use EXCEPT ALL instead of EXCEPT. The latter removes duplicates, which is unnecessary here and can be slow.
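
Applied to the query from the question, this is a one-keyword change. The sketch below keeps the question's elided joins, conditions, and ordering as comments. It assumes each (work_unit_id, skill_id) pair appears at most once in work_units_skills; if that table could contain duplicate pairs, EXCEPT ALL might leave a leftover row that plain EXCEPT would have removed, and the result would differ.

```sql
SELECT work_units.*
FROM work_units
-- some joins (elided in the question)
WHERE NOT EXISTS (
        SELECT skill_id
        FROM work_units_skills
        WHERE work_unit_id = work_units.id

        EXCEPT ALL  -- keeps duplicates instead of removing them

        SELECT skill_id
        FROM workers_skills
        WHERE worker_id = 1 -- the worker id that made the request
      )
-- AND a bunch of other conditions
-- ORDER BY something complex
LIMIT 1
FOR UPDATE SKIP LOCKED;
```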

An alternative that would probably be faster is to use a further NOT EXISTS instead of the EXCEPT:

...
WHERE NOT EXISTS (
        SELECT skill_id
        FROM work_units_skills wus
        WHERE work_unit_id = work_units.id
        AND NOT EXISTS (
            SELECT skill_id
            FROM workers_skills ws
            WHERE worker_id = 1 -- the worker id that made the request
              AND ws.skill_id = wus.skill_id
        )
      )

Demo

http://rextester.com/AGEIS52439 - with the LIMIT removed for testing
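
For checking the logic locally, a minimal fixture along the following lines may help. It is not the content of the linked demo: the column types, keys, and sample rows are assumptions made for illustration, and only the columns needed for the skills check are modelled. It runs the EXCEPT form and the nested NOT EXISTS form side by side, including a work unit with no skills.

```sql
-- Hypothetical minimal schema (assumed, not taken from the question).
CREATE TEMP TABLE work_units (id int PRIMARY KEY);
CREATE TEMP TABLE work_units_skills (
    work_unit_id int,
    skill_id     int,
    PRIMARY KEY (work_unit_id, skill_id)
);
CREATE TEMP TABLE workers_skills (
    worker_id int,
    skill_id  int,
    PRIMARY KEY (worker_id, skill_id)
);

INSERT INTO work_units VALUES (1), (2), (3), (4);
INSERT INTO work_units_skills VALUES
    (1, 10),           -- unit 1 needs skill 10
    (2, 10), (2, 20),  -- unit 2 needs skills 10 and 20
    (3, 30);           -- unit 3 needs skill 30; unit 4 needs no skills
INSERT INTO workers_skills VALUES
    (1, 10), (1, 20),  -- worker 1 has skills 10 and 20
    (2, 30);           -- worker 2 has skill 30

-- Form from the question: required skills minus the worker's skills is empty.
SELECT wu.id
FROM work_units wu
WHERE NOT EXISTS (
        SELECT skill_id FROM work_units_skills
        WHERE work_unit_id = wu.id
        EXCEPT
        SELECT skill_id FROM workers_skills
        WHERE worker_id = 1
      )
ORDER BY wu.id;

-- Form from the answer: no required skill that the worker lacks.
SELECT wu.id
FROM work_units wu
WHERE NOT EXISTS (
        SELECT 1
        FROM work_units_skills wus
        WHERE wus.work_unit_id = wu.id
          AND NOT EXISTS (
              SELECT 1
              FROM workers_skills ws
              WHERE ws.worker_id = 1
                AND ws.skill_id = wus.skill_id
          )
      )
ORDER BY wu.id;
```

With these sample rows both queries return units 1, 2 and 4 for worker 1: unit 3 is excluded because skill 30 is missing, and unit 4 qualifies because it requires nothing. The composite primary key on workers_skills in this sketch also gives the inner lookup an index on (worker_id, skill_id); whether the real schema has one is not stated in the question. On the real tables, prefixing each variant (without the FOR UPDATE clause) with EXPLAIN (ANALYZE, BUFFERS) is one way to see which form the planner handles better.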
