带有 NOT IN (SELECT ...) 的 DELETE 的性能 [英] Performance of DELETE with NOT IN (SELECT ...)

查看:73
本文介绍了带有 NOT IN (SELECT ...) 的 DELETE 的性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有这两个表,想从 ms_author 中删除所有作者,这些作者不在 author 中.

I have these two tables and want to delete all authors from ms_author, who are not present in author.

author(160 万行)

author (1.6M rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI | true  |
| name  | text        | YES  |     |       |
+-------+-------------+------+-----+-------+

ms_author(120M 行)

ms_author (120M rows)

+-------+-------------+------+-----+-------+
| Field | Type        | Null | Key | index |
+-------+-------------+------+-----+-------+
| id    | text        | NO   | PRI |       |
| name  | text        | YES  |     | true  |
+-------+-------------+------+-----+-------+

这是我的查询:

    DELETE
FROM ms_author AS m
WHERE m.name NOT IN
                   (SELECT a.name
                    FROM author AS a);

我尝试估计查询持续时间:~ 130 小时.
有没有更快的方法来实现这一目标?

I tried to estimate the query duration: ~ 130 hours.
Is there a faster way to achieve this?

详细解释输出

Delete on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)"
  ->  Seq Scan on public.ms_author m  (cost=0.00..2906498718724.75 rows=59946100 width=6)"
        Output: m.ctid"
        Filter: (NOT (SubPlan 1))"
        SubPlan 1"
          ->  Materialize  (cost=0.00..44334.43 rows=1660295 width=15)"
                Output: a.name"
                ->  Seq Scan on public.author a  (cost=0.00..27925.95 rows=1660295 width=15)"
                      Output: a.name"

索引作者(name):

create index author_name on author(name);

索引 ms_author(name):

Indexing ms_author(name):

create index ms_author_name on ms_author(name);

推荐答案

我是反加入"的忠实粉丝.这对大型和小型数据集都有效:

I'm a big fan of the "anti-join." This works efficiently for both large and small datasets:

delete from ms_author ma
where not exists (
  select null
  from author a
  where ma.name = a.name
)

这篇关于带有 NOT IN (SELECT ...) 的 DELETE 的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆