Fastest way to execute a delete query in a large table in PostgreSQL


Question

I have a large table and a simple delete query that takes about 84 seconds to delete 13k matching records from it. Below are the JPA queries I have tested so far.

delete from PlanItems p where p.jobId = :jobid

Another variant tested:

delete from PlanItems p where p.jobId in (select pi.jobId from PlanItems pi where pi.jobId = :jobid)

Given that the table PlanItems already has an index defined on column jobId and there is only one foreign key in this table, what are the options to speed up my delete query? I'm using Postgres with Spring JPA for my backend service. Note that I don't want to truncate and re-insert into this table, since many users operate on the same data in parallel. I checked multiple posts on this matter and tried all the options they mention.

Query execution plan details: Query 1-> Query 2->

This is how I measure the latency:

long start = System.currentTimeMillis();
planItemsRepository.deleteByJobId(jobId);
long end = System.currentTimeMillis();
log.info("Took around "+(end-start)+" milliseconds to delete all plan items records by jobId "+jobId);

The repository logic looks like this:

@Transactional
@Modifying
@Query(value = "delete from PlanItems p where p.jobId in (select pi.jobId from PlanItems pi where pi.jobId = :jobid) ")
int deleteByJobId(@Param("jobid") long jobId);

I would like to know how to cut the latency by half or more.

Recommended answer

If this query is taking a long time:

delete from PlanItems p where p.jobId = :jobid

and you have an index on PlanItems(jobId), where jobId is the first column in the index, then you need to consider other issues.

  1. Are there cascading foreign key constraints that use jobId? If so, the cascades may affect those tables, and if they are cascading deletes, deleting those rows may in turn affect more tables.
  2. Is there a delete trigger on the table? If so, your "simple" deletes may be doing a lot more work than you think.
  3. Is the system under heavy load? If so, the time may simply be spent waiting for an opportunity to delete the rows.
  4. Are materialized views built on the table? If so, they might be being refreshed.
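The first two checks can be answered from the PostgreSQL system catalogs. Below is a minimal sketch using plain JDBC, not a definitive tool: the class name `DeleteDiagnostics` is hypothetical, and the physical table name has to be supplied by you, because the question only shows the JPA entity name `PlanItems`. `pg_constraint` and `pg_trigger` are standard catalogs; the queries only read from them.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: read-only catalog lookups for the checks listed above. */
final class DeleteDiagnostics {

    // Foreign keys in OTHER tables that reference the target table.
    // confdeltype is the ON DELETE action code (decoded by describeDeleteAction).
    static final String REFERENCING_FKS =
        "SELECT conname, conrelid::regclass::text, confdeltype::text "
      + "FROM pg_constraint "
      + "WHERE contype = 'f' AND confrelid = CAST(? AS regclass)";

    // User-defined triggers on the target table (internal constraint triggers excluded).
    static final String USER_TRIGGERS =
        "SELECT tgname, tgenabled::text, pg_get_triggerdef(oid) "
      + "FROM pg_trigger "
      + "WHERE tgrelid = CAST(? AS regclass) AND NOT tgisinternal";

    /** Decodes pg_constraint.confdeltype into the matching ON DELETE action. */
    static String describeDeleteAction(String code) {
        switch (code) {
            case "a": return "NO ACTION";
            case "r": return "RESTRICT";
            case "c": return "CASCADE";
            case "n": return "SET NULL";
            case "d": return "SET DEFAULT";
            default:  return "UNKNOWN(" + code + ")";
        }
    }

    /** Runs one of the three-column queries above and returns each row as a line of text. */
    static List<String> run(Connection conn, String sql, String table) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, table); // assumption: the physical table name, e.g. "plan_items"
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(rs.getString(1) + " | " + rs.getString(2) + " | " + rs.getString(3));
                }
            }
        }
        return rows;
    }
}
```

For example, `DeleteDiagnostics.run(conn, DeleteDiagnostics.REFERENCING_FKS, "plan_items")` would list every foreign key pointing at the table, where `plan_items` is only a guess at the real table name. A row whose action decodes to `CASCADE` means the delete also touches the referencing table.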

If none of these apply, the problem may lie in how you are measuring the elapsed time.
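On the measurement point: the timing in the question wraps the whole repository call. Assuming no outer transaction is active, a `@Transactional` repository method also begins and commits the transaction inside that call, so the measured time covers more than the DELETE statement itself. Below is a small sketch of a timing wrapper based on `System.nanoTime()`, which is intended for measuring elapsed time. The class `Timed` is hypothetical and not part of Spring or JPA.

```java
import java.util.function.Supplier;

/** Hypothetical helper that times a single call with a monotonic clock. */
final class Timed {

    /** The value returned by the call and the elapsed wall-clock time. */
    static final class Result<T> {
        final T value;
        final long elapsedMillis;

        Result(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the action once and reports how long it took, in whole milliseconds. */
    static <T> Result<T> call(Supplier<T> action) {
        long start = System.nanoTime();
        T value = action.get();
        long elapsedNanos = System.nanoTime() - start;
        return new Result<>(value, elapsedNanos / 1_000_000);
    }
}
```

With the repository from the question this would be used as `Timed.Result<Integer> r = Timed.call(() -> planItemsRepository.deleteByJobId(jobId));`. Comparing `r.elapsedMillis` with the duration the database itself reports for the statement (for instance via the `log_min_duration_statement` setting) shows whether the time is spent inside PostgreSQL or around it.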
