Postgres 8.4.4(在Win7 x64上为x32)在小表上的更新速度非常慢 [英] Postgres 8.4.4 (x32 on Win7 x64) very slow UPDATE on small table

查看:121
本文介绍了Postgres 8.4.4(在Win7 x64上为x32)在小表上的更新速度非常慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个非常简单的更新语句:

I have a very simple update statement:

UPDATE W SET state='thing'
WHERE state NOT IN ('this','that') AND losttime < CURRENT_TIMESTAMP;

表W仅具有90行,尽管每行的Losttime和state列均更新为每约10秒秒。有关于状态和丢失时间的索引(以及主索引)。

The table W only has 90 rows, though the losttime and state columns for each row are updated each about every ~10s seconds. There are indexes on state and losttime (as well as the primary index).

我注意到大型数据库(即其他表有很多条目,表W)在一段时间内,查询变得越来越慢。运行48小时后,我通过在PqAdminIII的查询窗口中运行它来对其进行计时,执行该过程需要17分钟!

I'm noticing with large databases (i.e. the other tables have a lot of entries, not table W) over a period of time, the query gets slower and slower and slower. After running for 48 hours, I'm timing it by running it in the query window of PqAdminIII and it's taken 17 minutes to execute!

我对另一个查询有类似的查询表显示了相同的问题:

I have a similar query on another table that's showing the same problem:

UPDATE H SET release='1' 
WHERE a NOT IN (SELECT id from A WHERE state!='done') AND release!='1';

H没有任何索引,但是我试图在H(release)上放入和删除索引行为没有改变。数据库运行48小时并且表H具有约10万行后,此查询将花费27分钟。 Postgres服务器将在查询期间完全固定一个线程(100%CPU使用率),因此看起来网络,磁盘等都没有争用。

H does not have any indexes, but I have tried putting and removing an index on H(release) with no change in behaviour. This query, after the database has been up for 48 hours and the table H has ~100k rows, is taking 27 minutes. Postgres server will have a thread completely pegged (100% CPU utilization) for the duration of the query, so it doesn't look like there's any contention for network, disk, etc.

因此,我大致上的行为是我的数据库按预期运行了大约5分钟,然后随着与维护相关的基本UPDATE命令开始花费越来越长的时间,一切逐渐停止。到第二天,要花一个小时进行一个简单的维护周期(少量的UPDATES),而这些周期一开始的运行时间约为100ms。在我看来,性能下降与数据库中的信息量呈超线性关系-可能是N ^ 2或诸如此类。

So in broad strokes, the behaviour I see is that my database runs as expected for about 5 minutes, then gradually everything grinds to a halt as basic maintenance-related UPDATE commands start to take longer and longer to run. By the second day, it's taking an hour to do a simple maintenance cycle (a handful of UPDATES) which were running ~100ms at the outset. It seems clear to me that the performance degrade is super-linear with the amount of information in the database -- maybe N^2 or some such.

Autovacuum正在使用默认值。我再次通读了手册,没有发现任何让我惊讶的东西。

Autovacuum is on using the defaults. I read through the manual (again) and didn't see anything that jumped out at me.

我在这里挠头。我看不到9.0.1和9.0.2发行说明中似乎相关的任何错误修复。谁能帮助我了解发生了什么事?谢谢,M

I'm scratching my head here. I don't see any bug fixes that seem relevant in 9.0.1 and 9.0.2 release notes. Can anyone help me to understand what is happening? Thanks, M

-xxxx-

好的,所以我这里可能有两个问题。

Okay, so I may have two problems here.

第一个更新现在运行速度很快。不知道发生了什么,因此我假设我需要更频繁地运行VACUUM / ANALYZE或某种组合-每分钟左右。我真的很想知道为什么autovacuum不能帮我做到这一点。

The first update appears to run fast now. Not sure what happened, so I'll proceed there with the assumption that I need to run VACUUM / ANALYZE or some combination more frequently -- say every minute or so. I would really like to know why autovacuum isn't doing this for me.

第二次更新继续缓慢进行。查询计划表明索引没有得到有效使用,并且发生了80k * 30k的交叉,这可能说明了我似乎正在观察的超线性运行时。 (每个人都同意该计划的解释吗?)

The second update continues to run slowly. The query plan suggests that indexes are not being used effectively and that there is a 80k*30k cross occurring, which could account for super-linear runtime that I seem to be observing. (Does everyone agree with this interpretation of the plan?)

我可以将UPDATE转换为SELECT:

I can convert the UPDATE to a SELECT:

SELECT * from H
where a not in (SELECT id from A where state='done') AND release!='1';

,运行时间类似(27分钟)。

with a similar runtime (27 minutes).

如果我不信任postgres优化器,请执行以下操作:

If I don't trust the postgres optimizer and do this:

WITH r as (select id from A where state='done')
SELECT a from H 
JOIN on H.a=r.id 
WHERE H.released='0';

然后查询运行时间约为500毫秒。

then the query runs in ~500ms.

如何将这些知识转换回以可接受的速度运行的UPDATE?
我的尝试:

How do I translate this knowledge back into an UPDATE that runs with acceptable speed? My attempt:

UPDATE H SET release='1'
FROM A
where A.state!='done' AND release!='1' AND A.id=H.a;

运行大约140秒,虽然更快,但仍然非常慢。

runs in about 140 seconds, which is faster, but still very very slow.

我可以从这里去哪里?

-xxxx-

VACUUM ANALYZE已添加为例行维护的一部分,其中应用程序大约每分钟运行一次,而不依赖于正在运行的任何自动清理。

VACUUM ANALYZE has been added as part of "routine maintenance" where the application will run it approximately once every minute or so independently of any autovacuum that is running.

此外,重写第二个查询以消除已知的慢NOT NOT子句,将其替换为左反半连接(是吗?)

Also, rewrote the second query to eliminate the known-to-be-slow NOT IN clause, replacing it with a "Left Anti-Semi Join" (huh?)

UPDATE H SET release='1' 
WHERE release='0' AND NOT EXISTS (SELECT * FROM A WHERE id=H.a AND state!='done');


推荐答案

PostgreSQL 实现 MVCC

这意味着每次进行更新时,都会复制一个新的行

This means that each time you make an update, a new copy of row is created and the old one is marked as deleted (but is not physically deleted).

这会减慢查询速度。

您应该及时运行 VACUUM

PostgreSQL 8.4.4 运行 autovacuum 守护程序来执行此操作,但是安装过程中可能会出现一些问题。

PostgreSQL 8.4.4 runs autovacuum daemon to do this, but it may have some problems on your installation.

手动运行 VACUUM 会改善这种情况吗?

Does the situation improve when you run VACUUM manually?

这篇关于Postgres 8.4.4(在Win7 x64上为x32)在小表上的更新速度非常慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆