Why does my postgres table get much bigger under update?

Question

I have a table whose only index is on two columns (point of sale and product ID), and the table is clustered on that index.

On a weekly basis, I update other columns in the table. When I do that, the size of the table and relations increases by about 5 times. I then cluster the table, and the size reverts to what it was pre-update.

This seems strange to me. If I were updating the indexed columns, I'd expect some bloat that I'd need to deal with by vacuuming, but since the indexed columns are not modified by any of the updates, I don't understand why updating the table would lead to an increase in size.

Is this working as expected, or does this point to a problem in my configuration? Is there a way to stop this?

[Postgres 9.1 on Windows 7]

Recommended answer

Even when no indexed columns change, PostgreSQL still has to do an MVCC update: it writes a new version of the row, and the old version is discarded later by vacuum. Otherwise it couldn't roll back a transaction if there was an error midway through or it crashed. (PostgreSQL doesn't have an undo log; it uses the heap instead.)
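One way to watch this happen is to compare the table's size and dead-row count before and after the weekly update. The queries below are a minimal sketch; the table name `sales` is hypothetical and stands in for the real table.

```sql
-- "sales" is a hypothetical table name; substitute your own.

-- Size of the heap alone, and of the heap plus its indexes and TOAST data.
SELECT pg_size_pretty(pg_relation_size('sales'))       AS heap_size,
       pg_size_pretty(pg_total_relation_size('sales')) AS total_size;

-- Approximate number of old row versions that vacuum has not yet reclaimed.
-- These counters come from the statistics collector, so they are estimates.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'sales';
```

If `n_dead_tup` is roughly the number of rows touched by the update, the extra space is most likely old row versions waiting for vacuum.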

HOT updates can only be done if there's enough free space in a page to avoid having to write the new row to a different page, where new index entries must then be created. So PostgreSQL still has to write new rows to new pages on the end of the table, even though you aren't updating indexed columns, because there's just nowhere to put the new row versions on the current pages.
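To check whether your updates are actually being done as HOT updates, the cumulative statistics can be compared. This is a sketch that again assumes a hypothetical table named `sales`.

```sql
-- "sales" is a hypothetical table name; substitute your own.
-- n_tup_upd counts all row updates; n_tup_hot_upd counts the subset
-- that were HOT updates (new version placed on the same page).
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'sales';
```

If `n_tup_hot_upd` stays far below `n_tup_upd` after a bulk update, most new row versions are probably landing on other pages, which fits the growth described in the question.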

I'd usually only expect a doubling of space, but if you're doing a series of updates without vacuum catching up in between then more increases would be expected. Try to do all your updates in one pass or VACUUM between passes.
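As a rough illustration of both options, assuming a hypothetical table `sales` with hypothetical non-indexed columns `weekly_qty` and `weekly_revenue` (the assigned values are placeholders):

```sql
-- Option 1: one pass that sets every column at once,
-- so each row gets only one new version.
UPDATE sales
SET weekly_qty     = 0,
    weekly_revenue = 0;

-- Option 2: if separate passes are unavoidable, vacuum between them
-- so the next pass can reuse the space freed by the previous one.
-- VACUUM cannot run inside a transaction block.
UPDATE sales SET weekly_qty = 0;
VACUUM sales;
UPDATE sales SET weekly_revenue = 0;
VACUUM sales;
```

Plain `VACUUM` marks the dead space as reusable rather than returning it to the operating system, so the file normally stays at its peak size; the benefit is that later passes fill the freed space instead of extending the table further.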

To make the updates faster at the cost of some disk space, ALTER TABLE to set a non-100 FILLFACTOR on your table before you CLUSTER it. I suggest 45, enough room for one new version of each row plus a little wiggle space. That'll make the table twice the size but reduce the churn of all that rewriting. It'll let HOT updates occur and also speed up updates because there's no need to extend the relation all the time.
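A minimal sketch of that sequence, assuming a hypothetical table `sales` and a hypothetical index `sales_pos_product_idx` on the two clustering columns:

```sql
-- "sales" and "sales_pos_product_idx" are hypothetical names.

-- Changing the storage parameter alone does not rewrite existing pages;
-- it only affects pages written from now on.
ALTER TABLE sales SET (fillfactor = 45);

-- CLUSTER rewrites the whole table in index order, and the rewrite
-- leaves the configured share of each page free.
CLUSTER sales USING sales_pos_product_idx;

-- Optional check: the table should now be roughly twice its packed size.
SELECT pg_size_pretty(pg_relation_size('sales')) AS heap_size;
```

`CLUSTER` takes an exclusive lock on the table while it runs, so it is best scheduled for a quiet period.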

Best of all - try to find a way to avoid having to bulk update the whole table periodically.
