PostgreSQL slow on a large table with arrays and lots of updates

Question

I have a fairly large table (20M records) with a three-column index and an array column. The array column is updated daily (by appending new values) for all rows. There are also inserts, but far fewer than updates.

The data in the array represents daily measurements for the three keys, something like this: [[date_id_1, my_value_for_date_1], [date_id_2, my_value_for_date_2]]. It is used to draw a graph of those daily values. Say I want to visualize the value for the key (a, b, c) over time: I run SELECT values FROM t WHERE a = my_a AND b = my_b AND c = my_c, then use the values array to draw the graph.
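To make the setup concrete, here is a minimal sketch of what such a table and its daily append might look like. The column types, the index name t_a_b_c_idx, and the literal key and measurement values are all assumptions for illustration; the question does not give them.

```sql
-- Hypothetical schema: column types are assumed, not taken from the question.
CREATE TABLE t (
    a integer NOT NULL,
    b integer NOT NULL,
    c integer NOT NULL,
    -- Quoted defensively, because VALUES is also an SQL keyword.
    "values" integer[]
);

-- The three-column index used for lookups by key.
CREATE INDEX t_a_b_c_idx ON t (a, b, c);

-- Daily append of one [date_id, value] pair for a single key.
-- The numbers are made-up sample data.
UPDATE t
SET "values" = "values" || ARRAY[[20100115, 42]]
WHERE a = 1 AND b = 2 AND c = 3;

-- Read the whole series back for charting.
SELECT "values" FROM t WHERE a = 1 AND b = 2 AND c = 3;
```

Note that each append writes a new version of the entire row, including the full array, so the cost of one update grows as the array grows.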

Performance of the updates (which happen in bulk once a day) has worsened considerably over time.

I am using PostgreSQL 8.3.8.

Can you give me any hints about where to look for a solution? It could be anything from tweaking some parameters in PostgreSQL to moving to another database (I guess a non-relational database would be better suited to this particular table, but I don't have much experience with those).

Recommended answer

I would take a look at the FILLFACTOR for the table. By default it is set to 100; you could lower it to 70 (to start with). After that, you have to run VACUUM FULL to rebuild the table.

ALTER TABLE tablename SET (FILLFACTOR = 70);
VACUUM FULL tablename;
REINDEX TABLE tablename;

This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. If your table is already somewhat fragmented from many previous updates, it might already be sparse enough. The database then also has the option of doing HOT updates, assuming the column you are updating is not involved in any index.
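One way to check whether the change is helping is to look at the table's storage options and update statistics. The queries below are a sketch, assuming the table is literally named tablename as in the commands above; they rely on the pg_class catalog, the pg_stat_user_tables view, and the size functions, all of which should be available in 8.3.

```sql
-- Confirm the storage parameter was recorded (expect fillfactor=70 in reloptions).
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'tablename';

-- Compare total updates with HOT updates, and look at dead-row counts.
SELECT relname, n_tup_upd, n_tup_hot_upd, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'tablename';

-- Track heap size and total size (including indexes and TOAST) over time.
SELECT pg_size_pretty(pg_relation_size('tablename')) AS heap_size,
       pg_size_pretty(pg_total_relation_size('tablename')) AS total_size;
```

If n_tup_hot_upd stays small relative to n_tup_upd after a daily batch, the updated rows are probably not fitting on their original pages, and a lower FILLFACTOR may be worth trying.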
