PostgreSQL在带有数组和大量更新的大型表上运行缓慢 [英] PostgreSQL slow on a large table with arrays and lots of updates

查看:281
本文介绍了PostgreSQL在带有数组和大量更新的大型表上运行缓慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个很大的表(20M条记录),其中有3列索引和一个数组列.阵列列会每天(通过添加新值)更新所有行.也有插入,但不如更新多.

I have a pretty large table (20M records) which has a 3 column index and an array column. The array column is updated daily (by appending new values) for all rows. There is also inserts, but not as much as there are updates.

数组中的数据表示与三个键相对应的每日测量值,如下所示:[[date_id_1, my_value_for_date_1], [date_id_2, my_value_for_date_2]].它用于绘制这些每日值的图形.假设我想随时间可视化键(a,b,c)的值,请执行SELECT values FROM t WHERE a = my_a AND b = my_b AND c = my_c.然后,我使用values数组绘制图形.

The data in the array represents daily measurements corresponding to the three keys, something like this: [[date_id_1, my_value_for_date_1], [date_id_2, my_value_for_date_2]]. It is used to draw a graph of those daily values. Say I want to visualize the value for the key (a, b, c) over time, I do SELECT values FROM t WHERE a = my_a AND b = my_b AND c = my_c. Then I use the values array to draw the graph.

随着时间的流逝,更新的性能(每天发生一次)大幅度恶化.

Performance of the updates (which happen in a bulk once a day) has worsened considerably over time.

使用PostgreSQL 8.3.8.

Using PostgreSQL 8.3.8.

您能给我一些在哪里寻找解决方案的提示吗?从调整postgres中的一些参数甚至到移动到另一个数据库,它都可以是任何东西(我猜非关系数据库将更适合于此特定表,但我对此没有太多经验).

Can you give me any hints of where to look for a solution? It could be anything from tweaking some parameters in postgres to even moving to another database (I guess a non-relational database would be better suited for this particular table, but I don't have much experience with those).

推荐答案

我将看看该表的FILLFACTOR.默认情况下,它设置为100,您可以将其降低到70(开始).此后,您必须进行VACUUM FULL重建表.

I would take a look at the FILLFACTOR for the table. By default it's set to 100, you could lower it to 70 (to start with). After this, you have to do a VACUUM FULL to rebuild the table.

ALTER TABLE tablename SET (FILLFACTOR = 70);
VACUUM FULL tablename;
REINDEX TABLE tablename;

这使UPDATE有机会将行的更新副本与原始副本放置在同一页面上,这比将其放置在另一页面上更为有效.或者,如果您的数据库已经从许多先前的更新中分离出来了,那么它可能已经足够备用了.现在,您的数据库还可以选择执行 HOT更新,假设您要更新的列不涉及任何索引.

This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. Or if your database is already somewhat fragmented from lots of previous updated, it might already be sparese enough. Now your database also has the option to do HOT updates, assuming the column you are updating is not one involved in any index.

这篇关于PostgreSQL在带有数组和大量更新的大型表上运行缓慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆