Why does PostgreSQL query performance drop over time, but recover when the index is rebuilt?


Question


According to this page in the manual, indexes don't need to be maintained. However, we are running a PostgreSQL table with a continuous rate of updates, deletes and inserts, and over time (a few days) we see significant query degradation. If we drop and recreate the index, query performance is restored.


We are using out of the box settings.
The table in our test is currently starting out empty and grows to half a million rows. It has a fairly large row (lots of text fields).


We are searching based on an index, not the primary key (I've confirmed the index is being used, at least under normal conditions).


The table is being used as a persistent store for a single process. We are using PostgreSQL on Windows with a Java client.


I'm willing to give up insert and update performance to keep up the query performance.


We are considering rearchitecting the application so that data is spread across various dynamic tables in a manner that allows us to drop and rebuild indexes periodically without impacting the application. However, as always, there is a time crunch to get this to work and I suspect we are missing something basic in our configuration or usage.


We have considered forcing vacuuming and rebuild to run at certain times, but I suspect the locking period for such an action would cause our query to block. This may be an option, but there are some real-time (windows of 3-5 seconds) implications that require other changes in our code.


Additional information: Table and index

CREATE TABLE icl_contacts
(
  id bigint NOT NULL,
  campaignfqname character varying(255) NOT NULL,
  currentstate character(16) NOT NULL,
  xmlscheduledtime character(23) NOT NULL,
  -- ...
  -- 25 or so other fields, most of them fixed or varying character fields
  -- ...
  CONSTRAINT icl_contacts_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE icl_contacts OWNER TO postgres;

CREATE INDEX icl_contacts_idx
  ON icl_contacts
  USING btree
  (xmlscheduledtime, currentstate, campaignfqname);

EXPLAIN ANALYZE output:

Limit  (cost=0.00..3792.10 rows=750 width=32) (actual time=48.922..59.601 rows=750 loops=1)
  ->  Index Scan using icl_contacts_idx on icl_contacts  (cost=0.00..934580.47 rows=184841 width=32) (actual time=48.909..55.961 rows=750 loops=1)
        Index Cond: ((xmlscheduledtime < '2010-05-20T13:00:00.000'::bpchar) AND (currentstate = 'SCHEDULED'::bpchar) AND ((campaignfqname)::text = '.main.ee45692a-6113-43cb-9257-7b6bf65f0c3e'::text))


And, yes, I am aware that there are a variety of things we could do to normalize and improve the design of this table. Some of these options may be available to us.


My focus in this question is on understanding how PostgreSQL manages the index and the query over time (understanding why, not just fixing it). If this were to be done over or significantly refactored, there would be a lot of changes.

Recommended answer


Auto vacuum should do the trick, provided you configured it for your desired performance.
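
As a rough sketch of what "configuring it" can look like for the table in the question: the per-table storage parameters below ask autovacuum to visit icl_contacts more often than the global defaults would, and the two queries afterwards show whether it is keeping up. The numeric values are illustrative assumptions, not figures from the answer, and should be tuned against the real workload.

```sql
-- Assumed values for illustration only: trigger a vacuum once roughly 2% of
-- the rows are dead, and an analyze once roughly 1% of the rows have changed.
ALTER TABLE icl_contacts SET (
  autovacuum_vacuum_scale_factor  = 0.02,
  autovacuum_analyze_scale_factor = 0.01
);

-- Check whether autovacuum is keeping up: a dead-tuple count that keeps
-- growing, or an old last_autovacuum timestamp, suggests it is falling behind.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'icl_contacts';

-- Track the on-disk size of the index over time to spot bloat.
SELECT pg_size_pretty(pg_relation_size('icl_contacts_idx')) AS index_size;
```

If the index size keeps climbing while the row count stays flat, that points at dead entries not being cleaned up fast enough rather than at genuine data growth.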


Notes:

VACUUM FULL: this compacts the table and reclaims loads of disk space. It takes an exclusive lock on the whole table, so queries against it block until it finishes.


VACUUM: this marks the space held by dead rows as reusable, in the table and in its indexes, and reclaims some disk space. It can run in parallel with the production system, but it generates a lot of I/O, which can impact performance.


ANALYZE: this rebuilds the query planner statistics. It runs as part of VACUUM ANALYZE, but can also be run on its own.
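
The commands in the notes above can be run by hand against the table from the question, for example as below. The last part is a swapped-in technique the answer does not mention: rebuilding the index with CREATE INDEX CONCURRENTLY, which is one way to avoid the long lock the question worries about. The name icl_contacts_idx_new is hypothetical, and the parenthesized VACUUM option syntax assumes PostgreSQL 9.0 or later.

```sql
-- Reclaim dead row versions and refresh planner statistics in one pass.
-- Plain VACUUM does not take an exclusive lock, so reads and writes continue.
VACUUM (VERBOSE, ANALYZE) icl_contacts;

-- Refresh planner statistics only.
ANALYZE icl_contacts;

-- Rebuild the index without blocking writes while the new copy is built.
-- icl_contacts_idx_new is a hypothetical name used only for the swap.
CREATE INDEX CONCURRENTLY icl_contacts_idx_new
  ON icl_contacts
  USING btree
  (xmlscheduledtime, currentstate, campaignfqname);

-- The swap itself needs only a brief exclusive lock.
DROP INDEX icl_contacts_idx;
ALTER INDEX icl_contacts_idx_new RENAME TO icl_contacts_idx;
```

VACUUM and CREATE INDEX CONCURRENTLY cannot run inside a transaction block, so these statements need to be issued one at a time with autocommit enabled (worth checking in a JDBC client, where autocommit may have been switched off).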

A more detailed description is available here.
