Slow Postgres 9.3 queries
Question
I'm trying to figure out if I can speed up two queries on a database storing email messages. Here's the table:
\d messages;
Table "public.messages"
Column | Type | Modifiers
----------------+---------+-------------------------------------------------------
id | bigint | not null default nextval('messages_id_seq'::regclass)
created | bigint |
updated | bigint |
version | bigint |
threadid | bigint |
userid | bigint |
groupid | bigint |
messageid | text |
date | bigint |
num | bigint |
hasattachments | boolean |
placeholder | boolean |
compressedmsg | bytea |
revcount | bigint |
subject | text |
isreply | boolean |
likes | bytea |
isspecial | boolean |
pollid | bigint |
username | text |
fullname | text |
Indexes:
"messages_pkey" PRIMARY KEY, btree (id)
"idx_unique_message_messageid" UNIQUE, btree (groupid, messageid)
"idx_unique_message_num" UNIQUE, btree (groupid, num)
"idx_group_id" btree (groupid)
"idx_message_id" btree (messageid)
"idx_thread_id" btree (threadid)
"idx_user_id" btree (userid)
The output of
SELECT relname, relpages, reltuples::numeric, pg_size_pretty(pg_table_size(oid)) FROM pg_class WHERE oid='messages'::regclass;
is:
relname | relpages | reltuples | pg_size_pretty
----------+----------+-----------+----------------
messages | 1584913 | 7337880 | 32 GB
Some possibly relevant postgres config values:
shared_buffers = 1536MB
effective_cache_size = 4608MB
work_mem = 7864kB
maintenance_work_mem = 384MB
Here is the EXPLAIN ANALYZE output for the first query:
explain analyze SELECT * FROM messages WHERE groupid=1886 ORDER BY id ASC LIMIT 20 offset 4440;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=479243.63..481402.39 rows=20 width=747) (actual time=14167.374..14167.408 rows=20 loops=1)
-> Index Scan using messages_pkey on messages (cost=0.43..19589605.98 rows=181490 width=747) (actual time=14105.172..14167.188 rows=4460 loops=1)
Filter: (groupid = 1886)
Rows Removed by Filter: 2364949
Total runtime: 14167.455 ms
(5 rows)
The second query:
explain analyze SELECT * FROM messages WHERE groupid=1886 ORDER BY created ASC LIMIT 20 offset 4440;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=538650.72..538650.77 rows=20 width=747) (actual time=671.983..671.992 rows=20 loops=1)
-> Sort (cost=538639.62..539093.34 rows=181490 width=747) (actual time=670.680..671.829 rows=4460 loops=1)
Sort Key: created
Sort Method: top-N heapsort Memory: 7078kB
-> Bitmap Heap Scan on messages (cost=7299.11..526731.31 rows=181490 width=747) (actual time=84.975..512.969 rows=200561 loops=1)
Recheck Cond: (groupid = 1886)
-> Bitmap Index Scan on idx_unique_message_num (cost=0.00..7253.73 rows=181490 width=0) (actual time=57.239..57.239 rows=203423 loops=1)
Index Cond: (groupid = 1886)
Total runtime: 672.787 ms
(9 rows)
This is on an SSD-backed instance with 8 GB of RAM; the load average is usually around 0.15.
I'm definitely no expert. Is this a case of the data just being spread throughout the disk? Is my only solution to use CLUSTER?
One thing I don't understand is why it is using idx_unique_message_num as the index for the second query. And why is ordering by id so much slower?
Recommended answer
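If there are many records with groupid=1886 (from a comment: there are 200,563), then reaching the rows at an OFFSET within a sorted subset requires either sorting that whole subset (or running an equivalent heap algorithm) or walking some other index in the requested order and discarding the rows that do not match. Both are slow. That is what the two plans show: the first walks messages_pkey in id order and throws away 2,364,949 rows that fail the groupid filter, while the second fetches all 200,561 matching rows and then sorts them by created.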
This could be solved by adding indexes that cover both the filter and the sort order: in this case, one on (groupid, id) and another on (groupid, created).
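As a rough sketch, the two indexes could be created along these lines. The index names are made up for illustration; only the column lists come from the answer. CONCURRENTLY is an optional choice here, on the assumption that the table is live and writes should not be blocked during the build; note that it cannot be run inside a transaction block.

```sql
-- Hypothetical index names; the column lists are the ones suggested in the answer.
-- CONCURRENTLY builds the index without blocking writes to the table,
-- at the cost of a slower build. It must run outside a transaction block.
CREATE INDEX CONCURRENTLY idx_messages_groupid_id
    ON messages (groupid, id);

CREATE INDEX CONCURRENTLY idx_messages_groupid_created
    ON messages (groupid, created);

-- Refresh planner statistics, then re-run the original queries to confirm
-- that the plans now use the new indexes.
ANALYZE messages;

EXPLAIN ANALYZE
SELECT * FROM messages WHERE groupid = 1886 ORDER BY id ASC LIMIT 20 OFFSET 4440;

EXPLAIN ANALYZE
SELECT * FROM messages WHERE groupid = 1886 ORDER BY created ASC LIMIT 20 OFFSET 4440;
```

With an index whose leading column matches the WHERE clause and whose second column matches the ORDER BY, the planner can read rows for one group already in sorted order, so it only has to step past the OFFSET rows instead of filtering or sorting the whole group.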
From a comment: this indeed helped, bringing the runtime down to 5-10 ms.