Configuration parameter work_mem in PostgreSQL on Linux
Question
I have to optimize queries by tuning basic PostgreSQL server configuration parameters. In the documentation I came across the work_mem parameter. I then checked how changing this parameter would influence the performance of my query (which uses a sort). I measured query execution time with various work_mem settings and was very disappointed.
The table on which I perform my query contains 10,000,000 rows, and there are 430 MB of data to sort (Sort Method: external merge Disk: 430112kB).
With work_mem = 1MB, the EXPLAIN output is:
Total runtime: 29950.571 ms (sort takes about 19300 ms).
Sort (cost=4032588.78..4082588.66 rows=19999954 width=8)
(actual time=22577.149..26424.951 rows=20000000 loops=1)
Sort Key: "*SELECT* 1".n
Sort Method: external merge Disk: 430104kB
With work_mem = 5MB:
Total runtime: 36282.729 ms (sort: 25400 ms).
Sort (cost=3485713.78..3535713.66 rows=19999954 width=8)
(actual time=25062.383..33246.561 rows=20000000 loops=1)
Sort Key: "*SELECT* 1".n
Sort Method: external merge Disk: 430104kB
With work_mem = 64MB:
Total runtime: 42566.538 ms (sort: 31000 ms).
Sort (cost=3212276.28..3262276.16 rows=19999954 width=8)
(actual time=28599.611..39454.279 rows=20000000 loops=1)
Sort Key: "*SELECT* 1".n
Sort Method: external merge Disk: 430104kB
Can anyone explain why performance gets worse? Or suggest any other methods to make query execution faster by changing server parameters?
My query (I know it's not optimal, but I have to benchmark this kind of query):
SELECT n
FROM (
SELECT n + 1 AS n FROM table_name
EXCEPT
SELECT n FROM table_name) AS q1
ORDER BY n DESC;
The full execution plan:
Sort (cost=5805421.81..5830421.75 rows=9999977 width=8) (actual time=30405.682..30405.682 rows=1 loops=1)
Sort Key: q1.n
Sort Method: quicksort Memory: 25kB
-> Subquery Scan q1 (cost=4032588.78..4232588.32 rows=9999977 width=8) (actual time=30405.636..30405.637 rows=1 loops=1)
-> SetOp Except (cost=4032588.78..4132588.55 rows=9999977 width=8) (actual time=30405.634..30405.634 rows=1 loops=1)
-> Sort (cost=4032588.78..4082588.66 rows=19999954 width=8) (actual time=23046.478..27733.020 rows=20000000 loops=1)
Sort Key: "*SELECT* 1".n
Sort Method: external merge Disk: 430104kB
-> Append (cost=0.00..513495.02 rows=19999954 width=8) (actual time=0.040..8191.185 rows=20000000 loops=1)
-> Subquery Scan "*SELECT* 1" (cost=0.00..269247.48 rows=9999977 width=8) (actual time=0.039..3651.506 rows=10000000 loops=1)
-> Seq Scan on table_name (cost=0.00..169247.71 rows=9999977 width=8) (actual time=0.038..2258.323 rows=10000000 loops=1)
-> Subquery Scan "*SELECT* 2" (cost=0.00..244247.54 rows=9999977 width=8) (actual time=0.008..2697.546 rows=10000000 loops=1)
-> Seq Scan on table_name (cost=0.00..144247.77 rows=9999977 width=8) (actual time=0.006..1079.561 rows=10000000 loops=1)
Total runtime: 30496.100 ms
Answer
I posted your query plan on explain.depesz.com; have a look.
The query planner's estimates are terribly wrong in some places. Have you run ANALYZE recently?
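If statistics are stale, a manual ANALYZE is cheap to try. A minimal sketch, assuming the table is named table_name as in the question:

```sql
-- Refresh planner statistics for the table
ANALYZE table_name;

-- Check when statistics were last gathered (standard statistics view)
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'table_name';
```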
Read the chapters in the manual on Statistics Used by the Planner and Planner Cost Constants. Pay special attention to random_page_cost and default_statistics_target.

You might try:
ALTER TABLE diplomas ALTER COLUMN number SET STATISTICS 1000;
ANALYZE diplomas;
Or go even higher for a table with 10M rows. The best value depends on data distribution and your actual queries, so experiment. The default is 100, the maximum is 10000.
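To see what the planner currently knows about the sorted column, you can inspect the per-column statistics. This sketch assumes the column is n, as in the question's query:

```sql
-- Current global statistics target
SHOW default_statistics_target;

-- What the planner has gathered for the column (pg_stats is a standard view)
SELECT attname, n_distinct, null_frac, correlation
FROM pg_stats
WHERE tablename = 'table_name' AND attname = 'n';
```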
For a database of that size, 1 or 5 MB of work_mem is generally not enough. Read the Postgres Wiki page on Tuning Postgres that @aleroot linked to.
Since your query needs 430104kB on disk for the sort according to the EXPLAIN output, you would have to set work_mem to something like 500MB or more to allow in-memory sorting. The in-memory representation of data needs somewhat more space than the on-disk representation. You may be interested in what Tom Lane posted on that matter recently.
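One way to check whether the sort now fits in memory is to re-run the plan in the same session after raising the limit; if work_mem is large enough, the Sort Method line should switch from external merge to quicksort. The 600MB figure here is an assumption, chosen a bit above the 430MB on-disk size because the in-memory form is larger:

```sql
SET work_mem = '600MB';  -- assumption: headroom above the 430MB reported on disk

EXPLAIN ANALYZE
SELECT n
FROM (
    SELECT n + 1 AS n FROM table_name
    EXCEPT
    SELECT n FROM table_name) AS q1
ORDER BY n DESC;
-- look for: Sort Method: quicksort  Memory: ...kB
```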
Increasing work_mem by just a little, as you tried, won't help much and can even slow things down. Setting it high globally can actually hurt, especially with concurrent access: multiple sessions may starve one another for resources, and allocating more memory for one purpose takes it away from another if the total is limited. The best setup depends on the complete situation.
To avoid side effects, set it high enough only locally in your session, and only temporarily, for the query:
SET work_mem = '500MB';
Reset it to your default afterwards:
RESET work_mem;
Or use SET LOCAL to set it just for the duration of the current transaction to begin with.
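SET LOCAL lasts only until the end of the enclosing transaction, so a sketch of that pattern could be:

```sql
BEGIN;
SET LOCAL work_mem = '500MB';  -- applies only inside this transaction

SELECT n
FROM (
    SELECT n + 1 AS n FROM table_name
    EXCEPT
    SELECT n FROM table_name) AS q1
ORDER BY n DESC;

COMMIT;  -- work_mem reverts to its previous value automatically
```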