Collecting large statistical sets with pg_stat_statements?


Question

According to Postgres pg_stat_statements documentation:

The module requires additional shared memory proportional to pg_stat_statements.max. Note that this memory is consumed whenever the module is loaded, even if pg_stat_statements.track is set to none.
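As a sketch, enabling the module and raising the entry limit would look like this in postgresql.conf (the values here are illustrative, not recommendations; a server restart is required because the shared memory is allocated when the module is loaded):

```
# postgresql.conf -- illustrative values, not recommendations
shared_preload_libraries = 'pg_stat_statements'

# maximum number of statements tracked (default 5000);
# shared memory for all of them is allocated at load time
pg_stat_statements.max = 100000

# 'top' for top-level statements, 'all' to include nested ones;
# 'none' still consumes the shared memory once the module is loaded
pg_stat_statements.track = top
```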

And also:

The representative query texts are kept in an external disk file, and do not consume shared memory. Therefore, even very lengthy query texts can be stored successfully. However, if many long query texts are accumulated, the external file might grow unmanageably large.

From these it is unclear what the actual memory cost of a high pg_stat_statements.max would be, say at 100k or 500k (the default is 5k). Is it safe to set the value that high, and what could be the negative repercussions of such high levels? Would aggregating statistics into an external database via logstash/fluentd be a preferable approach above a certain size?

Answer

1.

From what I have read, the extension hashes each query and keeps the entry in the database, saving the text to the filesystem. So the next concern is more relevant than overloading shared memory:

if many long query texts are accumulated, the external file might grow unmanageably large

The hash of the text is so much smaller than the text itself that I think you should not worry about the extension's memory consumption for long queries. Especially knowing that the extension uses the query analyzer (which runs for EVERY query ANYWAY):

the queryid hash value is computed on the post-parse-analysis representation of the queries

Setting pg_stat_statements.max 10 times higher should take about 10 times more shared memory, I believe. The growth should be linear. The documentation does not say so explicitly, but logically it should be.
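Assuming, purely for illustration, a figure on the order of 2 kB of shared memory per tracked entry (this per-entry size is an assumption, not a number from the documentation), the linear scaling can be sanity-checked with a quick query; only the linearity matters, not the constant:

```sql
-- ~2 kB per entry is an ASSUMED figure, used only to show linear growth
SELECT n AS pg_stat_statements_max,
       pg_size_pretty(n * 2048::bigint) AS est_shared_memory
FROM (VALUES (5000), (100000), (500000)) AS t(n);
```

Under that assumption, 500k entries land around 1 GB, an order of magnitude worth checking against your shared_buffers budget before raising the setting.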

There is no definitive answer on whether it is safe to set it to a particular value, because that depends on your other configuration values and the hardware you have. But since the growth should be linear, consider this answer: "if you set it to 5K and query runtime has grown almost nothing, then setting it to 50K will prolong it by almost nothing times ten". BTW, my question: who is going to dig through 50,000 slow statements? :)

2.

This extension already performs a pre-aggregation over "dis-valued" statements (queries with their constants stripped out). You can select from it straight on the DB, so moving the data to another database and selecting it there only gives you the benefit of unloading the original DB while loading another. In other words, you save 50MB for a query on the original but spend the same on the other. Does it make sense? For me, yes. This is what I do myself. But I also save execution plans for statements (which is not part of the pg_stat_statements extension). I believe it depends on what you have and how you use it. There is definitely no need for it merely because of the number of queries. Again, unless the external file grows so large that the extension cannot manage it:

As a recovery method if that happens, pg_stat_statements may choose to discard the query texts, whereupon all existing entries in the pg_stat_statements view will show null query fields
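Selecting straight on the DB, as mentioned above, can be sketched like this (the column names assume PostgreSQL 13+, where total_time was split into total_exec_time and total_plan_time; on older versions use total_time and mean_time instead):

```sql
-- Top 20 statements by cumulative execution time (PostgreSQL 13+ columns)
SELECT queryid,
       calls,
       round(total_exec_time::numeric, 2) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       rows,
       left(query, 80) AS query_start
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
```

This is the kind of aggregate that an external logstash/fluentd pipeline would otherwise have to rebuild from raw statement logs.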
