数据库缓存大小策略 [英] DB cache size strategies

查看:97
本文介绍了数据库缓存大小策略的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



这里有一些关于pgsql性能调优的当前概念,因为它需要b / b
与专用linux或
hpux服务器。我特别关注shared_buffers设置。我

邀请任何纠正或确认反馈。我意识到在这个范围之外还有很多其他非常重要的性能因素。


pgsql性能调优的一个关键方面是调整内存

消费设置(shared_buffers,sort_mem等)大到足以在共享内存中保存尽可能多的数据库而不会导致页面

swap-ins 。据我所知,页面换出和交换空间使用都是正常的,但是很好,但很多页面交换都很糟糕。换句话说,对于

绝对最快的性能,我们希望数据库缓冲区缓存命中率

尽可能接近100%。


现在,我们如何在任何一点准确测量此缓存命中率?

嗯,这就是我目前所知道的给定群集的最佳衡量指标

数据库:


SELECT SUM(pg_stat_get_db_blocks_hit(d.oid))

/ SUM(pg_stat_get_db_blocks_fetched(d.oid))AS hit_rate

来自pg_database d;


我们如何确定何时我们在这方面具有次优的pgsql配置参数

?我的理解是:无论在什么程度上命中率低于100%,我们的共享缓冲区都不足以保存我们要访问的数据。


我们如何解决这个问题?我们想增加shared_buffers设置

和/或某些组合sort_mem,wal_buffers,vacuum_mem,重置​​我们的

统计数据,并恢复监控。我假设我们中的哪一个增加了

最大的好处取决于缓存溢出的来源(例程

查询,排序,抽真空),但是shared_buffers是最有影响力的

在实践中。同样,我们希望在不导致页面交换的情况下执行此操作。

当你看到掉期时,你已经走得太远了。如果我们经历掉期交易和

低于100%的缓存命中率,那么我们需要更多的内存或更少的RAM消耗。


最终结果是页面交换(sar -W)和缓存命中率(查询

以上)是用作警告铃声的两个非常关键的措施。 />

很高兴听到你的想法。


Ed

------------- --------------(广播结束)---------------------------

提示2:您可以使用取消注册命令立即取消所有列表

(将注册YourEmailAddressHere发送到 ma ******* @ postgresql.org

解决方案

星期五2004年1月30日2:33,Ed L.写道:


pgsql性能调优的一个关键方面是调整内存消耗设置(shared_buffers,sort_mem等)大足以容纳
尽可能多的数据库在共享内存中尽可能不会导致页面交换。我知道页面交换和交换空间的使用都是正常的,但很好,但很多页面交换都很糟糕)。换句话说,对于绝对最快的性能,我们希望数据库缓冲区的缓存命中率尽可能接近100%。




我也很好奇DB共享缓冲区缓存与

linux / hpux内核缓存的关系。特别是,如果在
pgsql中检索的块在内核的缓存中而不在数据库缓存中,从而强制执行

read()系统调用,与来自

缓存的块检索相比,可以预期性能有多大的数量差异?b $ b?我认为他们的区别仅在于
微秒的顺序。 Linux内核磁盘缓存通常是数据库缓存中的重复数据吗?对于linux,内核缓存是否仅使用

" available"内存直到程序需要它,而pgsql数据库缓存

内存在启动时保证?


TIA。


Ed


---------------------------(播出结束)--- ------------------------

提示8:解释分析是你的朋友


2004年1月30日星期五下午03:19:56 -0700,Ed L.写道:

2004年1月30日星期五2:33,Ed L.写道:


pgsql性能调优的一个关键方面是调整内存消耗设置(shared_buffers,sort_mem等),大小足以容纳
数据库在共享内存中尽可能不会导致页面交换。我知道页面交换和交换空间的使用都是正常的,但很好,但很多页面交换都很糟糕)。换句话说,对于绝对最快的性能,我们希望数据库缓冲区的缓存命中率尽可能接近100%。
我也很好奇DB共享缓冲区缓存与
linux / hpux内核缓存的关系。特别是,如果在pgsql中检索的块在内核的缓存中但不在数据库缓存中,从而强制执行read()系统调用,那么性能的数量差异是什么?当与来自
缓存的块检索进行比较时,人们会期待吗?我认为他们的区别仅在于微秒的顺序。 Linux内核磁盘缓存通常与数据库缓存中的内容重复吗?对于linux,内核缓存是否仅使用
available?内存直到程序需要它,而pgsql数据库缓存内存在启动时得到保证?




每当数据库需要一个不在内存中的块时它会被加载并在操作系统缓存和数据库缓存中结束

。与从磁盘上加载

相比,从数据库缓存和操作系统缓存中获取

块之间的区别非常非常小。


因此,您的数据库缓存越大,您可以缓存的整体数据就越少,因为复制是重复的。 OTOH,postgres只能直接使用数据库缓存而且只能间接访问操作系统缓存。


我试图瞄准的是保持数据库缓存在working

set附近的某处,即系统中所有系统表和常用

表使用的空间量。它由群集中的所有postgres服务器共享,因此

记住这一点。操作系统缓存也可以由其他进程共享,所以

你的结果将取决于发生了什么其他事情。


最后,至少在Linux上, postgres用于缓存的共享内存也可以换掉
,这样就很难确定正确的值。


希望这会有所帮助,


-

Martijn van Oosterhout< kl ***** @ svana.org> http://svana.org/kleptog/ (......已经离开了在任何地方,即使是开发商也几乎无法使用,大约20%完成。很好。最后的80%通常需要20%的时间,对吧?) - Anthony Towns,debian- devel-announce




-----开始PGP签名-----

版本:GnuPG v1.0.6(GNU / Linux )

评论:有关信息,请参阅 http://www.gnupg.org


iD8DBQFAGv + nY5Twig3Ge + YRAlobAKDUzlZXSZt + tDn5c3ewkS uj965lMACeKUT1

AV / p / m2Pvf8QUF3MTH / opXk =

= XaoN

----- END PGP SIGNATURE -----


但请记住,所有这些策略都需要重新评估

7.5及其ARC。

Jan


Martijn van Oosterhout写道:

On 2004年1月30日星期五03:19:56 PM -0700, Ed L.写道:

2004年1月30日星期五2:33,Ed L.写道:

>
> pgsql性能调优的一个关键方面是调整内存
>消费设置(shared_buffers,sort_mem等)足以容纳
>尽可能多的共享内存中的数据库,而不是导致
>页面交换。我理解页面交换和交换空间
>用法很正常,但很多页面交换很糟糕。在其他
>为了绝对最快的性能,我们想要一个数据库缓冲区缓存
>命中率尽可能接近100%。



我也很好奇DB共享缓冲区缓存与
linux / hpux内核缓存的关系。特别是,如果在pgsql中检索的块在内核的缓存中但不在数据库缓存中,从而强制执行read()系统调用,那么性能的数量差异是什么?当与来自
缓存的块检索进行比较时,人们会期待吗?我认为他们的区别仅在于微秒的顺序。 Linux内核磁盘缓存通常与数据库缓存中的内容重复吗?对于linux,内核缓存是否仅使用
available?内存直到程序需要它,而pgsql数据库缓存内存在启动时得到保证?



每当数据库需要一个不在内存中的块时它会被加载并结束
在OS缓存和DB缓存中。与加载磁盘相比,从数据库缓存和操作系统缓存中获取数据块的区别非常非常小。

因此,数据库缓存越大,数据库缓存越大由于重复,您可以缓存的整体数据。 OTOH,postgres只能直接使用数据库缓存,并且只能间接访问操作系统缓存。

我试图瞄准的是将数据库缓存保留在工作<附近 br /> set",即系统中所有系统表和常用表格使用的空间量。它由群集中的所有postgres服务器共享,因此请记住这一点。 OS缓存也可以由其他进程共享,因此结果将取决于其他事情的发生。

最后,至少在Linux上,共享内存postgres用于缓存可以
也会被换掉,因此很难确定正确的价值。

希望这会有所帮助,



-

#=============================================== == =====================#

#因为错误而得到宽恕比对正确更容易。 #

#让我们打破这个规则 - 请原谅我。 #

#======================================== ========= = Ja******@Yahoo.com

---------------------------(播出结束)-------------- -------------

提示2:您可以使用取消注册命令立即取消所有列表

(发送取消注册YourEmailAddressHere ; ma*******@postgresql.org



Here''s some of my current notions on pgsql performance tuning strictly as it
relates to pgsql tuning parameters in the context of a dedicated linux or
hpux server. I''m particularly focusing on the shared_buffers setting. I
invite any corrective or confirming feedback. I realize there are many
other hugely important performance factors outside this scope.

One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing page
swap-ins. I understand that both page swap-outs and swap space usage is
normal and OK, but lots of page swap-ins are bad). In other words, for
absolute fastest performance, we want a database buffer cache hit rate of
as close to 100% as possible.

Now, how do we accurately measure this cache hit rate at any one point?
Well, here''s what I currently know as the best measure for a given cluster
of databases:

SELECT SUM(pg_stat_get_db_blocks_hit(d.oid))
/ SUM(pg_stat_get_db_blocks_fetched(d.oid)) AS hit_rate
FROM pg_database d;

How do we identify when we have sub-optimal pgsql configuration parameters
in this regard? My understanding: to whatever extent the hit rate is
below 100%, our shared buffers are not large enough to hold the data we''re
accessing.

How do we fix the problem? We want to increase the shared_buffers setting
and/or maybe some combination sort_mem, wal_buffers, vacuum_mem, reset our
stats, and resume monitoring. I assume which of these we increase for
maximum benefit depends on the sources of the cache overflow (routine
queries, sorting, vacuuming), but that shared_buffers is the most impactful
in practice. Again, we want to do this without causing page swap-ins.
When you see swap-ins, you''ve gone too far. If we experience swap-ins and
less than 100% cache hit rate by any significant amount, then we need more
RAM or less RAM consumption.

The end result is that page swap-ins (sar -W) and cache hit rate (query
above) are two very key measures to use as warning bells.

Glad to hear your thoughts.

Ed
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

解决方案

On Friday January 30 2004 2:33, Ed L. wrote:


One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing
page swap-ins. I understand that both page swap-outs and swap space
usage is normal and OK, but lots of page swap-ins are bad). In other
words, for absolute fastest performance, we want a database buffer cache
hit rate of as close to 100% as possible.



I''m also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel''s cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they''d differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?

TIA.

Ed

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:

On Friday January 30 2004 2:33, Ed L. wrote:


One key aspect of pgsql performance tuning is to adjust the memory
consumption settings (shared_buffers, sort_mem, etc) large enough to hold
as much of the database in shared memory as possible while not causing
page swap-ins. I understand that both page swap-outs and swap space
usage is normal and OK, but lots of page swap-ins are bad). In other
words, for absolute fastest performance, we want a database buffer cache
hit rate of as close to 100% as possible.
I''m also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel''s cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they''d differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?



Whenever the database needs a block not in memory it get loaded and ends up
in both the OS cache and the DB cache. The difference between getting a
block out of DB cache and OS cache is very, very small compared to loading
off disk.

Hence, the larger your DB cache, the less overall data you can cache due to
the duplication. OTOH, postgres can only directly use the DB cache and can
only access the OS cache indirectly.

What I try to aim for is the keep the DB cache somewhere near the "working
set", ie the amount of space used by all the system tables and commonly used
tables in the system. It''s shared by all postgres servers in a cluster so
keep that in mind. Also the OS cache may be shared by other processes, so
your results will depend on what other things are happening.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct value.

Hope this helps,

--
Martijn van Oosterhout <kl*****@svana.org> http://svana.org/kleptog/ (... have gone from d-i being barely usable even by its developers
anywhere, to being about 20% done. Sweet. And the last 80% usually takes
20% of the time, too, right?) -- Anthony Towns, debian-devel-announce



-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQFAGv+nY5Twig3Ge+YRAlobAKDUzlZXSZt+tDn5c3ewkS uj965lMACeKUT1
AV/p/m2Pvf8QUF3MTH/opXk=
=XaoN
-----END PGP SIGNATURE-----


But keep in mind that all these strategies need to be reevaluated with
7.5 and its ARC.
Jan

Martijn van Oosterhout wrote:

On Fri, Jan 30, 2004 at 03:19:56PM -0700, Ed L. wrote:

On Friday January 30 2004 2:33, Ed L. wrote:

>
> One key aspect of pgsql performance tuning is to adjust the memory
> consumption settings (shared_buffers, sort_mem, etc) large enough to hold
> as much of the database in shared memory as possible while not causing
> page swap-ins. I understand that both page swap-outs and swap space
> usage is normal and OK, but lots of page swap-ins are bad). In other
> words, for absolute fastest performance, we want a database buffer cache
> hit rate of as close to 100% as possible.



I''m also curious about the relationship of DB shared buffer cache to the
linux/hpux kernel caches. In particular, if the block being retrieved in
pgsql was in the kernel''s cache but not in the DB cache, thereby forcing a
read() system call, what kind of quantitative difference in performance
would one expect when comparing with block retrievals coming from the
cache? I would think they''d differ only by something on the order of
microseconds. Is the linux kernel disk cache normally a duplicate of much
of what is in the DB cache? For linux, does the kernel cache use only
"available" memory until a program needs it, while the pgsql DB cache
memory is guaranteed at startup?



Whenever the database needs a block not in memory it get loaded and ends up
in both the OS cache and the DB cache. The difference between getting a
block out of DB cache and OS cache is very, very small compared to loading
off disk.

Hence, the larger your DB cache, the less overall data you can cache due to
the duplication. OTOH, postgres can only directly use the DB cache and can
only access the OS cache indirectly.

What I try to aim for is the keep the DB cache somewhere near the "working
set", ie the amount of space used by all the system tables and commonly used
tables in the system. It''s shared by all postgres servers in a cluster so
keep that in mind. Also the OS cache may be shared by other processes, so
your results will depend on what other things are happening.

Finally, at least on Linux, the shared memory postgres uses for cache can
also be swapped out making it very difficult to determine the correct value.

Hope this helps,


--
#================================================= =====================#
# It''s easier to get forgiveness for being wrong than for being right. #
# Let''s break this rule - forgive me. #
#================================================= = Ja******@Yahoo.com #
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)


这篇关于数据库缓存大小策略的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆