How to find the source of increasing memory usage of a Twisted server?

Question

I have an audio broadcasting server written in Python and based on Twisted. It works fine, but its memory usage grows as more users connect to the server and never goes down when those users go offline, as you can see in the following figure:

The memory usage curve goes up where the listeners/radios curve goes up, but after the listeners/radios peak the memory usage stays high and never goes down.

I have tried the following methods to solve this problem:

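  1. Upgraded Twisted from 8.2 to 9.0.
  2. Used guppy to dump the heap, but it did not help at all.
  3. Switched from the select reactor to the epoll reactor; same problem.
  4. Used objgraph to draw a graph of the relationships between objects, but I could not see anything useful in it.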

Here is the environment I use to run my Twisted server:

  • Python: 2.5.4 r254:67916
  • OS: Linux version 2.6.18-164.9.1.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
  • Twisted: 9.0 (under virtualenv)

The heap dump from guppy:

Partition of a set of 116280 objects. Total size = 9552004 bytes.
 Index  Count   %     Size   % Cumulative  % Type
  0  52874  45  4505404  47   4505404  47 str
  1   5927   5  2231096  23   6736500  71 dict
  2  29215  25  1099676  12   7836176  82 tuple
  3   7503   6   510204   5   8346380  87 types.CodeType
  4   7625   7   427000   4   8773380  92 function
  5    672   1   292968   3   9066348  95 type
  6    866   1    82176   1   9148524  96 list
  7   1796   2    71840   1   9220364  97 __builtin__.weakref
  8   1140   1    41040   0   9261404  97 __builtin__.wrapper_descriptor
  9   2603   2    31236   0   9292640  97 int

As you can see, the total size of 9552004 bytes is 9.1 MB. Compare that with the RSS reported by the ps command:

[xxxx@webxx ~]$ ps -u xxxx -o pid,rss,cmd
  PID   RSS CMD
22123 67492 twistd -y broadcast.tac -r epoll

The RSS of my server is 65.9 MB, which means 56.8 MB of its memory usage is invisible to guppy. What is that memory?
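The arithmetic behind those figures can be checked with a short sketch. It only uses the two numbers quoted in the question and assumes that ps reports RSS in units of 1024 bytes, as it does on Linux; the variable names are made up for illustration, and the code targets a current Python 3 rather than the Python 2.5 from the question.

```python
# Numbers copied from the question: guppy's total and the RSS column from ps.
HEAPY_TOTAL_BYTES = 9552004   # "Total size" in the guppy dump
PS_RSS_KIB = 67492            # RSS column from ps, assumed to be in KiB


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024.0 * 1024.0)


visible_mib = to_mib(HEAPY_TOTAL_BYTES)      # what guppy can account for
rss_mib = to_mib(PS_RSS_KIB * 1024)          # what the OS says is resident
invisible_mib = rss_mib - visible_mib        # the gap the question asks about

print("visible to guppy: %.1f MB" % visible_mib)
print("RSS:              %.1f MB" % rss_mib)
print("unaccounted:      %.1f MB" % invisible_mib)
```

The gap is large because guppy only counts live Python objects, while RSS also includes the interpreter itself, shared libraries, and any heap pages the allocator is still holding.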

My questions are:

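  1. How can I find the source of the increasing memory usage?
  2. Which part of the memory usage is visible to guppy?
  3. What is the invisible memory usage?
  4. Is this caused by memory leaks in some modules written in C? If so, how can I trace and fix them?
  5. How does Python manage memory? With a memory pool? I think this might be caused by the chunks of audio data, so that there are small leaks in the memory chunks owned by the Python interpreter.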

Update 2010/1/20: Interestingly, I downloaded the latest log file, and it shows that from a certain moment on the memory stops increasing. I think the allocated memory space may simply be big enough by then. Here is the latest figure.

Update 2010/1/21: Another figure here. Hmm... it rose a little.

Oops... still going up.

Recommended answer

My guess is that this is a memory fragmentation problem. The original design kept the audio data chunks in a list, and the chunks were not of a fixed size. Once the total size of the buffering list exceeded the buffer limit, some chunks were popped from the top of the list to keep the size under the limit. The list might look like this:

  1. chunk size 511
  2. chunk size 1040
  3. chunk size 386
  4. chunk size 1350
  5. ...
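The buffering scheme described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the server's actual code: the class name, the 3000-byte limit, and the use of `collections.deque` in place of a plain list are all assumptions, and it is written for a current Python 3.

```python
from collections import deque


class ChunkListBuffer:
    """Hypothetical sketch of the original design: a queue of variable-sized
    chunks, trimmed from the front once the total exceeds a limit."""

    def __init__(self, limit):
        self.limit = limit       # maximum total number of bytes to keep
        self.chunks = deque()    # every chunk is a separately allocated object
        self.total = 0           # current total number of buffered bytes

    def append(self, chunk):
        self.chunks.append(chunk)
        self.total += len(chunk)
        # Drop the oldest chunks until the buffer is back under the limit.
        while self.total > self.limit and self.chunks:
            old = self.chunks.popleft()
            self.total -= len(old)


buf = ChunkListBuffer(limit=3000)   # assumed limit, for illustration only
for size in (511, 1040, 386, 1350):
    buf.append(b"\x00" * size)

print([len(c) for c in buf.chunks], buf.total)
```

Each `append` allocates a new object of an arbitrary size and each trim frees one of a different size, which is the allocation pattern the answer blames for the fragmentation.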

Most of these chunks are larger than 256 bytes, and Python uses malloc rather than its own memory pool for objects larger than 256 bytes. Now imagine what happens as those chunks are repeatedly allocated and released. For example, when the 1350-byte chunk is released, it may leave a free 1350-byte hole in the heap. If a request for 988 bytes comes next and malloc places it in that hole, a new, smaller hole of 362 bytes is left behind. After running for a long time, the heap contains more and more of these small holes; in other words, it becomes heavily fragmented. A virtual memory page is usually 4 KB, and the fragments are scattered over a wide range of the heap, so almost every page still holds some live data; the OS cannot swap those pages out, and the allocator cannot give them back. Thus, the RSS always stays high.
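The 1350 / 988 / 362 example can be reproduced with a toy first-fit allocator. This is only a model of the idea and not how the real malloc works (real allocators use size bins, coalescing, and other strategies); the function name and the `(offset, length)` representation of holes are assumptions made for this sketch.

```python
def first_fit(holes, size):
    """Carve `size` bytes out of the first hole that is large enough.

    `holes` is a list of (offset, length) free ranges and is modified in
    place. Returns the offset of the allocation, or None if no single hole
    can hold the request.
    """
    for i, (offset, length) in enumerate(holes):
        if length >= size:
            if length == size:
                del holes[i]                               # hole used up exactly
            else:
                holes[i] = (offset + size, length - size)  # leftover hole
            return offset
    return None


# One hole left behind by freeing the 1350-byte chunk from the example.
holes = [(0, 1350)]

addr = first_fit(holes, 988)
print(addr, holes)             # the 988-byte request leaves a 362-byte hole

# 362 bytes are free, but a 511-byte chunk cannot use them.
print(first_fit(holes, 511))
```

The leftover hole is too small for most of the audio chunks, so it tends to stay unused while new requests extend the heap instead.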

After I changed the design of my server's audio chunk management module, it uses very little memory. You can see the figure and compare it with the previous one.

The new design uses a bytearray rather than a list of strings. It is one big chunk of memory, so there is no more fragmentation.
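The answer does not show the new module, so the following is only one possible shape for it: a ring buffer over a single preallocated `bytearray`. The class name, the method names, and the overwrite-oldest policy are assumptions, and the capacity is assumed to be greater than zero. Note that `bytearray` requires Python 2.6 or later; the sketch itself is written for Python 3.

```python
class RingBuffer:
    """Fixed-capacity byte buffer backed by one preallocated bytearray."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray(capacity)   # allocated once, then reused
        self.start = 0                    # index of the oldest valid byte
        self.size = 0                     # number of valid bytes

    def write(self, chunk):
        """Append `chunk`, overwriting the oldest bytes when full."""
        chunk = chunk[-self.capacity:]    # only the newest bytes can fit
        n = len(chunk)
        end = (self.start + self.size) % self.capacity
        first = min(n, self.capacity - end)
        # Equal-length slice assignments never resize the bytearray.
        self.data[end:end + first] = chunk[:first]
        self.data[0:n - first] = chunk[first:]      # wrapped part, if any
        overflow = max(0, self.size + n - self.capacity)
        self.start = (self.start + overflow) % self.capacity
        self.size = min(self.capacity, self.size + n)

    def read_all(self):
        """Return the buffered bytes, oldest first."""
        end = self.start + self.size
        if end <= self.capacity:
            return bytes(self.data[self.start:end])
        return bytes(self.data[self.start:] + self.data[:end - self.capacity])


rb = RingBuffer(8)
rb.write(b"abcde")
rb.write(b"fghij")        # pushes out the two oldest bytes, "a" and "b"
print(rb.read_all())
```

Whatever the chunk sizes are, the long-lived storage is the single block allocated in `__init__`, so the buffer itself no longer leaves holes of varying sizes in the heap.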
