How to find the source of increasing memory usage of a Twisted server?

Question

I have an audio broadcasting server written in Python and based on Twisted. It works fine, but its memory usage increases when more users connect to the server and never goes back down when those users go offline, as the following figure shows:

The memory usage curve rises wherever the listeners/radios curve rises, but after the listeners/radios peak has passed, memory usage stays high and never goes down.

I have tried the following to solve this problem:

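  1. Upgraded Twisted from 8.2 to 9.0.
  2. Used guppy to dump the heap with heapy, but it did not help at all.
  3. Switched from the select reactor to the epoll reactor; same problem.
  4. Used objgraph to draw the object relationship graph, but I could not see anything useful in it.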

Here is the environment in which I run my Twisted server:

  • Python: 2.5.4 r254:67916
  • OS: Linux version 2.6.18-164.9.1.el5PAE (mockbuild@builder16.centos.org) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))
  • Twisted: 9.0 (under virtualenv)

The dump from guppy:

Partition of a set of 116280 objects. Total size = 9552004 bytes.
 Index  Count   %     Size   % Cumulative  % Type
  0  52874  45  4505404  47   4505404  47 str
  1   5927   5  2231096  23   6736500  71 dict
  2  29215  25  1099676  12   7836176  82 tuple
  3   7503   6   510204   5   8346380  87 types.CodeType
  4   7625   7   427000   4   8773380  92 function
  5    672   1   292968   3   9066348  95 type
  6    866   1    82176   1   9148524  96 list
  7   1796   2    71840   1   9220364  97 __builtin__.weakref
  8   1140   1    41040   0   9261404  97 __builtin__.wrapper_descriptor
  9   2603   2    31236   0   9292640  97 int

As you can see, the total size of 9552004 bytes is about 9.1 MB. Compare that with the RSS reported by the ps command:

[xxxx@webxx ~]$ ps -u xxxx -o pid,rss,cmd
  PID   RSS CMD
22123 67492 twistd -y broadcast.tac -r epoll

The RSS of my server is 65.9 MB, which means about 56.8 MB of its memory usage is invisible to heapy. What is that memory?
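
The arithmetic behind those figures can be checked with a short sketch. It assumes only what is shown above: the heapy total is in bytes, and the RSS column printed by ps is in units of 1024 bytes. The variable names are illustrative.

```python
# Figures copied from the heapy dump and the ps output above.
heapy_total_bytes = 9552004   # "Total size" reported by heapy, in bytes
rss_kib = 67492               # RSS column from ps, in units of 1024 bytes

heapy_mb = heapy_total_bytes / (1024.0 * 1024.0)
rss_mb = rss_kib / 1024.0
unaccounted_mb = rss_mb - heapy_mb   # memory that heapy does not report

print("heapy: %.1f MB, rss: %.1f MB, unaccounted: %.1f MB"
      % (heapy_mb, rss_mb, unaccounted_mb))
# prints: heapy: 9.1 MB, rss: 65.9 MB, unaccounted: 56.8 MB
```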

My questions are:

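  1. How can I find the source of the increasing memory usage?
  2. What is the memory usage that is visible to guppy?
  3. What is the memory usage that is invisible to it?
  4. Is this caused by memory leaks in modules written in C? If so, how can I trace and fix them?
  5. How does Python manage memory? With a memory pool? I think the problem might be caused by the chunks of audio data, in which case the memory owned by the Python interpreter itself is hardly leaking at all.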

Update 2010/1/20: Interestingly, I downloaded the latest log file, and it shows that from a certain point on the memory usage stopped increasing. I think the allocated memory space may have become big enough. Here is the latest figure.

Update 2010/1/21: Here is another figure. Hmm... it has risen a little.

Oops... it is still going up.

Recommended answer

My guess is that this is a memory fragmentation problem. The original design kept the audio data chunks in a list, and the chunks were not of a fixed size. Once the total size of the buffering list exceeded the buffer limit, some chunks were popped from the front of the list to bring the size back down. The list might look like this:

  1. Chunk of size 511
  2. Chunk of size 1040
  3. Chunk of size 386
  4. Chunk of size 1350
  5. ...
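
The buffering scheme described above could be sketched roughly as follows. This is a hypothetical reconstruction, not the server's actual code: the class name `ChunkListBuffer`, its `append` method, and the 3000-byte limit are assumptions made for illustration.

```python
class ChunkListBuffer(object):
    """Hypothetical sketch: variable-size audio chunks kept in a list."""

    def __init__(self, limit):
        self.limit = limit   # maximum total number of bytes to keep
        self.chunks = []     # every item is a separately allocated object
        self.total = 0       # current total size of all buffered chunks

    def append(self, chunk):
        self.chunks.append(chunk)
        self.total += len(chunk)
        # Drop the oldest chunks until the buffer fits the limit again.
        while self.total > self.limit and self.chunks:
            oldest = self.chunks.pop(0)
            self.total -= len(oldest)


buf = ChunkListBuffer(limit=3000)   # the limit is an arbitrary example value
for size in (511, 1040, 386, 1350):
    buf.append(b"\x00" * size)

print([len(c) for c in buf.chunks])   # [1040, 386, 1350]
```

Every `append` allocates one object of arbitrary size, and every trim frees one, which is the allocation pattern the rest of this answer blames for the fragmentation.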

Most of them are larger than 256 bytes, and Python allocates objects larger than 256 bytes with malloc rather than from its own memory pool. Now imagine what happens as those chunks are allocated and released. For example, when the 1350-byte chunk is released, it may leave a free 1350-byte hole in the heap. If a 988-byte request then arrives and malloc places it in that hole, a new, smaller free hole of 362 bytes is left behind. After the server has run for a long time, the heap contains more and more of these small holes; in other words, it is heavily fragmented. A virtual memory page is usually 4 KB, and the fragments are spread over a wide range of the heap, so almost every page still holds some live data and the OS cannot swap those pages out. As a result, the RSS stays high.
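
The hole-splitting step in that example can be reproduced with a toy first-fit model. This is only an illustration of the idea: real malloc implementations use more elaborate strategies, and the function name `first_fit_alloc` is made up for this sketch.

```python
def first_fit_alloc(holes, size):
    """Carve `size` bytes out of the first free hole that is big enough.

    `holes` is a list of free-hole sizes in bytes and is modified in place.
    Returns True if the request was served from an existing hole.
    """
    for i, hole in enumerate(holes):
        if hole >= size:
            leftover = hole - size
            if leftover:
                holes[i] = leftover   # the hole shrinks to a smaller fragment
            else:
                del holes[i]          # exact fit: the hole disappears
            return True
    return False  # nothing fits; a real allocator would have to grow the heap


holes = [1350]                        # a 1350-byte chunk was just released
served = first_fit_alloc(holes, 988)  # a 988-byte request arrives
print(holes)                          # [362]
```

In this model a later 511-byte request no longer fits the 362-byte leftover, so the heap has to grow while the small hole stays unused.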

After I changed the design of my server's audio chunk management module, it now uses very little memory. You can see the figure and compare it with the previous one.

The new design uses a bytearray rather than a list of strings. It is one big block of memory, so there is no more fragmentation.
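
The answer does not show the new module, so the following is only a minimal sketch of one way to keep the audio data in a single preallocated bytearray. The ring-buffer layout and the names `RingBuffer`, `write` and `snapshot` are assumptions, and the code is written for Python 3.

```python
class RingBuffer(object):
    """Hypothetical fixed-capacity byte buffer backed by one bytearray."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.data = bytearray(capacity)  # allocated once and then reused
        self.start = 0                   # index of the oldest buffered byte
        self.size = 0                    # number of valid bytes in the buffer

    def write(self, chunk):
        """Append chunk, overwriting the oldest bytes when the buffer is full."""
        chunk = bytes(chunk)[-self.capacity:]   # keep only what can fit
        end = (self.start + self.size) % self.capacity
        first = min(len(chunk), self.capacity - end)
        self.data[end:end + first] = chunk[:first]        # part up to the end
        self.data[:len(chunk) - first] = chunk[first:]    # wrapped-around part
        overflow = self.size + len(chunk) - self.capacity
        if overflow > 0:
            self.start = (self.start + overflow) % self.capacity
            self.size = self.capacity
        else:
            self.size += len(chunk)

    def snapshot(self):
        """Return the buffered bytes, oldest first."""
        end = self.start + self.size
        if end <= self.capacity:
            return bytes(self.data[self.start:end])
        return bytes(self.data[self.start:] + self.data[:end - self.capacity])


buf = RingBuffer(8)
buf.write(b"abcdef")
buf.write(b"ghij")          # the oldest bytes "ab" are overwritten
print(buf.snapshot())
```

Because `write` only copies into slices of the same length, the bytearray never changes size, so buffering new audio does not create or free variable-size blocks on the heap.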
