How to deal with Ruby 2.1.2 memory leaks?

Question

I have a worker process that spawns up to 50 threads and performs asynchronous operations, most of which are HTTP calls. When the process starts, it uses about 35MB of memory and quickly grows to 250MB. From that point it keeps growing, and the problem is that the growth never stops, even though the rate slows over time. After several days, the process outgrows the available memory and crashes.

I did a lot of analysis and profiling and can't find what is wrong. Process memory grows constantly even though the heap size is pretty much constant. I've collected the GC.stat output into a spreadsheet that you can access here:

https://docs.google.com/spreadsheets/d/17TohDNXQ_MXM31CeAmR2ptHFYfvOeF3dB6WCBkBS_Bc/edit?usp=sharing

Even though the process memory appears to have finally stabilized at 415MB, it will continue to grow over the next couple of days until it reaches the 512MB limit and crashes.

I've also tried tracking objects with ObjectSpace, but the total memory of the tracked objects never exceeds 70-80MB, which aligns perfectly with the GC reports. Where the remaining 300MB+ (and growing) is spent, I have no clue.

How do I deal with these kinds of problems? Are there any tools that could give me clearer insight into how the memory is being consumed?

Update: gems and OS

I'm using the following gems:

gem "require_all", "~> 1.3"
gem "thread", "~> 0.1"
gem "equalizer", "~> 0.0.9"
gem "digest-murmurhash", "~> 0.3", require: "digest/murmurhash"
gem "google-api-client", "~> 0.7", require: "google/api_client"
gem "aws-sdk", "~> 1.44"

The application is deployed on Heroku, though the memory leak is also noticeable when running it locally on Mac OS X 10.9.4.

Update: leaks

I've upgraded stringbuffer and analyzed everything as @mtm suggested. The leaks tool now identifies no memory leaks and the Ruby heap size does not increase over time, yet the process memory is still growing. Originally I thought it had stopped growing at some point, but several hours later it outgrew the limit and the process crashed.

Answer

From your GC logs, the issue does not appear to be a Ruby object reference leak, as the heap_live_slot value is not increasing significantly. That suggests the problem is one of:

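  1. Data being stored outside the heap (Strings, Arrays, etc.)
  2. A leak in a gem that uses native code
  3. The Ruby interpreter itself is leaking (least likely)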

It's interesting to note that the problem shows up on both OS X and Heroku (Ubuntu Linux).
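
Because the key signal here is the gap between what the operating system reports and what Ruby's GC accounts for, it can help to log both side by side. The following is a minimal sketch, not part of the original answer: the helper names `rss_kb` and `memory_sample` are hypothetical, the "untracked" figure is only a rough difference (it also includes the interpreter, loaded libraries and allocator overhead), and RSS is read from /proc on Linux or from `ps` elsewhere.

```ruby
require 'objspace'

# Resident set size of a process in kilobytes.
# Assumption: /proc is used where available (Linux), otherwise `ps` (OS X).
def rss_kb(pid = Process.pid)
  status = "/proc/#{pid}/status"
  if File.readable?(status)
    File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Integer(pid)}`.to_i
  end
end

# One sample comparing the OS view of memory with the GC's view.
def memory_sample
  stat = GC.stat
  # Ruby 2.1 names this key :heap_live_slot; later versions use :heap_live_slots.
  live_slots = stat[:heap_live_slots] || stat[:heap_live_slot]
  tracked_kb = ObjectSpace.memsize_of_all / 1024
  rss = rss_kb
  {
    rss_kb: rss,
    ruby_tracked_kb: tracked_kb,
    untracked_kb: rss - tracked_kb, # rough: everything the GC does not report
    live_slots: live_slots
  }
end

puts memory_sample.inspect
```

If `untracked_kb` keeps climbing while `ruby_tracked_kb` and `live_slots` stay flat, that points away from a plain object reference leak.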

Ruby 2.1 garbage collection uses the reported "heap" only for objects that contain a tiny amount of data. When the data contained in an object exceeds a certain limit, the data is moved and allocated in an area outside the heap. You can get the overall size of each data type with ObjectSpace:

require 'objspace'
ObjectSpace.count_objects_size({})

Collecting this along with your GC stats might indicate where memory is being allocated outside the heap. If you find a particular type, say :T_ARRAY, increasing a lot more than the others, you might need to look for an array you are forever appending to.
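
One way to spot the type that is growing is to take two snapshots of `ObjectSpace.count_objects_size` and compare them. This is a small sketch under that assumption; `type_sizes` and `growth` are hypothetical helper names, and the `hoard` array is only a stand-in for whatever your application accumulates.

```ruby
require 'objspace'

# Bytes per internal type (:T_STRING, :T_ARRAY, ...) plus a :TOTAL entry.
def type_sizes
  ObjectSpace.count_objects_size
end

# Per-type difference between two snapshots, largest growth first.
def growth(before, after)
  after.map { |type, bytes| [type, bytes - before.fetch(type, 0)] }
       .reject { |_, delta| delta.zero? }
       .sort_by { |_, delta| -delta }
end

before = type_sizes
hoard = Array.new(50_000) { |i| "item #{i}" * 10 } # stand-in for a leak
after = type_sizes

puts "kept #{hoard.size} strings alive"
growth(before, after).first(5).each do |type, delta|
  puts format('%-10s %+d bytes', type, delta)
end
```

In a long-running worker the two snapshots would be taken minutes or hours apart rather than around a deliberate allocation.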

You can use pry-byebug to drop into a console and trawl through specific objects, or even look at all objects from the root:

ObjectSpace.memsize_of(some_object)
ObjectSpace.reachable_objects_from_root

There's a bit more detail on one of the Ruby developers' blogs and also in this SO answer. I like their JRuby/VisualVM profiling idea.
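
When exploring from a console, a per-class total of `ObjectSpace.memsize_of` can narrow down where to look. The sketch below is illustrative only: `memsize_by_class` is a hypothetical helper, it walks every live object (so it is slow on a large heap), and it assumes no object in the process has redefined or removed `#class`.

```ruby
require 'objspace'

# Sum ObjectSpace.memsize_of per class over all live objects and
# return the `limit` largest classes as [class, bytes] pairs.
def memsize_by_class(limit = 10)
  totals = Hash.new(0)
  # Restricting to Object skips bare BasicObject instances, which lack #class.
  ObjectSpace.each_object(Object) do |obj|
    totals[obj.class] += ObjectSpace.memsize_of(obj)
  end
  totals.sort_by { |_, bytes| -bytes }.first(limit)
end

memsize_by_class(5).each do |klass, bytes|
  puts format('%-30s %d bytes', klass, bytes)
end
```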

To check for leaks in gems that use native code, use bundle to install your gems into a local path:

bundle install --path=.gems/

Then you can find those that include native code:

find .gems/ -name "*.c"

Which gives you (in my order of suspiciousness):

  • digest-stringbuffer-0.0.2
  • digest-murmurhash-0.3.0
  • nokogiri-1.6.3.1
  • json-1.8.1
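
As an alternative to searching for C sources on disk, RubyGems can report which loaded gems declare a native extension in their gemspec. This is a sketch rather than the method used above: `native_extension_gems` is a hypothetical helper, and it only sees gems that have already been activated, so it is best run inside the application after Bundler has loaded everything.

```ruby
require 'rubygems'

# Names and versions of gems whose gemspec lists at least one extension
# (for example an extconf.rb). Defaults to the currently loaded gems.
def native_extension_gems(specs = Gem.loaded_specs.values)
  specs.select { |spec| spec.extensions.any? }
       .map { |spec| "#{spec.name}-#{spec.version}" }
       .sort
end

puts native_extension_gems
```

The list may differ slightly from the `find` output, since a gem can ship C files without declaring an extension, and gems with precompiled binaries may not declare one either.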

OS X has a useful dev tool called leaks that can tell you if it finds unreferenced memory in a running process. It's not very useful for identifying where the memory comes from in Ruby, but it will help to identify when a leak is occurring.

The first thing to test is digest-stringbuffer. Take the example from the README and set it up with gc_tracer:

require "digest/stringbuffer"
require "gc_tracer"
GC::Tracer.start_logging "gclog.txt"
module Digest
  class Prime31 < StringBuffer
    def initialize
      @prime = 31
    end

    def finish
      result = 0
      buffer.unpack("C*").each do |c|
        result += (c * @prime)
      end
      [result & 0xffffffff].pack("N")
    end
  end
end

And make it run a lot:

while true do
  a=[]
  500.times do |i|
    a.push Digest::Prime31.hexdigest( "abc" * (1000 + i) )
  end
  sleep 1
end

Run the example:

bundle exec ruby ./stringbuffertest.rb &
pid=$!

Monitor the resident and virtual memory sizes of the ruby process, and the count of leaks identified:

while true; do
  ps=$(ps -o rss,vsz -p "$pid" | tail -n +2)
  leaks=$(leaks $pid | grep -c Leak)
  echo "$(date) m[$ps] l[$leaks]"
  sleep 15
done

And it looks like we've found something already:

Tue 26 Aug 2014 18:22:36 BST m[104776  2538288] l[8229]
Tue 26 Aug 2014 18:22:51 BST m[110524  2547504] l[13657]
Tue 26 Aug 2014 18:23:07 BST m[113716  2547504] l[19656]
Tue 26 Aug 2014 18:23:22 BST m[113924  2547504] l[25454]
Tue 26 Aug 2014 18:23:38 BST m[113988  2547504] l[30722]

Resident memory is increasing and the leaks tool is finding more and more unreferenced memory. Confirm that the GC heap size and object count still look stable:

tail -f gclog.txt | awk '{ print $1, $3, $4, $7, $13 }'
1581853040832 468 183 39171 3247996
1581859846164 468 183 33190 3247996
1584677954974 469 183 39088 3254580
1584678531598 469 183 39088 3254580
1584687986226 469 183 33824 3254580
1587512759786 470 183 39643 3261058
1587513449256 470 183 39643 3261058
1587521726010 470 183 34470 3261058

Then report the issue.

It appears to my very untrained C eye that they allocate both a pointer and a buffer but only clean up the buffer.

Looking at digest-murmurhash, it seems to only provide functions that rely on StringBuffer, so it might be fine once stringbuffer is fixed.

When they have patched it, test again and move on to the next gem. It's probably best to use snippets of code from your own implementation for each gem test rather than a generic example.
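
For those per-gem tests, a small harness that repeats a snippet and records resident memory after each batch keeps the runs comparable. The following is only a sketch: `rss_kb` and `rss_growth` are hypothetical helpers, the block body is a placeholder to replace with a call taken from your own code, and RSS is read from /proc on Linux or from `ps` elsewhere.

```ruby
# Resident set size of this process in kilobytes.
def rss_kb
  status = "/proc/#{Process.pid}/status"
  if File.readable?(status) # Linux
    File.read(status)[/^VmRSS:\s+(\d+)/, 1].to_i
  else # OS X and other systems with ps
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# Run the block in batches, forcing a GC after each batch so that growth
# in RSS is less likely to be ordinary garbage awaiting collection.
# Returns one RSS sample (KB) per batch.
def rss_growth(batches: 10, per_batch: 1_000)
  samples = []
  batches.times do
    per_batch.times { yield }
    GC.start
    samples << rss_kb
  end
  samples
end

# Placeholder workload; swap in the gem call you want to test.
samples = rss_growth(batches: 5, per_batch: 500) { ("abc" * 1_000).hash }
puts "RSS per batch (KB): #{samples.join(', ')}"
puts "net change: #{samples.last - samples.first} KB"
```

A few early batches usually grow while the process warms up, so the trend over many batches matters more than any single difference.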

If the interpreter itself is the suspect, the first step would be to prove the issue on multiple machines under the same MRI to rule out anything local, which you've already done.

Then try the same Ruby version on a different OS, which you've done too.

Try the code on JRuby or Rubinius if possible. Does the same issue occur?

Try the same code on 2.0 or 1.9 if possible and see if the same problem exists.

Try the head development version from GitHub and see if that makes any difference.

If nothing becomes apparent, submit a bug to Ruby detailing the issue and everything you have eliminated. Wait for a dev to help out and provide whatever they need. They will most likely want to reproduce the issue, so set up the most concise, minimal example of it that you can. Doing that will often help you identify what the issue is anyway.
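
A bug report is easier to act on when it states exactly which interpreter, platform and gems were involved. As a rough sketch, the hypothetical helper `environment_report` below gathers those details from constants and RubyGems data that are available in any Ruby process; what else the Ruby developers ask for is up to them.

```ruby
require 'rbconfig'

# Interpreter, platform and loaded gem versions for inclusion in a report.
def environment_report
  {
    ruby: RUBY_DESCRIPTION,
    engine: defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby',
    platform: RUBY_PLATFORM,
    host_os: RbConfig::CONFIG['host_os'],
    gems: Gem.loaded_specs.values.map { |s| "#{s.name} #{s.version}" }.sort
  }
end

environment_report.each do |key, value|
  puts "#{key}: #{Array(value).join(', ')}"
end
```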
