Finding the cause of a memory leak in Ruby

Question

I've discovered a memory leak in my Rails code. That is, I've found which code leaks, but not why it leaks. I've reduced it to a test case that doesn't require Rails:

require 'csspool'
require 'ruby-mass'

def report
    puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
    Mass.print
end

report

# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))

ObjectSpace.garbage_collect
sleep 1

report

ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.

Output:

Memory 9264KB

==================================================
 Objects within [] namespace
==================================================
  String: 7261
  RubyVM::InstructionSequence: 1151
  Array: 562
  Class: 313
  Regexp: 181
  Proc: 111
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 29
  Gem::Requirement: 25
  RubyVM::Env: 11
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

Memory 258860KB

==================================================
 Objects within [] namespace
==================================================
  String: 7456
  RubyVM::InstructionSequence: 1151
  Array: 564
  Class: 313
  Regexp: 181
  Proc: 113
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 30
  Gem::Requirement: 25
  RubyVM::Env: 13
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

You can see the memory going way up, from 9264KB to 258860KB. Some of the counters go up slightly, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references, like so:

Mass.index.each do |k,v|
    v.each do |id|
        refs = Mass.references(Mass[id])
        puts refs if !refs.empty?
    end
end

But again, this doesn't give me anything related to CSSPool, just gem info and such.

I've also tried outputting GC.stat before and after the parse:

puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat

Results:

{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}

As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.
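The two GC.stat snapshots above are easier to compare as a diff. The following is a minimal sketch of that idea, not part of the original test case: `gc_stat_delta` is a hypothetical helper name, and it deliberately avoids hard-coding key names because the keys returned by GC.stat differ between Ruby versions.

```ruby
# Returns the numeric GC.stat counters that changed while the block ran.
# gc_stat_delta is a hypothetical helper, not part of Ruby or ruby-mass.
def gc_stat_delta
  before = GC.stat
  yield
  GC.start
  after = GC.stat
  after.each_with_object({}) do |(key, value), delta|
    next unless value.is_a?(Numeric) && before[key].is_a?(Numeric)
    change = value - before[key]
    delta[key] = change unless change.zero?
  end
end

# Allocate many short-lived objects, then print every counter that moved.
delta = gc_stat_delta { 100_000.times.map(&:to_s) }
delta.each { |key, change| puts format('%-40s %+d', key, change) }
```

In the question's scenario the block would wrap the CSSPool parse call instead of the placeholder allocation used here.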

I've also read about C-level memory leaks, and since CSSPool uses Racc, which uses C code, I think this is a possibility. I've run my code through Valgrind:

valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt

Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.

Versions used:

  • Ruby 2.0.0-p247 (this is what my Rails app runs)
  • Ruby 1.9.3-p392-ref (for testing with ruby-mass)
  • ruby-mass 0.1.3
  • CSSPool 4.0.0 from here
  • CentOS 6.4 and Ubuntu 13.10

Recommended answer

It looks like you are entering The Lost World here. I don't think the problem is with the C bindings in racc either.

Ruby memory management is both elegant and cumbersome. It stores objects (named RVALUEs) in so-called heaps, each approximately 16KB in size. At a low level, an RVALUE is a C struct containing a union of the different standard Ruby object representations.

So, heaps store RVALUE objects, whose size is no more than 40 bytes. For objects such as String, Array, Hash, etc., this means that small objects can fit in the heap, but as soon as they reach a threshold, extra memory outside of the Ruby heaps is allocated.
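The split between slot-sized objects and off-heap payloads can be observed with `ObjectSpace.memsize_of` from the standard `objspace` extension. This is a rough sketch only: the exact numbers, and whether the slot itself is included in the reported size, depend on the Ruby version and platform.

```ruby
require 'objspace'

small = 'a'            # short enough to be stored inside the object slot
large = ' ' * 100_000  # payload is allocated outside the slot

small_size = ObjectSpace.memsize_of(small)
large_size = ObjectSpace.memsize_of(large)

puts "small string: #{small_size} bytes"
puts "large string: #{large_size} bytes"
```

The large string's reported size tracks its payload, which is the memory that lives outside the Ruby heaps.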

This extra memory is flexible; it is freed as soon as the object is garbage collected. That's why a test case with one big string shows memory going up and then back down:

def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
          .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil 
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB

But the heaps themselves (see GC.stat[:heap_length]) are not released back to the OS once acquired. Look, I'll make a humdrum change to the test case:

- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)

And, voilà:

# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB

The memory is no longer released back to the OS, because each element of the array I introduced fits within the RVALUE size and is stored in the Ruby heap.

If you examine the output of GC.stat after the GC has run, you'll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for reuse.
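Slot usage can also be watched directly with `ObjectSpace.count_objects`, whose `:TOTAL` and `:FREE` entries count all slots and unused slots. The sketch below is an illustration under assumptions, not the original test case: `slot_snapshot` is a hypothetical helper, and how many slots Ruby keeps after the collection varies by version.

```ruby
# Snapshot of slot usage across the Ruby heaps.
# slot_snapshot is a hypothetical helper introduced for this sketch.
def slot_snapshot
  counts = ObjectSpace.count_objects
  { total: counts[:TOTAL], free: counts[:FREE] }
end

before = slot_snapshot
strings = 1_000_000.times.map(&:to_s) # many small objects, one slot each
held = slot_snapshot
strings = nil                         # drop the only reference
GC.start
after = slot_snapshot

puts "before:    #{before}"
puts "allocated: #{held}"
puts "after GC:  #{after}"
```

Comparing the total and free counts in the last line with the first shows how many slots Ruby retained for reuse rather than handing back.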

To sum up: I don't think the C code leaks. I think the problem lies in the base64 representation of a huge image in your CSS. I have no clue what's happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.

Hope it helps.
