Finding the cause of a memory leak in Ruby

Problem description

I've discovered a memory leak in my Rails code - that is to say, I've found what code leaks but not why it leaks. I've reduced it down to a test case that doesn't require Rails:

require 'csspool'
require 'ruby-mass'

def report
    puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
    Mass.print
end

report

# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))

ObjectSpace.garbage_collect
sleep 1

report

ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.

This outputs:

Memory 9264KB

==================================================
 Objects within [] namespace
==================================================
  String: 7261
  RubyVM::InstructionSequence: 1151
  Array: 562
  Class: 313
  Regexp: 181
  Proc: 111
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 29
  Gem::Requirement: 25
  RubyVM::Env: 11
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

Memory 258860KB

==================================================
 Objects within [] namespace
==================================================
  String: 7456
  RubyVM::InstructionSequence: 1151
  Array: 564
  Class: 313
  Regexp: 181
  Proc: 113
  Encoding: 99
  Gem::StubSpecification: 66
  Gem::StubSpecification::StubLine: 60
  Gem::Version: 60
  Module: 31
  Hash: 30
  Gem::Requirement: 25
  RubyVM::Env: 13
  Gem::Specification: 8
  Float: 7
  Gem::Dependency: 7
  Range: 4
  Bignum: 3
  IO: 3
  Mutex: 3
  Time: 3
  Object: 2
  ARGF.class: 1
  Binding: 1
  Complex: 1
  Data: 1
  Gem::PathSupport: 1
  IOError: 1
  MatchData: 1
  Monitor: 1
  NoMemoryError: 1
  Process::Status: 1
  Random: 1
  RubyVM: 1
  SystemStackError: 1
  Thread: 1
  ThreadGroup: 1
  fatal: 1
==================================================

You can see the memory going way up. Some of the counters go up, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references like so:

Mass.index.each do |k,v|
    v.each do |id|
        refs = Mass.references(Mass[id])
        puts refs if !refs.empty?
    end
end

But again, this doesn't give me anything related to CSSPool, just gem info and such.

I've also tried outputting "GC.stat"...

puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat

Result:

{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}

As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.

I've also read about C-level memory leaks, and since CSSPool uses Racc which uses C code, I think this is a possibility. I've run my code through Valgrind:

valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt

Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.

Versions used:

  • Ruby 2.0.0-p247 (this is what my Rails app runs)
  • Ruby 1.9.3-p392-ref (for testing with ruby-mass)
  • ruby-mass 0.1.3
  • CSSPool 4.0.0 from here
  • CentOS 6.4 and Ubuntu 13.10

Solution

It looks like you are entering The Lost World here. I don’t think the problem is with the C bindings in racc either.

Ruby memory management is both elegant and cumbersome. It stores objects (as RVALUEs) in so-called heaps of approximately 16KB each. At a low level, an RVALUE is a C struct containing a union of the different standard Ruby object representations.

So the heaps store RVALUE objects, each of which is no more than 40 bytes in size. For objects such as String, Array, Hash and so on, this means that small objects fit in the heap, but as soon as they reach a threshold, extra memory is allocated outside the Ruby heaps.

This extra memory is flexible; it is freed as soon as the object is GC’ed. That’s why a test case with one big string shows the memory going up and then back down:
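The slot-versus-external-buffer split can be observed roughly with `ObjectSpace.memsize_of` from the standard `objspace` library. This is only a sketch: the reported numbers are MRI-specific hints that vary between Ruby versions, and the variable names are illustrative.

```ruby
require 'objspace'

# A short string: small enough to live entirely inside its object slot on MRI.
small = 'a' * 10

# A very long string: its character buffer is allocated outside the Ruby heap.
large = 'a' * 10_000_000

small_size = ObjectSpace.memsize_of(small)
large_size = ObjectSpace.memsize_of(large)

puts "small string: #{small_size} bytes"
puts "large string: #{large_size} bytes"
```

The large string should report roughly its buffer size, while the small one reports at most the size of a slot (some older Rubies report even less for embedded objects).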

def report
  puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
          .strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil 
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB

But the heaps themselves (see GC.stat[:heap_length]) are not released back to the OS once acquired. Look, I’ll make a humdrum change to that test case:

- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)

And, voilà:

# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB

The memory is no longer released back to the OS, because each element of the array I introduced fits within the RVALUE size and is stored in the Ruby heap.

If you examine the output of GC.stat after the GC has run, you’ll find that the GC.stat[:heap_used] value has decreased as expected. Ruby now has a lot of empty heaps, ready for reuse.

To sum up: I don’t think the C code leaks. I think the problem lies in the base64 representation of a huge image in your CSS. I have no clue what’s happening inside the parser, but it looks like the huge string forces the Ruby heap count to increase.
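One way to watch this is to snapshot a few GC.stat counters around the allocation. The sketch below assumes MRI; the helper names `heap_snapshot` and `snapshot_while_holding_strings` are made up for illustration, and the key fallbacks are there because newer Rubies renamed the counters (for example `:heap_live_slots` instead of `:heap_live_num`).

```ruby
# Read heap counters under whichever key names this Ruby provides.
def heap_snapshot
  stat = GC.stat
  {
    pages: stat[:heap_allocated_pages] || stat[:heap_used],
    live:  stat[:heap_live_slots]      || stat[:heap_live_num],
    free:  stat[:heap_free_slots]      || stat[:heap_free_num]
  }
end

# Allocate many small strings inside a method, take a snapshot while they
# are still referenced, then drop the references before returning.
def snapshot_while_holding_strings
  strings = 1_000_000.times.map(&:to_s)
  snapshot = heap_snapshot
  strings.clear
  snapshot
end

before = heap_snapshot
during = snapshot_while_holding_strings
GC.start
after = heap_snapshot

puts "before: #{before}"
puts "during: #{during}"
puts "after:  #{after}"
```

The live count should drop sharply after the collection while the free count grows; whether the page count shrinks as well depends on the Ruby version.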

Hope it helps.
