在Ruby中查找内存泄漏的原因 [英] Finding the cause of a memory leak in Ruby
问题描述
我在Rails代码中发现了内存泄漏-也就是说,我发现了什么代码泄漏,但没有发现为什么泄漏.我将其简化为不需要Rails的测试用例:
require 'csspool'
require 'ruby-mass'
def report
puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
Mass.print
end
report
# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
report
ruby-mass 应该可以让我看到内存中的所有对象. CSSPool 是基于 1.5MB CSS文件.
这将输出:
Memory 9264KB
==================================================
Objects within [] namespace
==================================================
String: 7261
RubyVM::InstructionSequence: 1151
Array: 562
Class: 313
Regexp: 181
Proc: 111
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 29
Gem::Requirement: 25
RubyVM::Env: 11
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
Memory 258860KB
==================================================
Objects within [] namespace
==================================================
String: 7456
RubyVM::InstructionSequence: 1151
Array: 564
Class: 313
Regexp: 181
Proc: 113
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 30
Gem::Requirement: 25
RubyVM::Env: 13
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
您可以看到内存在不断增加.一些计数器增加了,但是没有特定于CSSPool的对象.我使用ruby-mass的索引"方法来检查具有如下引用的对象:
Mass.index.each do |k,v|
v.each do |id|
refs = Mass.references(Mass[id])
puts refs if !refs.empty?
end
end
但是,再次,这并没有给我任何与CSSPool相关的信息,只是宝石信息等等.
我也尝试输出"GC.stat" ...
puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat
结果:
{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}
据我了解,如果未引用对象并且发生垃圾回收,则应从内存中清除该对象.但这似乎不是这里发生的事情.
我还阅读了有关C级内存泄漏的信息,并且由于CSSPool使用使用C代码的Racc,所以我认为这是有可能的.我已经通过Valgrind运行我的代码:
valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt
结果显示在此处.我不确定这是否确认了C级泄漏,因为我还读到Ruby使用Valgrind无法理解的内存来做事.
使用的版本:
- Ruby 2.0.0-p247(这是我的Rails应用程序运行的)
- Ruby 1.9.3-p392-ref(用于使用ruby-mass进行测试)
- 红宝石质量0.1.3
- 来自此处 的CSSPool 4.0.0
- CentOS 6.4和Ubuntu 13.10
您似乎在此处输入失落的世界.我也不认为问题出在racc
中的c绑定.
Ruby内存管理既优雅又麻烦.它将对象(名为RVALUE
s)存储在大小约为16KB的所谓的堆"中.在较低的层次上,RVALUE
是一个c结构,包含不同标准红宝石对象表示形式的union
.
因此,堆存储RVALUE
对象,其大小不超过40个字节.对于String
,Array
,Hash
等对象,这意味着小对象可以放入堆中,但是一旦它们达到阈值,就会在Ruby堆之外分配额外的内存. /p>
此额外的内存非常灵活; 这就是为什么使用big_string
的测试用例会显示内存上下行为的原因:
def report
puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
.strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB
但是一旦获得,堆本身(请参见GC[:heap_length]
)就不会释放回到操作系统.看,我将对您的测试用例进行单调修改:
- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)
然后,贴吧:
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB
内存不再释放回OS,因为我介绍的数组的每个元素 suits RVALUE
size和都存储在Ruby堆中. /p>
如果在运行GC后检查GC.stat
的输出,则会发现GC[:heap_used]
的值已按预期降低. Ruby现在有很多可用的空堆.
总结:我不认为c
代码会泄漏.我认为问题出在css
中巨大图像的base64表示形式之内.我不知道解析器内部发生了什么,但是看起来像是巨大的字符串迫使红宝石堆计数增加.
希望它会有所帮助.
I've discovered a memory leak in my Rails code - that is to say, I've found what code leaks but not why it leaks. I've reduced it down to a test case that doesn't require Rails:
require 'csspool'
require 'ruby-mass'
def report
puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`.strip.split.map(&:to_i)[1].to_s + 'KB'
Mass.print
end
report
# note I do not store the return value here
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
report
ruby-mass supposedly lets me see all the objects in memory. CSSPool is a CSS parser based on racc. /home/jason/big.css is a 1.5MB CSS file.
This outputs:
Memory 9264KB
==================================================
Objects within [] namespace
==================================================
String: 7261
RubyVM::InstructionSequence: 1151
Array: 562
Class: 313
Regexp: 181
Proc: 111
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 29
Gem::Requirement: 25
RubyVM::Env: 11
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
Memory 258860KB
==================================================
Objects within [] namespace
==================================================
String: 7456
RubyVM::InstructionSequence: 1151
Array: 564
Class: 313
Regexp: 181
Proc: 113
Encoding: 99
Gem::StubSpecification: 66
Gem::StubSpecification::StubLine: 60
Gem::Version: 60
Module: 31
Hash: 30
Gem::Requirement: 25
RubyVM::Env: 13
Gem::Specification: 8
Float: 7
Gem::Dependency: 7
Range: 4
Bignum: 3
IO: 3
Mutex: 3
Time: 3
Object: 2
ARGF.class: 1
Binding: 1
Complex: 1
Data: 1
Gem::PathSupport: 1
IOError: 1
MatchData: 1
Monitor: 1
NoMemoryError: 1
Process::Status: 1
Random: 1
RubyVM: 1
SystemStackError: 1
Thread: 1
ThreadGroup: 1
fatal: 1
==================================================
You can see the memory going way up. Some of the counters go up, but no objects specific to CSSPool are present. I used ruby-mass's "index" method to inspect the objects that have references like so:
Mass.index.each do |k,v|
v.each do |id|
refs = Mass.references(Mass[id])
puts refs if !refs.empty?
end
end
But again, this doesn't give me anything related to CSSPool, just gem info and such.
I've also tried outputting "GC.stat"...
puts GC.stat
CSSPool::CSS::Document.parse(File.new('/home/jason/big.css'))
ObjectSpace.garbage_collect
sleep 1
puts GC.stat
Result:
{:count=>4, :heap_used=>126, :heap_length=>138, :heap_increment=>12, :heap_live_num=>50924, :heap_free_num=>24595, :heap_final_num=>0, :total_allocated_object=>86030, :total_freed_object=>35106}
{:count=>16, :heap_used=>6039, :heap_length=>12933, :heap_increment=>3841, :heap_live_num=>13369, :heap_free_num=>2443302, :heap_final_num=>0, :total_allocated_object=>3771675, :total_freed_object=>3758306}
As I understand it, if an object is not referenced and garbage collection happens, then that object should be cleared from memory. But that doesn't seem to be what's happening here.
I've also read about C-level memory leaks, and since CSSPool uses Racc which uses C code, I think this is a possibility. I've run my code through Valgrind:
valgrind --partial-loads-ok=yes --undef-value-errors=no --leak-check=full --fullpath-after= ruby leak.rb 2> valgrind.txt
Results are here. I'm not sure if this confirms a C-level leak, as I've also read that Ruby does things with memory that Valgrind doesn't understand.
Versions used:
- Ruby 2.0.0-p247 (this is what my Rails app runs)
- Ruby 1.9.3-p392-ref (for testing with ruby-mass)
- ruby-mass 0.1.3
- CSSPool 4.0.0 from here
- CentOS 6.4 and Ubuntu 13.10
It looks like you are entering The Lost World here. I don’t think the problem is with c-bindings in racc
either.
Ruby memory management is both elegant and cumbersome. It stores objects (named RVALUE
s) in so-called heaps of size of approx 16KB. On a low level, RVALUE
is a c-struct, containing a union
of different standard ruby object representations.
So, heaps store RVALUE
objects, which size is not more than 40 bytes. For such objects as String
, Array
, Hash
etc. this means that small objects can fit in the heap, but as soon as they reach a threshold, an extra memory outside of the Ruby heaps will be allocated.
This extra memory is flexible; is will be freed as soon as an object became GC’ed. That’s why your testcase with big_string
shows the memory up-down behaviour:
def report
puts 'Memory ' + `ps ax -o pid,rss | grep -E "^[[:space:]]*#{$$}"`
.strip.split.map(&:to_i)[1].to_s + 'KB'
end
report
big_var = " " * 10000000
report
big_var = nil
report
ObjectSpace.garbage_collect
sleep 1
report
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 11788KB
But the heaps (see GC[:heap_length]
) themselves are not released back to OS, once acquired. Look, I’ll make a humdrum change to your testcase:
- big_var = " " * 10000000
+ big_var = 1_000_000.times.map(&:to_s)
And, voilá:
# ⇒ Memory 11788KB
# ⇒ Memory 65188KB
# ⇒ Memory 65188KB
# ⇒ Memory 57448KB
The memory is not released back to OS anymore, because each element of the array I introduced suits the RVALUE
size and is stored in the ruby heap.
If you’ll examine the output of GC.stat
after the GC was run, you’ll find that GC[:heap_used]
value is decreased as expected. Ruby now has a lot of empty heaps, ready.
The summing up: I don’t think, the c
code leaks. I think the problem is within base64 representation of huge image in your css
. I have no clue, what’s happening inside parser, but it looks like the huge string forces the ruby heap count to increase.
Hope it helps.
这篇关于在Ruby中查找内存泄漏的原因的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!