Debugging a memory leak that doesn't show on heap profiling

Problem description

I'm working on a Haskell daemon that receives and processes JSON requests. While the operations of the daemon are complex, the main structure is intentionally kept simple: its internal state is just an IORef with a data structure, and all threads perform atomic operations on this IORef. Then there are a few threads that, upon a trigger, take the value and do something with it.
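The structure described above might look roughly like the following minimal sketch. The State type, recordRequest, and the request keys are all hypothetical names invented for illustration; the real daemon's state and handlers are not shown in the question.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical shared state: a request counter per key.
type State = Map.Map String Int

-- Update the shared state atomically. The primed variant forces the new
-- state, so no chain of unevaluated updates accumulates in the IORef.
recordRequest :: IORef State -> String -> IO ()
recordRequest ref key =
  atomicModifyIORef' ref (\m -> (Map.insertWith (+) key 1 m, ()))

main :: IO ()
main = do
  ref  <- newIORef Map.empty
  done <- newEmptyMVar
  -- One lightweight thread per (simulated) request.
  forM_ ["a", "b", "a"] $ \k ->
    forkIO (recordRequest ref k >> putMVar done ())
  -- Wait for all three workers before reading the state.
  replicateM_ 3 (takeMVar done)
  readIORef ref >>= print  -- prints: fromList [("a",2),("b",1)]
```

A leak inside a structure like this would live on the Haskell heap and should therefore be visible to the heap profiler, which is what makes the observations below surprising.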

The problem is that the daemon is leaking memory and I can't find out why. It's certainly related to the requests: when the daemon is getting several requests per second, it leaks something like 1MB/s (as reported by the Linux tools). The memory consumption steadily increases. With no requests, the memory consumption remains constant.

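What puzzles me is that none of this shows in GHC profiling. Either I'm missing something in the profiling parameters, or the memory is consumed by something else:

Run with +RTS -hc -xt -p:

[heap profile graph, broken down by cost-centre stack]

Run with +RTS -hr -xt -p:

[heap profile graph, broken down by retainer set]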

During this test run, the daemon ends up consuming over 1GB. So the profiling data are clearly orders of magnitude below the memory actually consumed. (I understand that the RTS, the GC and the profiling itself add to the real memory consumption, but this difference is too big, and doesn't correspond to the ever-increasing consumption.)

I already tried to rnf all the state data of the daemon inside the IORef, as well as the parsed JSON requests (to avoid parts of JSON strings being retained somewhere), but without much success.

Any ideas or suggestions are welcome.

Update: The daemon is running without -threaded, so there are no OS-level threads.

The GC statistics are much closer to the heap profiling than to the numbers reported by Linux:

    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
[...]
  5476616     44504   2505736  0.00  0.00   23.21  410.03    0    0  (Gen:  0)
 35499296     41624   2603032  0.00  0.00   23.26  410.25    0    0  (Gen:  0)
 51841800     46848   2701592  0.00  0.00   23.32  410.49    0    0  (Gen:  0)
 31259144     36416   2612088  0.00  0.00   23.40  410.61    0    0  (Gen:  0)
 53433632     51976   2742664  0.00  0.00   23.49  412.05    0    0  (Gen:  0)
 48142768     50928   2784744  0.00  0.00   23.54  412.49    0    0  (Gen:  0)
[...]
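One way to make this discrepancy visible from inside the process is to print the kernel's figure next to the RTS's own figures. The following is only a sketch under several assumptions: it reads /proc/self/status (Linux only), parseVmRSS is a hypothetical helper written for this example, and GHC.Stats only reports data when the program runs with +RTS -T (which in turn assumes it was built to accept that option, for instance with -rtsopts).

```haskell
import Data.Char (isDigit)
import Data.List (isPrefixOf)
import GHC.Stats (GCDetails (..), RTSStats (..), getRTSStats, getRTSStatsEnabled)

-- Hypothetical helper: extract the VmRSS value (in kB) from the text of
-- /proc/<pid>/status, e.g. from a line like "VmRSS:     123456 kB".
parseVmRSS :: String -> Maybe Integer
parseVmRSS contents =
  case [l | l <- lines contents, "VmRSS:" `isPrefixOf` l] of
    (l : _) -> case filter (all isDigit) (words l) of
      (n : _) -> Just (read n)
      []      -> Nothing
    [] -> Nothing

main :: IO ()
main = do
  -- Resident set size as the kernel sees it (Linux only).
  status <- readFile "/proc/self/status"
  putStrLn ("RSS (kernel view): "
            ++ maybe "unavailable" (\kb -> show kb ++ " kB") (parseVmRSS status))
  -- Heap figures as the GHC runtime sees them; they describe the most
  -- recent GC, so they are zero until a collection has happened.
  enabled <- getRTSStatsEnabled
  if not enabled
    then putStrLn "RTS statistics are disabled; rerun with +RTS -T"
    else do
      stats <- getRTSStats
      let details = gc stats
      putStrLn ("Live heap data (RTS view): "
                ++ show (gcdetails_live_bytes details `div` 1024) ++ " kB")
      putStrLn ("Memory held by the RTS: "
                ++ show (gcdetails_mem_in_use_bytes details `div` 1024) ++ " kB")
```

Logged periodically in a daemon, an RSS that keeps growing while both RTS figures stay flat would suggest the memory is being allocated outside the Haskell heap, for example by malloc in foreign code, where the heap profiler cannot see it.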


Update 2: I found the origin of the problem: the memory leak is caused by handleToFd (see this issue for the unix library). I just wonder how such a leak (perhaps occurring in a foreign piece of code) could be pinpointed more effectively.

Solution

While I am not familiar with the Haskell daemon itself, to answer your question of how such a leak could be pinpointed more effectively: it might be possible to run the daemon under Valgrind (better if you compile it with debug info):

valgrind --leak-check=yes haskelldaemon

Or, if the leak happens in a shared library, try:

LD_PRELOAD="yourlibrary.so" valgrind your-executable
