Memory profiler for numpy

Question

I have a numpy script that, according to top, is using about 5GB of RAM:

  PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16994 aix    25   0 5813m 5.2g 5.1g S  0.0 22.1  52:19.66 ipython

Is there a memory profiler that would enable me to get some idea about the objects that are taking most of that memory?

I've tried heapy, but guppy.hpy().heap() is giving me this:

Partition of a set of 90956 objects. Total size = 12511160 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0  42464  47  4853112  39   4853112  39 str
     1  22147  24  1928768  15   6781880  54 tuple
     2    287   0  1093352   9   7875232  63 dict of module
     3   5734   6   733952   6   8609184  69 types.CodeType
     4    498   1   713904   6   9323088  75 dict (no owner)
     5   5431   6   651720   5   9974808  80 function
     6    489   1   512856   4  10487664  84 dict of type
     7    489   1   437704   3  10925368  87 type
     8    261   0   281208   2  11206576  90 dict of class
     9   1629   2   130320   1  11336896  91 __builtin__.wrapper_descriptor
<285 more rows. Type e.g. '_.more' to view.>

For some reason, it's only accounting for 12MB of the 5GB (the bulk of the memory is almost certainly used by numpy arrays).

Any suggestions as to what I might be doing wrong with heapy or what other tools I should try (other than those already mentioned in this thread)?

Recommended answer

Numpy (and its library bindings, more on that in a minute) uses C malloc to allocate space, which is why memory used by big numpy allocations doesn't show up when profiling with tools like heapy, and why the garbage collector never cleans it up directly: the buffer is released only when the array object that owns it is freed.
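Since heapy does not see these buffers, one rough way to get an idea of which arrays are large is to ask the objects themselves. The sketch below is not part of the original answer: the helper name largest_buffers is hypothetical, and it assumes the arrays of interest are reachable by name in some namespace dict (for example globals() in an ipython session). It relies on the integer nbytes attribute that numpy.ndarray exposes (memoryview exposes one too).

```python
def largest_buffers(namespace, top=10):
    """Return up to `top` (name, nbytes) pairs, largest first.

    Hypothetical helper: it inspects every object in `namespace` and keeps
    those with an integer `nbytes` attribute, as numpy.ndarray has.
    Views that share one buffer are each counted, so the sum of the
    reported sizes can overstate real memory use.
    """
    sizes = []
    for name, obj in list(namespace.items()):
        nbytes = getattr(obj, "nbytes", None)
        if isinstance(nbytes, int) and not isinstance(nbytes, bool):
            sizes.append((name, nbytes))
    # Sort by size so the biggest buffers come first.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]
```

Calling largest_buffers(globals()) at the prompt would list only arrays bound to top-level names; arrays held inside lists, dicts, or other objects are not found by this sketch, and neither is memory leaked inside a C library.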

The usual suspects for big leaks are actually scipy or numpy library bindings rather than the Python code itself. I got burned badly last year by the default scipy.linalg interface to umfpack, which leaked memory at a rate of about 10MB per call. You might want to try something like valgrind to profile the code; it can often give some hints about where to look for leaks.
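Before reaching for valgrind, a cheap first check for a per-call leak like the one described is to call the suspect routine in a loop and watch the process's peak resident memory. This is only a sketch under stated assumptions: the helper name peak_rss_growth is hypothetical, it uses the standard-library resource module (Unix only), and ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource


def peak_rss_growth(func, calls=20):
    """Call `func` `calls` times and return the growth in peak RSS.

    Hypothetical helper. ru_maxrss is the peak resident set size of the
    process so far, so the result is never negative; units depend on the
    platform (kilobytes on Linux, bytes on macOS).
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    for _ in range(calls):
        func()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return after - before
```

A result that keeps growing roughly in proportion to calls suggests a leak somewhere under func. Because this measures the peak rather than current usage, it is a coarse indicator only: it shows nothing until usage exceeds an earlier peak, and it cannot say where the leak is, which is where valgrind comes in.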
