Interpretation of memory profiling output of `Rprof`

Question

I am trying to use profiling to see which part of my code is responsible for the maximum usage of 3 GB of memory (as reported by the gc() statistic on maximum used memory, see here how). I am running memory profiling like this:

Rprof(line.profiling = TRUE, memory.profiling = TRUE)
graf(...) # ... here I run the profiled code
Rprof(NULL)
summaryRprof(lines = "both", memory = "both")

The output looks like this:

$by.total
                       total.time total.pct mem.total self.time self.pct
"graf"                     299.12     99.69   50814.4      0.02     0.01
#2                         299.12     99.69   50814.4      0.00     0.00
"graf.fit.laplace"         299.06     99.67   50787.2      0.00     0.00
"doTryCatch"               103.42     34.47    4339.2      0.00     0.00
"chol"                     103.42     34.47    4339.2      0.00     0.00
"tryCatch"                 103.42     34.47    4339.2      0.00     0.00
"tryCatchList"             103.42     34.47    4339.2      0.00     0.00
"tryCatchOne"              103.42     34.47    4339.2      0.00     0.00
"chol.default"             101.62     33.87    1087.0    101.62    33.87
graf.fit.laplace.R#46       85.80     28.60    3633.2      0.00     0.00
"backsolve"                 78.82     26.27    1635.2     58.40    19.46

How shall I interpret mem.total? What is it, and what is its unit? I tried to look at the documentation, namely ?Rprof and ?summaryRprof, but it does not seem to be well documented :-/

EDIT: Here they say that Rprof "probes the total memory usage of R at regular time intervals". But that doesn't fit with the 50 GB, which is way beyond what my memory can hold (8 GB physical + 12 GB pagefile now)!

Similarly, as pointed out by R Yoda, ?summaryRprof says that with memory = "both" it means "the change in total memory". But what is it exactly (the total memory or the change in total memory), and how does it fit with the 50 GB number?

EDIT: The same analysis done in profvis: when I hover over the 50812, it shows "Memory allocation (MB)", and when I hover over the black bar close to that vertical line, it shows "Percentage of peak memory allocation and deallocation". Not sure what that means... This is about 50 GB, so perhaps it is the sum of all allocations (??) ... definitely not the peak memory usage.

Recommended answer

?summaryRprof says:

If memory = "both" the same list but with memory consumption in Mb in addition to the timings.

So mem.total is in MB.

With memory = "both" the change in total memory (truncated at zero) is reported [...]

You have 8 GB RAM + 12 GB swap, but mem.total claims you have used 50 GB?

That is because it is the aggregated delta between successive probes. Rprof takes memory usage snapshots at regular time intervals: if a probe is taken while execution is in function f, the memory usage delta relative to the previous probe is added to the mem.total of f.

The memory usage delta could be negative, but I have never seen negative mem.total values, so I am guessing (!) that only positive values are added to mem.total. This would match the "truncated at zero" wording in ?summaryRprof.

This would explain the 50 GB total usage you are seeing: it is not the amount of memory allocated at a single point in time but the aggregated memory delta over the complete execution time.
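The aggregation described above can be illustrated with a small R sketch. The snapshot values and the function names "f" and "g" are invented for illustration, and the logic reflects only my assumption that positive deltas are summed per function; it is not taken from the summaryRprof source:

```r
# Hypothetical memory snapshots (in MB) taken at regular intervals,
# together with the function that was running at each probe.
probes <- data.frame(
  fn  = c("f", "g", "f", "g", "f", "g"),
  mem = c(100, 600, 100, 600, 100, 600)
)

# Delta of each probe relative to the previous one (first probe: no delta).
delta <- c(0, diff(probes$mem))

# Assumption: negative deltas are truncated at zero before aggregation.
positive_delta <- pmax(delta, 0)

# Sum the positive deltas per function, analogous to mem.total.
mem_total <- tapply(positive_delta, probes$fn, sum)

peak_mem <- max(probes$mem)  # highest single snapshot
sum_mem  <- sum(mem_total)   # aggregated growth over the whole run

print(mem_total)
cat("peak:", peak_mem, "MB, aggregated:", sum_mem, "MB\n")
```

In this toy data the aggregated value (1500 MB) is well above the peak (600 MB), which is the same pattern as 50 GB of mem.total against 3 GB of maximum used memory.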

This also explains why gc shows only 3 GB as "max used (Mb)": the memory is allocated and freed many times, so you do not run into memory pressure. But this costs a lot of time (moving so much data through RAM invalidates the caches and is therefore slow) on top of the calculation logic the CPU executes.
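The difference between peak usage and total allocation volume can be seen with gc() alone. The following is a minimal sketch; the loop, the vector size and the variable names are my own choices for illustration:

```r
# Reset the "max used" statistics so that only the loop below is measured.
invisible(gc(reset = TRUE))

n_iter <- 20
for (i in seq_len(n_iter)) {
  x <- numeric(1e6)  # allocate about 7.6 MB (1e6 doubles of 8 bytes each)
  x[1] <- i          # touch the vector so it is really used
  rm(x)              # drop the reference; the gc frees it at some later point
}

g <- gc()
# The "(Mb)" column directly to the right of "max used" holds the peak in MB.
peak_col <- which(colnames(g) == "max used") + 1L
peak_vcells_mb <- g["Vcells", peak_col]

# Total volume allocated by the loop, summed over all iterations.
total_allocated_mb <- n_iter * 1e6 * 8 / 1024^2

cat("peak vector memory (MB):", peak_vcells_mb, "\n")
cat("total allocated in loop (MB):", round(total_allocated_mb, 1), "\n")
```

The loop allocates roughly 150 MB in total, while the reported peak depends on the session and on when the garbage collector happens to run.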

This summary (IMHO) also seems to hide the fact that the garbage collector (gc) starts at non-deterministic points in time to clean up freed memory.

Since the gc starts lazily (non-deterministically), it would IMHO be unfair to attribute the negative memory deltas to whichever single function happens to be probed at that moment.

I would interpret mem.total as mem.total.used.during.runtime, which would possibly be a better label for the column.
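To check this interpretation on your own machine, you could profile a small workload and compare mem.total with the peak reported by gc(). This is only a sketch: churn() is a made-up workload, and the numbers will differ between runs and R versions:

```r
# Hypothetical workload: repeatedly allocates and discards large vectors.
churn <- function(n_iter = 20) {
  total <- 0
  for (i in seq_len(n_iter)) {
    x <- rnorm(1e6)
    total <- total + sum(sort(x))
  }
  total
}

prof_file <- tempfile(fileext = ".out")
invisible(gc(reset = TRUE))  # reset the "max used" statistics

Rprof(prof_file, memory.profiling = TRUE, interval = 0.01)
invisible(churn())
Rprof(NULL)

res <- summaryRprof(prof_file, memory = "both")
print(head(res$by.total))  # mem.total is reported in MB

g <- gc()
# Peak over the run: sum of the "(Mb)" column right of "max used".
peak_mb <- sum(g[, which(colnames(g) == "max used") + 1L])
cat("peak memory according to gc() (MB):", peak_mb, "\n")
```

If the interpretation above is right, mem.total for churn should typically come out larger than the gc() peak, because every iteration adds to the aggregate while the peak stays roughly constant.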

profvis has a more detailed memory usage summary (as you can see in the screenshot in your question): it also aggregates the negative memory usage deltas (freed memory), but the profvis documentation also warns about the shortcomings:

The code panel also shows memory allocation and deallocation. Interpreting this information can be a little tricky, because it does not necessarily reflect memory allocated and deallocated at that line of code. The sampling profiler records information about memory allocations that happen between the previous sample and the current one. This means that the allocation/deallocation values on that line may have actually occurred in a previous line of code.

A more detailed answer would require more research time (which I don't have) to look into the C and R sources and to understand (replicate) the aggregation logic of summaryRprof based on the data files created by Rprof.

Rprof data files (Rprof.out) look like this:

:376447:6176258:30587312:152:1#2 "test" 1#1 "test2"

The first four numbers (separated by colons) mean (see ?summaryRprof):

- R_SmallVallocSize: the vector memory [number of buckets] in small blocks on the R heap
- R_LargeVallocSize: the vector memory [number of buckets] in large blocks (from malloc)
- the memory in nodes on the R heap
- the number of calls to the internal function duplicate in the time interval (used to duplicate vectors, e.g. in the case of copy-on-first-write semantics of function arguments)

The strings are the function call stack.

Only the first two numbers are relevant for calculating the current memory usage (of vectors) in MB:

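# Example values from the probe line above
R_SmallVallocSize <- 376447
R_LargeVallocSize <- 6176258
# Each bucket (Vcell) is 8 bytes in size
TotalBuckets <- R_SmallVallocSize + R_LargeVallocSize
mem.used <- TotalBuckets * 8 / 1024 / 1024
# About 50 MB for the above `Rprof` probe line:
# (376447 + 6176258) * 8 / 1024 / 1024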

For details about the Vcells see ?Memory.
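The field layout described above can be applied to the example probe line directly. This is a minimal parsing sketch with variable names of my own choosing; it assumes the memory fields are always the first four colon-separated values, as in the line shown:

```r
# The probe line quoted above, stored as an R string.
probe <- ':376447:6176258:30587312:152:1#2 "test" 1#1 "test2"'

# Splitting on ":" yields an empty first field because the line starts with ":".
fields <- strsplit(probe, ":", fixed = TRUE)[[1]]

small_vcells <- as.numeric(fields[2])  # R_SmallVallocSize
large_vcells <- as.numeric(fields[3])  # R_LargeVallocSize
node_memory  <- as.numeric(fields[4])  # memory in nodes (not used below)
n_duplicate  <- as.numeric(fields[5])  # calls to duplicate in the interval

# Everything after the fifth colon is the call stack; re-join it in case
# the stack itself contains colons.
call_stack <- paste(fields[-(1:5)], collapse = ":")

# Vector memory in MB, assuming 8 bytes per bucket.
mem_used_mb <- (small_vcells + large_vcells) * 8 / 1024^2

cat("vector memory (MB):", round(mem_used_mb, 2), "\n")
cat("call stack:", call_stack, "\n")
```

Applying the same steps to every probe line of an Rprof.out file would give a time series of vector memory usage, from which deltas between successive probes could be computed.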

BTW: I wanted to try out summaryRprof(memory = "stats", diff = FALSE) to get a current memory summary, but I get an error message with R 3.4.4 (64-bit) on Ubuntu:

Error in tapply(seq_len(1L), list(index = c("1::#File", "\"test2\":1#1",  : 
  arguments must have same length

Can you reproduce this (it looks like "stats" is broken)?
