R data.table Size and Memory Limits


Problem Description



I have a 15.4 GB R data.table object with 29 million records and 135 variables. My system & R info are as follows:

Windows 7 x64 on an x86_64 machine with 16 GB RAM, "R version 3.1.1 (2014-07-10)" on "x86_64-w64-mingw32"

I get the following memory allocation error (see image)

I set my memory limits as follows:

#memory.limit(size=7000000)
#Change memory.limit to 40GB when using ff library
memory.limit(size=40000)
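
Note that memory.limit() interprets its size argument in megabytes, so size = 40000 requests roughly 40 GB of virtual memory; anything beyond the 16 GB of physical RAM is served from the page file. A quick check (Windows only):

memory.limit()              # query the current limit in MB without changing it
memory.limit(size = 40000)  # raise the limit to ~40 GB (backed by the page file beyond 16 GB)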

My questions are the following:

  1. Should I change the memory limit to 7 TB?
  2. Should I break the file into chunks and process it piece by piece? (a sketch follows this list)
  3. Any other suggestions?
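
As an illustration of option 2, a minimal chunked-reading sketch using data.table::fread's skip/nrows arguments might look like the following; the file name bigfile.csv, the chunk size, and the placeholder aggregation are assumptions for illustration, not part of the original question:

library(data.table)

chunk_size <- 1e6    # rows per chunk (illustrative)
total_rows <- 29e6   # approximate row count from the question
out <- list()

for (start in seq(0, total_rows - 1, by = chunk_size)) {
  # skip the header line plus the rows already read in earlier chunks
  chunk <- fread("bigfile.csv", skip = start + 1, nrows = chunk_size, header = FALSE)
  # reduce each chunk to a small summary before keeping it (placeholder: row count)
  out[[length(out) + 1]] <- nrow(chunk)
  rm(chunk); gc()
}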

Solution

Try to profile your code to identify which statements cause the "waste of RAM":

# install.packages("pryr")
library(pryr) # for memory debugging

memory.size(max = TRUE) # print max memory used so far (works only with MS Windows!)
mem_used()
gc(verbose=TRUE) # show internal memory stuff (see help for more)

# start profiling your code
Rprof( pfile <- "rprof.log", memory.profiling=TRUE) # uncomment to profile the memory consumption

# !!! Your code goes here

# Print memory statistics within your code wherever you think it is sensible
memory.size(max = TRUE)
mem_used()
gc(verbose=TRUE)

# stop profiling your code
Rprof(NULL)
summaryRprof(pfile, memory = "both") # show the memory consumption profile

Then evaluate the memory consumption profile...
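
The object returned by summaryRprof can also be inspected programmatically; a minimal sketch (with memory = "both" the by.self/by.total tables include memory figures alongside the timings):

prof <- summaryRprof(pfile, memory = "both")
head(prof$by.total)  # time and memory per call, including callees
head(prof$by.self)   # time and memory attributed to each function itself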

Since your code stops with an "out of memory" exception, you should reduce the input data to an amount that makes your code workable and use this input for memory profiling...
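
One way to do that, assuming the big table is called DT (an illustrative name, not from the original post), is to profile against a random subset of rows:

set.seed(1)
DT_small <- DT[sample(nrow(DT), 1e6)]  # ~1 million of the 29 million rows
# ... run the profiling block above against DT_small instead of DT ...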
