R memory management / cannot allocate vector of size n Mb

Question

I am running into issues trying to use large objects in R. For example:

> memory.limit(4000)
> a = matrix(NA, 1500000, 60)
> a = matrix(NA, 2500000, 60)
> a = matrix(NA, 3500000, 60)
Error: cannot allocate vector of size 801.1 Mb
> a = matrix(NA, 2500000, 60)
Error: cannot allocate vector of size 572.2 Mb # Can't go smaller anymore
> rm(list=ls(all=TRUE))
> a = matrix(NA, 3500000, 60) # Now it works
> b = matrix(NA, 3500000, 60)
Error: cannot allocate vector of size 801.1 Mb # But that is all there is room for

I understand that this is related to the difficulty of obtaining contiguous blocks of memory (from here):

Error messages beginning "cannot allocate vector of size" indicate a failure to obtain memory, either because the size exceeded the address-space limit for a process or, more likely, because the system was unable to provide the memory. Note that on a 32-bit build there may well be enough free memory available, but not a large enough contiguous block of address space into which to map it.
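
The sizes reported in the errors above can be checked by hand. A minimal sketch, assuming the matrices are logical (matrix(NA, ...) creates a logical matrix, which R stores at 4 bytes per cell); the helper name matrix_size_mb is hypothetical and not part of R:

```r
# Hypothetical helper: estimate the memory a matrix needs before allocating it.
# bytes_per_cell is 4 for logical/integer and 8 for double (numeric) matrices.
matrix_size_mb <- function(nrow, ncol, bytes_per_cell = 4) {
  nrow * ncol * bytes_per_cell / 2^20
}

round(matrix_size_mb(3500000, 60), 1)                      # 801.1
round(matrix_size_mb(2500000, 60), 1)                      # 572.2
round(matrix_size_mb(3500000, 60, bytes_per_cell = 8), 1)  # 1602.2
```

Each request must be satisfied by a single contiguous block, so on a 32-bit build an 800 Mb allocation can fail even when the total free memory is larger than that.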

How can I get around this? My main difficulty is that I get to a certain point in my script and R can't allocate 200-300 Mb for an object... I can't really pre-allocate the block because I need the memory for other processing. This happens even when I diligently remove unneeded objects.

Yes, sorry: Windows XP SP3, 4Gb RAM, R 2.12.0:

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Caribbean.1252  LC_CTYPE=English_Caribbean.1252   
[3] LC_MONETARY=English_Caribbean.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Caribbean.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Recommended answer

Consider whether you really need all of this data stored explicitly, or whether the matrix can be sparse. R has good support for sparse matrices (see the Matrix package, for example).
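
As a rough illustration of the sparse option, here is a sketch using sparseMatrix() from the Matrix package. The dimensions match the question, but the three non-zero entries are made up for the example. Note that sparse storage only helps when most cells are zero; it does not help with a matrix that is mostly NA.

```r
library(Matrix)  # recommended package, shipped with standard R installations

# Illustrative data: a 3,500,000 x 60 matrix with only three non-zero cells.
m <- sparseMatrix(
  i = c(1, 1500000, 3500000),   # row indices of the non-zero cells
  j = c(1, 30, 60),             # column indices of the non-zero cells
  x = c(1.5, 2.5, 3.5),         # the non-zero values
  dims = c(3500000, 60)
)

dim(m)                              # same shape as the dense matrix
nnzero(m)                           # number of stored non-zero values
print(object.size(m), units = "Kb") # a few Kb instead of hundreds of Mb
m[1500000, 30]                      # element access works as usual
```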

Keep all other processes and objects in R to a minimum when you need to make objects of this size. Use gc() to release memory that is no longer in use or, better, create only the object you need in a given session.
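
A small sketch of that clean-up pattern, using only base R. The object name big and its size are arbitrary choices for the example:

```r
# Create a moderately large object, then drop it and collect garbage.
big <- matrix(0, 1000, 1000)           # 1e6 doubles, roughly 7.6 Mb
print(object.size(big), units = "Mb")  # check how much one object occupies

rm(big)      # remove the only reference to the object
mem <- gc()  # run the garbage collector; returns a usage summary matrix
mem[, 1:2]   # cells currently used, and the same figure in Mb
```

gc() runs automatically when R needs memory, so an explicit call mainly helps by returning memory to the operating system sooner and by reporting current usage. It cannot defragment the address space.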

If the above cannot help, get a 64-bit machine with as much RAM as you can afford, and install 64-bit R.
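
To confirm which build a session is actually running, the pointer size can be inspected from within R. A minimal check using base R only (the variable name bits is just for illustration):

```r
# 8-byte pointers mean a 64-bit build of R; 4-byte pointers mean 32-bit.
bits <- .Machine$sizeof.pointer * 8
cat("This is a", bits, "bit build of R\n")

R.version$arch  # architecture string, e.g. "i386" on the 32-bit build in the question
```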

If you cannot do that, there are many online services for remote computing.

If you cannot do that, memory-mapping tools such as the ff package (or bigmemory, as Sascha mentions) will help you build a new solution. In my limited experience ff is the more advanced package, but you should read the High Performance Computing topic on CRAN Task Views.
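
A hedged sketch of the file-backed approach with ff, assuming the package has been installed from CRAN (it is not part of base R). The dimensions and values are illustrative, and the code is guarded so that it does nothing when ff is unavailable:

```r
# Assumption: install.packages("ff") has been run beforehand.
if (requireNamespace("ff", quietly = TRUE)) {
  library(ff)

  # A double matrix stored in a temporary file on disk rather than in RAM.
  a <- ff(vmode = "double", dim = c(3500000, 60))

  # Only the chunks being read or written are brought into memory.
  a[1:5, 1] <- c(1, 2, 3, 4, 5)
  first_rows_sum <- sum(a[1:5, 1])

  delete(a)  # remove the backing file when finished
  rm(a)
}
```

The trade-off is speed: every access goes through the file, so algorithms should be written to work on blocks of rows or columns rather than on the whole matrix at once.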
