R Cannot allocate memory though memory seems to be available

Problem description

After running several models I need to run a system() command in my R script to shut down my EC2 instance, but when I get to that point I get:

cannot popen 'ls', probable reason 'Cannot allocate memory'

Note: for this question I even tried ls, which did not work either.

The flow of my script is the following:

  • Load Model (about 2GB)
  • Mine documents and write to a MySQL database

The above steps are repeated around 20 times with different models, averaging about 2GB each.

  • Terminate the instance

This is the point where I call system("sudo shutdown -h now") and nothing happens; when I try system("sudo shutdown -h now", intern=TRUE) I get the allocation error.

I tried rm() on all my objects just before calling the shutdown, but the same error persists.

Here is some data on my system, which is a large EC2 Ubuntu instance:

R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] RTextTools_1.3.9   tau_0.0-15         glmnet_1.8         Matrix_1.0-6      
 [5] lattice_0.20-10    maxent_1.3.2       Rcpp_0.9.13        caTools_1.13      
 [9] bitops_1.0-4.1     ipred_0.8-13       prodlim_1.3.2      KernSmooth_2.23-8 
[13] survival_2.36-14   mlbench_2.1-1      MASS_7.3-21        rpart_3.1-54      
[17] e1071_1.6-1        class_7.3-4        tm_0.5-7.3         nnet_7.3-4        
[21] tree_1.0-31        randomForest_4.6-6 SparseM_0.96       RMySQL_0.9-3      
[25] ggplot2_0.9.1      DBI_0.2-5         

loaded via a namespace (and not attached):
 [1] colorspace_1.1-2   dichromat_1.2-4    digest_0.5.2       grid_2.15.1       
 [5] labeling_0.2       memoise_0.1        munsell_0.3        plyr_1.7.1        
 [9] proto_0.3-9.2      RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1      
[13] slam_0.1-25        stringr_0.6.1    

gc() returns

          used (Mb) gc trigger   (Mb)  max used   (Mb)
Ncells 1143171 61.1    5234604  279.6   5268036  281.4
Vcells 1055057  8.1  465891772 3554.5 767962930 5859.1

I noticed that if I run just 1 model instead of the 20 it works fine, so it might be that memory is not being freed after each run, although I did rm() the used objects.

I also noticed that if I close R, restart it and then call system(), it works. If there is a way to restart R from within R, then maybe I can add that to my script.sh flow.

What would be the appropriate way to clean up all of my objects and free the memory on each loop, so that there is no memory issue when I need to call the system() commands?

Any tip in the right direction will be much appreciated! Thanks

Solution

I'm just posting this because it's too long to fit in the comments. Since you haven't included any code, it's pretty hard to give advice. But, here is some code that maybe you can think about.

wd <- getwd()
assign('.First', function(x) {
  require('plyr') #and whatever other packages you're using
  file.remove(".RData") #already been loaded
  rm(".Last", pos=.GlobalEnv) #otherwise won't be able to quit R without it restarting
  setwd(wd)
}, pos=.GlobalEnv)
assign(".Last", function() {
  system("R --no-site-file --no-init-file --quiet")
}, pos=.GlobalEnv)
save.image() #or only save the things you want to be reloaded.
q("no")

The idea is to save the things you need in a file called .RData. You create a .Last function, which runs when you quit R and starts a new R session. You also create a .First function, which runs as soon as R restarts; it loads the packages you need and cleans up.
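The comment on save.image() mentions saving only the things you want reloaded. A minimal sketch of that variant follows, assuming placeholder objects named results and big_model (neither appears in the question) and working in a temporary directory so that no real .RData is touched:

```r
# Work in a temporary directory so the sketch does not overwrite a real .RData
old_wd <- setwd(tempdir())

# Placeholder objects: a small result worth keeping and a large model that is not
results   <- data.frame(model = 1:3, score = c(0.9, 0.8, 0.7))
big_model <- numeric(1e6)

# Save only the named objects instead of the whole workspace
save(list = c("results"), file = ".RData")

# Inspect what a restarted session would find in the file
restored <- new.env()
loaded <- load(".RData", envir = restored)
print(loaded)

# Clean up the sketch's file and restore the working directory
file.remove(".RData")
setwd(old_wd)
```

In the restart recipe itself, .First and wd would presumably have to be in that list as well, since the new session reads them back from .RData.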

Now you can quit R and it will restart, loading the things you need.

(q("no") means don't save the workspace on exit, but you have already saved everything you need in .RData, which will be loaded when R restarts.)
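Another way to get a fresh process for each model, without the .First/.Last hooks, might be to keep a small driver session and run every model in its own Rscript process, so that each model's memory goes back to the operating system when that process exits. The sketch below is an assumption about how the loop could be arranged, not the asker's actual script: the worker script is a stand-in written to a temporary file, and the model ids are made up.

```r
# Stand-in worker script; in practice this would be the script that
# loads one model, mines the documents and writes to the database.
worker <- tempfile(fileext = ".R")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "cat('finished model', args[1], '\\n')"
), worker)

# Path to the Rscript binary that belongs to the running R installation
rscript <- file.path(R.home("bin"), "Rscript")

statuses <- integer(0)
for (model_id in 1:3) {
  # Each call starts a new R process; its memory is released when it exits
  status <- system2(rscript, c("--vanilla", shQuote(worker), model_id))
  statuses <- c(statuses, status)
}
print(statuses)

unlink(worker)
```

Because the driver never loads a model itself, it should stay small, which may make the final system("sudo shutdown -h now") call less likely to fail. A non-zero entry in statuses would indicate that the corresponding worker exited with an error.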
