Exceeded maximum number of DLLs in R


Question

I am using RStan to sample from a large number of Gaussian processes (GPs) with the function stan(). For every GP that I fit, another DLL gets loaded, as can be seen by running the R command

getLoadedDLLs()

The problem I'm running into is that, because I need to fit so many unique GPs, I exceed the maximum number of DLLs that can be loaded, at which point I receive the following error:

Error in dyn.load(libLFile) : 
unable to load shared object '/var/folders/8x/n7pqd49j4ybfhrm999z3cwp81814xh/T//RtmpmXCRCy/file80d1219ef10d.so':
maximal number of DLLs reached...

As far as I can tell, this limit is set in Rdynload.c of the base R source, as follows:

#define MAX_NUM_DLLS 100
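To see how close a session is to that limit, the loaded DLLs can be counted from within R. The snippet below is a minimal sketch using only base R; the value 100 is taken from the #define quoted above and is an assumption that may not hold for every R build, and the variable names are illustrative.

```r
# Limit quoted above from Rdynload.c; assumed here, it may differ between R builds.
max_dlls <- 100

# getLoadedDLLs() returns one entry per DLL currently loaded in this session.
dlls <- getLoadedDLLs()
n_loaded <- length(dlls)

message(n_loaded, " of ", max_dlls, " DLL slots in use")

# Names of the loaded DLLs; per the question, each stan() call adds one more entry.
print(names(dlls))
```

Running this before and after a call to stan() should show whether the count grows by one per fit.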

So my question is: what can be done to fix this? Building R from source with a larger MAX_NUM_DLLS isn't an option, because my code will be run by collaborators who wouldn't be comfortable with that process. I've tried the naive approach of unloading DLLs with dyn.unload(), in the hope that they would be reloaded when needed again. The unloading works fine, but when I try to use the fit again, R (fairly unsurprisingly) crashes with an error like:

*** caught segfault ***
address 0x121366da8, cause 'memory not mapped'

I've also tried detaching RStan in the hope that the DLLs would be unloaded automatically, but they persist even after the package is detached (as expected, given the following in the help for detach: "detaching will not in general unload any dynamically loaded compiled code (DLLs)").

From the question "Can Rcpp package DLLs be unloaded without restarting R?", it seems that library.dynam.unload() might play some role in the solution, but I haven't had any success using it to unload the DLLs, and I suspect that after unloading a DLL I'd run into the same segfault as before.

Adding a minimal, fully functional example:

R code:

library(rstan)

x <- c(1, 2)
N <- length(x)

fits <- list()
for (i in 1:100) {
    # stan() recompiles the model on every call, so each iteration loads another DLL
    fits[[i]] <- stan(file = "gp-sim.stan", data = list(x = x, N = N), iter = 1, chains = 1)
}

This code requires the following model definition to be in the working directory, in a file named gp-sim.stan (the model is one of the examples included with Stan):

// Sample from Gaussian process
// Fixed covar function: eta_sq=1, rho_sq=1, sigma_sq=0.1

data {
  int<lower=1> N;
  real x[N];
}
transformed data {
  vector[N] mu;
  cov_matrix[N] Sigma;
  for (i in 1:N)
    mu[i] <- 0;
  for (i in 1:N)
    for (j in 1:N)
      Sigma[i,j] <- exp(-pow(x[i] - x[j],2)) + if_else(i==j, 0.1, 0.0);
}
parameters {
  vector[N] y;
}
model {
  y ~ multi_normal(mu,Sigma);
}

Note: this code takes quite some time to run, because it compiles roughly 100 Stan models.

Recommended answer

I can't speak to the DLL issue itself, but you shouldn't need to compile the model each time. You can compile the model once and reuse it, which avoids this problem and also speeds up your code.

The function stan is a wrapper around stan_model, which compiles the model, and the sampling method, which draws samples from the compiled model. You should run stan_model once to compile the model and save the result to an object, and then call the sampling method on that object to draw samples.

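library(rstan)

x <- c(1, 2)
N <- length(x)

fits <- list()
# Compile the model once and keep the compiled object
mod <- stan_model("gp-sim.stan")
for (i in 1:100) {
    # sampling() reuses the compiled model instead of recompiling it
    fits[[i]] <- sampling(mod, data = list(x = x, N = N), iter = 1, chains = 1)
}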
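If each GP uses different inputs, the same compiled model can still be reused, since only the data passed to sampling changes. The following is a rough sketch under that assumption; fit_all and its sampler argument are hypothetical names introduced for illustration and are not part of rstan. Only stan_model and sampling are the rstan functions mentioned above.

```r
# Hypothetical helper: fit one GP per input vector using an already compiled model.
# `sampler` defaults to rstan::sampling and is only looked up when the helper runs.
fit_all <- function(mod, x_list, sampler = rstan::sampling, ...) {
  lapply(x_list, function(x) {
    # Only the data changes between fits; the compiled model `mod` is shared.
    sampler(mod, data = list(x = x, N = length(x)), ...)
  })
}

# Usage (requires rstan and gp-sim.stan in the working directory):
# mod  <- rstan::stan_model("gp-sim.stan")
# fits <- fit_all(mod, list(c(1, 2), c(1, 2, 3)), iter = 200, chains = 1)
```

Passing the sampler as an argument is a design choice for this sketch only; it lets the looping logic be checked without compiling a Stan model.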

This is similar to the problem of running parallel chains, which is discussed in the RStan wiki. Your code could be sped up by replacing the for loop with something that runs the sampling in parallel.
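One possible way to parallelize the loop is mclapply from the base parallel package, sketched below. This is an assumption about how it could be done, not the approach from the RStan wiki; fit_parallel and its arguments are hypothetical names. Note that mclapply relies on forking, so on Windows it only works with mc.cores = 1.

```r
# Hypothetical helper: draw samples for several inputs in parallel
# from a single compiled model.
fit_parallel <- function(mod, x_list, cores = 2L, sampler = rstan::sampling, ...) {
  parallel::mclapply(
    x_list,
    # Each worker reuses the same compiled model with its own data.
    function(x) sampler(mod, data = list(x = x, N = length(x)), ...),
    mc.cores = cores
  )
}

# Usage (requires rstan and gp-sim.stan in the working directory):
# mod  <- rstan::stan_model("gp-sim.stan")
# fits <- fit_parallel(mod, list(c(1, 2), c(1, 2, 3)), cores = 2L, iter = 200, chains = 1)
```

The model is compiled once in the parent process before the workers start, so the parallel version should not add one DLL per fit either.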
