Using Rcpp functions inside of R's par*apply functions from the parallel package

Question

I'm trying to understand what happens behind the Rcpp::sourceCpp() call in a parallelized environment. Recently, this was partially addressed in the question: Using Rcpp function in parLapply on Windows.

Within this post, Dirk said,

"You need to run the sourceCpp() call in each spawned process, or else get them your code."

This was in response to how the questioner distributed the Rcpp function to the worker processes. The questioner was sending the Rcpp function via:

clusterExport(cl = cl, varlist = "payoff")

I'm confused as to why this doesn't work. My understanding was that this is exactly what clusterExport() is for.

Recommended answer

The issue here is that compiled code is not "exportable" to the spawned processes unless it is embedded in a package, because of how binaries are linked into R's processes.

Traditionally, the clusterExport() statement allows R-specific code to be distributed to workers.
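For contrast, here is a minimal sketch of the case clusterExport() handles well: a function written purely in R. The name payoff_r is a hypothetical stand-in introduced for illustration, and the sketch assumes a local two-worker cluster created with makeCluster().

```r
library(parallel)

# Hypothetical plain-R stand-in for the compiled payoff() function
payoff_r <- function(strike, data) pmax(data - strike, 0)

cl <- makeCluster(2)                           # two local worker processes
clusterExport(cl = cl, varlist = "payoff_r")   # copies the R closure to each worker

# Each worker can call payoff_r because the whole definition is R code
res <- parLapply(cl, c(1, 2), function(strike) payoff_r(strike, c(0.5, 2, 4)))
stopCluster(cl)
```

Because the closure carries its entire definition with it, nothing else has to exist on the worker for the call to succeed.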

When you call clusterExport() on an Rcpp function, the workers receive only the R declaration, not the underlying shared library. That is to say, the shared library built by the R CMD SHLIB call in Attributes.R is not shared with or exported to the workers. As a result, when the Rcpp function is called on a worker, R cannot find the correct shared library.

Take the previous question's function:

Rcpp::cppFunction("NumericVector payoff( double strike, NumericVector data) {
    return pmax(data - strike, 0);
}")

Note: I'm using cppFunction() instead of sourceCpp(), but the results are equivalent since cppFunction() calls sourceCpp() to create the function.

Type the function name:

payoff

This yields the R declaration with a shared library pointer:

function (strike, data) 
.Primitive(".Call")(<pointer: 0x1015ec130>, strike, data)

This shared library is only available in the process that compiled the function.
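The effect can be sketched within a single session, without a cluster. The assumption here is that objects sent to workers travel through R's serialization, which does not preserve external pointer addresses; payoff_copy and outcome are names introduced only for this illustration, and the exact error text may vary between R versions.

```r
Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
    return pmax(data - strike, 0);
}")

# Works: the pointer refers to a library loaded in this process
local_result <- payoff(1, c(0.5, 2, 4))

# Round-trip through serialization, mimicking what a worker would receive
payoff_copy <- unserialize(serialize(payoff, connection = NULL))

# The copy keeps the R declaration but is expected to fail when called,
# because its pointer no longer refers to a loaded shared library
outcome <- tryCatch(
    payoff_copy(1, c(0.5, 2, 4)),
    error = function(e) conditionMessage(e)
)
```

Printing payoff_copy should show the same function signature as payoff but with an empty pointer in place of the original address.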

This is why it is always ideal to embed compiled code within a package and then distribute the package.
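When building a package is not practical, the suggestion quoted in the question (run the compilation in each spawned process) can be sketched as below. This is an illustration rather than a recommended setup: it assumes every worker has Rcpp and a working compiler toolchain, and it pays the compilation cost once per worker.

```r
library(parallel)

cl <- makeCluster(2)   # two local worker processes

# Compile payoff() separately inside every worker, so each process
# loads its own copy of the shared library
clusterEvalQ(cl, {
    Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
        return pmax(data - strike, 0);
    }")
    NULL   # avoid sending the compiled function back to the master
})

# payoff() is resolved in each worker's global environment
res <- parLapply(cl, c(1, 2), function(strike) payoff(strike, c(0.5, 2, 4)))
stopCluster(cl)
```

With the package approach, the compilation step would be replaced by loading the package on each worker, for example clusterEvalQ(cl, library(payoffpkg)), where payoffpkg is a hypothetical package name.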
