Using Rcpp functions inside of R's par*apply functions from the parallel package


Question

I'm trying to understand what happens behind the Rcpp::sourceCpp() call in a parallelized environment. Recently, this was partially addressed in the question: Using Rcpp function in parLapply on Windows.

Within this post, Dirk said,

"You need to run the sourceCpp() call in each spawned process, or else get them your code."

This was in response to the questioner's attempt to distribute the Rcpp function to the worker processes. The questioner was sending the Rcpp function via:

clusterExport(cl = cl, varlist = "payoff")

I'm confused as to why this doesn't work. My understanding is that this is exactly what clusterExport() is for.

Recommended answer

The issue here is that the compiled code is not "exportable" to the spawned processes without being embedded in a package due to how binaries are linked into R's processes.

Traditionally, the clusterExport() statement allows for R specific code to be distributed to workers.

By using clusterExport() on an Rcpp function, you are only receiving the R declaration and not the underlying shared library. That is to say, the R CMD SHLIB given in Attributes.R is not shared with / exported to the workers. As a result, when a call is then made to an Rcpp function on the worker, R cannot find the correct shared library.

Taking the function from the previous question:

Rcpp::cppFunction("NumericVector payoff( double strike, NumericVector data) {
    return pmax(data - strike, 0);
}")

Note: I'm using cppFunction() instead of sourceCpp() but the results are equivalent since cppFunction() calls sourceCpp() to create the function.

Typing the function name:

payoff

Yields the R declaration with a shared library pointer.

function (strike, data) 
.Primitive(".Call")(<pointer: 0x1015ec130>, strike, data)

This shared library is only available in the process that compiled the function.
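To make the failure concrete, here is a minimal sketch of what the question describes. It assumes a two-worker PSOCK cluster (the default type for makeCluster(), and the only kind available on Windows) and made-up input values; the exact error text may vary, so the sketch only captures it rather than asserting a particular message.

```r
library(parallel)

# Compile payoff() in the master process only.
Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
    return pmax(data - strike, 0);
}")

cl <- makeCluster(2)  # assumption: two local PSOCK workers

# This ships only the R wrapper; the pointer to the compiled routine
# does not refer to anything loaded in the worker processes.
clusterExport(cl = cl, varlist = "payoff")

# Calling the exported wrapper on the workers is expected to fail,
# so the error message is captured instead of aborting the script.
res <- tryCatch(
  parLapply(cl, 1:2, function(i) payoff(50, c(40, 60))),
  error = function(e) conditionMessage(e)
)

stopCluster(cl)
print(res)
```

With a fork-based cluster on Unix-alikes the workers inherit the master's memory, so this particular failure is specific to workers started as fresh R processes.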

This is why it is ideal to embed compiled code within a package and then distribute the package.
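When building a package is not practical, the alternative from Dirk's quote is to compile the code in every spawned process. The following is a rough sketch of that approach using clusterEvalQ(); the cluster size, the strikes and data values, and the anonymous wrapper function are illustrative assumptions, not part of the original question.

```r
library(parallel)

cl <- makeCluster(2)  # assumption: two local PSOCK workers

# Compile payoff() separately inside each worker, so that every process
# builds and loads its own copy of the shared library.
clusterEvalQ(cl, {
  Rcpp::cppFunction("NumericVector payoff(double strike, NumericVector data) {
      return pmax(data - strike, 0);
  }")
  NULL  # return NULL so the function object is not sent back to the master
})

strikes <- c(40, 50, 60)      # hypothetical inputs
data <- c(35, 45, 55, 65)

# payoff is looked up in each worker's global environment at call time.
res <- parLapply(cl, strikes, function(k, d) payoff(k, d), d = data)

stopCluster(cl)
print(res)
```

The trade-off is that every worker pays the compilation cost once. With a package, the equivalent step would be loading it on each worker, for example clusterEvalQ(cl, library(mypkg)), where mypkg is a hypothetical package name.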
