Running compiled C++ code with Rcpp


Question


I have been working my way through Dirk Eddelbuettel's Rcpp tutorial here:

http://www.rinfinance.com/agenda/

I have learned how to save a C++ file in a directory and then call and run it from within R. The C++ file I am running is called 'logabs2.cpp' and its contents are taken directly from one of Dirk's slides:

#include <Rcpp.h>

using namespace Rcpp;

inline double f(double x) { return ::log(::fabs(x)); }

// [[Rcpp::export]]
std::vector<double> logabs2(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), f);
    return x;
}
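
For readers who want to check the numeric part without R, here is a minimal stand-alone sketch of the same transform using only the standard library. The names log_abs and logabs_plain are hypothetical and are not part of Rcpp or of the slides; the headers are spelled out because this version does not rely on Rcpp.h to pull them in.

```cpp
#include <algorithm>  // std::transform
#include <cmath>      // std::log, std::fabs
#include <vector>

// Same element-wise operation as f() above, written with the <cmath> overloads.
inline double log_abs(double x) { return std::log(std::fabs(x)); }

// Hypothetical stand-alone counterpart of logabs2(): x is taken by value,
// so the caller's vector is left untouched and the copy is modified in place.
std::vector<double> logabs_plain(std::vector<double> x) {
    std::transform(x.begin(), x.end(), x.begin(), log_abs);
    return x;
}
```

With a std::vector<double> signature, Rcpp converts the incoming R vector into a C++ vector and converts the result back, so the exported logabs2() works on a copy in the same way this sketch does.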

I run it with this R code:

library(Rcpp)
sourceCpp("c:/users/mmiller21/simple r programs/logabs2.cpp")
logabs2(seq(-5, 5, by=2))
# [1] 1.609438 1.098612 0.000000 0.000000 1.098612 1.609438

I am running the code on a Windows 7 machine from within the R GUI that seems to install by default. I also installed the most recent version of Rtools. The above R code seems to take a relatively long time to run. I suspect most of that time is devoted to compiling the C++ code and that once the C++ code is compiled it runs very quickly. Microbenchmark certainly suggests that Rcpp reduces computation time.

I have never used C++ until now, but I know that when I compile C code I get an *.exe file. I have searched my hard drive for a file called logabs2.exe but cannot find one. I am wondering whether the above C++ code might run even faster if a logabs2.exe file were created. Is it possible to create a logabs2.exe file, store it in a folder somewhere, and then have Rcpp call that file whenever I want to use it? I do not know whether that makes sense. If I could store a C++ function in an *.exe file then perhaps I would not have to compile the function every time I wanted to use it with Rcpp, and then perhaps the Rcpp code would be even faster.

Sorry if this question does not make sense or is a duplicate. If it is possible to store the C++ function as an *.exe file I am hoping someone will show me how to modify my R code above to run it. Thank you for any help with this or for setting me straight on why what I suggest is not possible or recommended.
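
On why no logabs2.exe turns up: R loads compiled code as a shared library (a *.dll on Windows) into its own process rather than launching a separate executable. The following is only an illustrative sketch of what a library-style source file looks like. The function name logabs_inplace and its signature are hypothetical and are not the wrapper code that Rcpp generates.

```cpp
#include <cmath>    // std::log, std::fabs
#include <cstddef>  // std::size_t

// Hypothetical exported function: there is no main(), so building this file
// as a shared library yields a .dll/.so that a host process loads, not an .exe.
// extern "C" keeps the symbol name unmangled so a loader can look it up.
extern "C" void logabs_inplace(double* x, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        x[i] = std::log(std::fabs(x[i]));
    }
}
```

The compiled function runs at the same speed whether it sits in an executable or a library; what a prebuilt library saves is the compilation step, not the computation itself.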

I look forward to seeing Dirk's new book.

Solution

Thank you to user1981275, Dirk Eddelbuettel and Romain Francois for their responses. Below is how I compiled a C++ file and created a *.dll, then called and used that *.dll file inside R.

Step 1. I created a new folder called 'c:\users\mmiller21\myrpackages' and pasted the file 'logabs2.cpp' into that new folder. The file 'logabs2.cpp' was created as described in my original post.

Step 2. Inside the new folder I created a new R package called 'logabs2' using an R file I wrote called 'new package creation.r'. The contents of 'new package creation.r' are:

setwd('c:/users/mmiller21/myrpackages/')

library(Rcpp)

Rcpp.package.skeleton("logabs2", example_code = FALSE, cpp_files = c("logabs2.cpp"))

I found the above syntax for Rcpp.package.skeleton on one of Hadley Wickham's websites: https://github.com/hadley/devtools/wiki/Rcpp

Step 3. I installed the new R package "logabs2" in R using the following line in the DOS command window:

C:\Program Files\R\R-3.0.1\bin\x64>R CMD INSTALL -l c:\users\mmiller21\documents\r\win-library\3.0\ c:\users\mmiller21\myrpackages\logabs2

where:

the location of the rcmd.exe file is:

C:\Program Files\R\R-3.0.1\bin\x64>

the location of installed R packages on my computer is:

c:\users\mmiller21\documents\r\win-library\3.0\

and the location of my new R package prior to being installed is:

c:\users\mmiller21\myrpackages\

The syntax used in the DOS command window was found by trial and error and may not be ideal. At some point I pasted a copy of 'logabs2.cpp' into 'C:\Program Files\R\R-3.0.1\bin\x64', but I do not think that mattered.

Step 4. After installing the new R package I ran it using an R file I named 'new package usage.r' in the 'c:/users/mmiller21/myrpackages/' folder (although I do not think the folder was important). The contents of 'new package usage.r' are:

library(logabs2)
logabs2(seq(-5, 5, by=2))

The output was:

# [1] 1.609438 1.098612 0.000000 0.000000 1.098612 1.609438

This file loaded the package Rcpp without me asking.

In this case base R was faster, assuming I did this correctly.

#> microbenchmark(logabs2(seq(-5, 5, by=2)), times = 100)
#Unit: microseconds
#                        expr    min     lq  median     uq     max neval
# logabs2(seq(-5, 5, by = 2)) 43.086 44.453 50.6075 69.756 190.803   100

#> microbenchmark(log(abs(seq(-5, 5, by=2))), times=100)
#Unit: microseconds
#                         expr    min     lq median    uq     max neval
# log(abs(seq(-5, 5, by = 2))) 38.298 38.982 39.666 40.35 173.023   100

However, using the *.dll file was faster than compiling the C++ code at run time, which took almost 6 seconds of elapsed time here:

system.time(

cppFunction("
NumericVector logabs(NumericVector x) {
    return log(abs(x));
}
")

)

#   user  system elapsed 
#   0.06    0.08    5.85 

Although base R seems faster or as fast as the *.dll file in this case, I have no doubt that using the *.dll file with Rcpp will be faster than base R in most cases.

This was my first attempt creating an R package or using Rcpp and no doubt I did not use the most efficient methods. Also, I apologize for any typographic errors in this post.

EDIT

In a comment below I think Romain Francois suggested I modify the *.cpp file to the following:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector logabs(NumericVector x) {
    return log(abs(x));
}

and recreate my R package, which I have now done. I then compared base R against my new package using the following code:

library(logabs)

logabs(seq(-5, 5, by=2))
log(abs(seq(-5, 5, by=2)))

library(microbenchmark)

microbenchmark(logabs(seq(-5, 5, by=2)), log(abs(seq(-5, 5, by=2))), times = 100000)

Base R is still a tiny bit faster or no different:

Unit: microseconds
                         expr    min     lq median     uq       max neval
   logabs(seq(-5, 5, by = 2)) 42.401 45.137 46.505 69.073 39754.598 1e+05
 log(abs(seq(-5, 5, by = 2))) 37.614 40.350 41.718 62.234  3422.133 1e+05

Perhaps this is because base R is already vectorized. I suspect with more complex functions base R will be much slower. Or perhaps I am still not using the most efficient approach, or perhaps I simply made an error somewhere.
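
To make the "more complex functions" case concrete, here is a hedged sketch of a computation in which each output depends on the previous one, so it cannot be written as a single element-wise call. The function ema and its alpha parameter are hypothetical and chosen only for illustration. No timing is claimed for it; whether it beats an R loop on a given machine would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: exponential moving average with smoothing factor alpha.
// out[i] depends on out[i - 1], so the loop cannot be replaced by one
// element-wise call such as log(abs(x)).
std::vector<double> ema(const std::vector<double>& x, double alpha) {
    assert(alpha > 0.0 && alpha <= 1.0);  // precondition on the smoothing factor
    std::vector<double> out(x.size());
    if (x.empty()) return out;
    out[0] = x[0];
    for (std::size_t i = 1; i < x.size(); ++i) {
        out[i] = alpha * x[i] + (1.0 - alpha) * out[i - 1];
    }
    return out;
}
```

To try this from R, the same body could be placed in a *.cpp file under a // [[Rcpp::export]] attribute, as with logabs2 above. With only six input values, as in the benchmarks here, the cost of the call and of seq() is likely to dominate, which would be consistent with the near-identical timings.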
