Debugging (line by line) of Rcpp-generated DLL under Windows


Problem description

Recently I've been experimenting with Rcpp (inline) to generate DLLs that perform various tasks on supplied R inputs. I'd like to be able to debug the code in these DLLs line by line, given a specific set of R inputs. (I'm working under Windows.)

To illustrate, let's consider a specific example that anybody should be able to run...

The code below is a really simple cxxfunction which simply doubles the input vector. Note however that there's an additional variable myvar that changes value a few times but doesn't affect the output - this has been added so that we'll be able to see when the debugging process is running correctly.

library(inline)
library(Rcpp)

f0 <- cxxfunction(signature(a="numeric"), plugin="Rcpp", body='
    Rcpp::NumericVector xa(a);
    int myvar = 19;
    int na = xa.size();
    myvar = 27;
    Rcpp::NumericVector out1(na);
    for(int i=0; i < na; i++) {
        out1[i] = 2*xa[i];
        myvar++;
    }
    myvar = 101;
    return(Rcpp::List::create( _["out1"] = out1));
')

After we run the above, typing the command

getLoadedDLLs()

brings up a list of DLLs in the R session. The last one listed should be the DLL created by the above process - it has a random temporary name, which in my case is

file7e61645c

The "Filename" column shows that cxxfunction has put this DLL in the location tempdir(), which for me is currently

C:/Users/TimP/AppData/Local/Temp/RtmpXuxtpa/file7e61645c.dll

Now, the obvious way to call the DLL is via f0, as follows

> f0(c(-7,0.7,77))

$out1
[1] -14.0   1.4 154.0

But we can of course also call the DLL directly by name using the .Call command:

> .Call("file7e61645c",c(-7,0.7,77))

$out1
[1] -14.0   1.4 154.0

So I've reached the point where I'm calling a standalone DLL directly with R input (here, the vector c(-7,0.7,77)), and having it return the answer correctly to R.

What I really need, though, is a facility for line-by-line debugging (using gdb, I presume) that will allow me to observe the value of myvar being set to 19, 27, 28, 29, 30, and finally 101 as the code progresses. The example above is deliberately set up so that calling the DLL tells us nothing about myvar.

To clarify, the "win condition" here is being able to observe myvar changing (seeing the value myvar=19 would be the first step!) without adding anything else to the body of the code. This obviously may require changes to the way in which the code is compiled (are there debugging mode settings to turn on?), or the way R is called - but I don't know where to begin. As noted above, all of this is Windows-based.

Final note: In my experiments, I actually made some minor modifications to a copy of cxxfunction so that the output DLL - and the code within it - receives a user-defined name and sits in a user-defined directory, rather than a temporary name and location. But this doesn't affect the essence of the question. I mention this just to emphasise that it should be fairly easy to alter the compilation settings if someone gives me a nudge :)

For completeness, setting verbose=TRUE in the original cxxfunction call above shows the compilation argument to be of the following form:

C:/R/R-2.13.2/bin/i386/R CMD SHLIB file7e61645c.cpp 2> file7e61645c.cpp.err.txt 
g++ -I"C:/R/R-213~1.2/include"    -I"C:/R/R-2.13.2/library/Rcpp/include"      -O2 -Wall  -c file7e61645c.cpp -o file7e61645c.o
g++ -shared -s -static-libgcc -o file7e61645c.dll tmp.def file7e61645c.o C:/R/R-2.13.2/library/Rcpp/lib/i386/libRcpp.a -LC:/R/R-213~1.2/bin/i386 -lR

My adapted version has a compilation argument identical to the above, except that the string "file7e61645c" is replaced everywhere by the user's choice of name (e.g. "testdll") and the relevant files copied over to a more permanent location.

Thanks in advance for your help guys :)

Solution

I am a little stunned by the obsession some Rcpp users have with the inline package and its cxxfunction(). Yes, it is indeed very helpful, and it has surely driven adoption of Rcpp further as it makes quick experimentation so much easier. Yes, it allowed us to use 700+ unit tests in the sources. Yes, I use it all the time to demonstrate examples here, on the rcpp-devel list or even live in presentations.

But does that mean we should use it for each and every task? Does it mean that it does not have "costs" such as randomized filenames in a temporary directory, and so on? Romain and I argued otherwise in our documentation.

Lastly, debugging of dynamically loaded R modules is difficult as it stands. There is an entire section in the (mandatory) Writing R Extensions about it, and Doug Bates once or twice posted a tutorial about how to do this via ESS and Emacs (though I always forget where he posted it; once was IIRC on the rcpp-devel list).

Edit 2012-Jul-07:

Here is your step by step:

  • (Preamble: I've used gcc and g++ for many years, and even when I add -g I don't always turn -O2 into -O0. I am really not sure you need that, but as you ask for it...)

  • Set your environment variable CXXFLAGS to "-g -O0 -Wall". There are numerous ways to do it; some are platform-dependent (e.g. the Windows control panel) and therefore less universal and interesting. I use ~/.R/Makevars on Windows and Unix. You could use that, or you could override R's system-wide $RHOME/etc/Makeconf, or you could use Makeconf.site, or ... See the full docs. But as I said, ~/.R/Makevars is my preferred way as it does NOT interfere with compilation outside of R.

  • Now every compilation R does via R CMD SHLIB, R CMD COMPILE, R CMD INSTALL, ... will use these flags. So it no longer matters whether you use inline or a local package. Continuing with inline...

  • For the rest, we mostly follow 'Section 4.4.1 Finding entry points in dynamically loaded code' of "Writing R Extensions":

  • Start another R session with R -d gdb.

  • Compile your code. For

fun <- cxxfunction(signature(), plugin="Rcpp", verbose=TRUE, body='
   int theAnswer = 42;
   return wrap(theAnswer);
')

I get

[...]
Compilation argument:
 /usr/lib/R/bin/R CMD SHLIB file11673f928501.cpp 2> file11673f928501.cpp.err.txt 
 ccache g++-4.6 -I/usr/share/R/include -DNDEBUG   -I"/usr/local/lib/R/site-library/Rcpp/include"   -fpic  -g -O0 -Wall -c file11673f928501.cpp -o file11673f928501.o
g++-4.6 -shared -o file11673f928501.so file11673f928501.o -L/usr/local/lib/R/site-library/Rcpp/lib -lRcpp -Wl,-rpath,/usr/local/lib/R/site-library/Rcpp/lib -L/usr/lib/R/lib -lR

  • Invoke e.g. tempdir() to see the temporary directory, cd to that directory (the one used for the compilation above) and dyn.load() the file built above:

 dyn.load("file11673f928501.so")

  • Now suspend R by sending a break signal (in Emacs, a simple choice from a drop-down).

  • In gdb, set a breakpoint. The single assignment above became line 32 for me, so

break file11673f928501.cpp:32
cont

  • Back in R, call the function:

    fun()

  • Presto, in the debugger at the break point we wanted:

R> fun()

Breakpoint 1, file11673f928501 () at file11673f928501.cpp:32
32      int theAnswer = 42;
(gdb) 

  • Now it is "just" up to you to get gdb to work its magic.

Now, as I said in my first attempt, all this would be easier (in my eyes) via a simple package which Rcpp.package.skeleton() can write for you as you don't have to deal with randomized directories and filenames. But each to their own...
