Profiling Rcpp code on OS X

Question

I am interested in profiling some Rcpp code under OS X (Mountain Lion 10.8.2), but the profiler crashes when being run.

Toy example, using inline, just designed to take enough time for a profiler to notice.

library(Rcpp)
library(inline)

src.cpp <- "
  RNGScope scope;
  int n = as<int>(n_);
  double x = 0.0;
  for ( int i = 0; i < n; i++ )
    x += (unif_rand()-.5);
  return wrap(x);"

src.c <- "
  int i, n = INTEGER(n_)[0];
  double x = 0.0;
  GetRNGstate();
  for ( i = 0; i < n; i++ )
    x += (unif_rand()-.5);
  PutRNGstate();
  return ScalarReal(x);"

f.cpp <- cxxfunction(signature(n_="integer"), src.cpp, plugin="Rcpp")
f.c <- cfunction(signature(n_="integer"), src.c)

If I use either the GUI Instruments (in Xcode, version 4.5 (4523)) or the command line sample, both crash: Instruments crashes straight away, while sample completes processing samples before crashing:

# (in R)
set.seed(1)
f.cpp(200000000L)

# (in a separate terminal window)
~ » sample R  # this invokes the profiler
Sampling process 81337 for 10 seconds with 1 millisecond of run time between samples
Sampling completed, processing symbols...
[1]    81654 segmentation fault  sample 81337

If I do the same process but with the C version (i.e., f.c(200000000L)) both Instruments and sample work fine, and produce output like

Call graph:
1832 Thread_6890779   DispatchQueue_1: com.apple.main-thread  (serial)
  1832 start  (in R) + 52  [0x100000e74]
    1832 main  (in R) + 27  [0x100000eeb]
      1832 run_Rmainloop  (in libR.dylib) + 80  [0x1000e4020]
        1832 R_ReplConsole  (in libR.dylib) + 161  [0x1000e3b11]
          1832 Rf_ReplIteration  (in libR.dylib) + 514  [0x1000e3822]
            1832 Rf_eval  (in libR.dylib) + 1010  [0x1000aa402]
              1832 Rf_applyClosure  (in libR.dylib) + 849  [0x1000af5d1]
                1832 Rf_eval  (in libR.dylib) + 1672  [0x1000aa698]
                  1832 do_dotcall  (in libR.dylib) + 16315  [0x10007af3b]
                    1382 file1412f6e212474  (in file1412f6e212474.so) + 53  [0x1007fded5]  file1412f6e212474.cpp:16
                    + 862 unif_rand  (in libR.dylib) + 1127,1099,...  [0x10000b057,0x10000b03b,...]
                    + 520 fixup  (in libR.dylib) + 39,67,...  [0x10000aab7,0x10000aad3,...]
                    356 file1412f6e212474  (in file1412f6e212474.so) + 70,61,...  [0x1007fdee6,0x1007fdedd,...]  file1412f6e212474.cpp:16
                    56 unif_rand  (in libR.dylib) + 1133  [0x10000b05d]
                    38 DYLD-STUB$$unif_rand  (in file1412f6e212474.so) + 0  [0x1007fdf1c]
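
One way to narrow down whether the crash is specific to C++ code loaded into R might be to profile the same loop as a standalone program. The following is only a minimal sketch under assumptions: the function name `sum_centered_uniform` is hypothetical, and `std::mt19937_64` stands in for R's `unif_rand()`, so the values will not match those from R.

```cpp
#include <cstdint>
#include <random>

// Hypothetical standalone version of the toy loop: sums n draws from
// Uniform(0, 1), each shifted by -0.5.
// Assumption: std::mt19937_64 replaces R's unif_rand(), so results differ
// from the R versions; only the amount of work is comparable.
double sum_centered_uniform(std::int64_t n, std::uint64_t seed) {
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    double x = 0.0;
    for (std::int64_t i = 0; i < n; ++i) {
        x += unif(gen) - 0.5;
    }
    return x;
}
```

Calling this from a `main` with a large `n` (for example 200000000) and pointing `sample` at the resulting process would show whether the profiler copes with plain C++ symbols outside R.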

I would really appreciate some advice on whether I'm doing anything wrong, whether there is some other preferred way, or whether this is just not possible. Given that one of the main uses of Rcpp seems to be speeding up R code, I'm surprised not to find more information on this, but perhaps I'm looking in the wrong place.

This is on OS X 10.8.2 with R 2.15.1 (x86_64-apple-darwin9.8.0), Rcpp 0.9.15, and g++ --version reports "i686-apple-darwin11-llvm-g++-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)".

Thanks to Dirk's answer below, and his talk here http://dirk.eddelbuettel.com/papers/ismNov2009introHPCwithR.pdf, I have at least a partial solution using Google perftools. First, install from here http://code.google.com/p/gperftools/, and add -lprofiler to PKG_LIBS when compiling the C++ code. Then either

(a) Run R as CPUPROFILE=samples.log R, run all code and quit (or use Rscript)

(b) Use two small utility functions to turn on/off profiling:

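#include <Rcpp.h>
#include <gperftools/profiler.h>  // ProfilerStart(), ProfilerStop()

using namespace Rcpp;

// Start CPU profiling; samples are written to the file named by str.
RcppExport SEXP start_profiler(SEXP str) {
  ProfilerStart(as<const char*>(str));
  return R_NilValue;
}

// Stop CPU profiling and flush the collected samples to disk.
RcppExport SEXP stop_profiler() {
  ProfilerStop();
  return R_NilValue;
}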

Then, within R you can do

.Call("start_profiler", "samples.log")
# code that calls C++ code to be profiled
.Call("stop_profiler")

Either way, the file samples.log will contain profiling information. This can be inspected with

pprof --text /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R samples.log

which produces output like

Using local file /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R.
Using local file samples.log.
Removing __sigtramp from all stack traces.
Total: 112 samples
  64  57.1%  57.1%       64  57.1% _unif_rand
  30  26.8%  83.9%       30  26.8% _process_system_Renviron
  14  12.5%  96.4%      101  90.2% _for_profile
   3   2.7%  99.1%        3   2.7% Rcpp::internal::expr_eval_methods
   1   0.9% 100.0%        1   0.9% _Rf_PrintValueRec
   0   0.0% 100.0%        1   0.9% 0x0000000102bbc1ff
   0   0.0% 100.0%       15  13.4% 0x00007fff5fbfe06f
   0   0.0% 100.0%        1   0.9% _Rf_InitFunctionHashing
   0   0.0% 100.0%        1   0.9% _Rf_PrintValueEnv
   0   0.0% 100.0%      112 100.0% _Rf_ReplIteration

which would probably be more informative on a real example.
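
A possible variation on option (b) is to tie profiling to a C++ scope, so that ProfilerStop() is still called if the profiled code throws. This is a sketch rather than an established recipe: the `ScopedProfiler` class and the `HAVE_GPERFTOOLS` macro are hypothetical names (not part of Rcpp or gperftools), and the no-op fallbacks exist only so that the sketch compiles when the library is absent.

```cpp
#include <string>

#ifdef HAVE_GPERFTOOLS
#include <gperftools/profiler.h>  // declares ProfilerStart() / ProfilerStop()
#else
// Assumption: stand-ins used only when gperftools is not available, so the
// sketch still compiles; they record nothing.
inline int ProfilerStart(const char*) { return 1; }
inline void ProfilerStop() {}
#endif

// Hypothetical RAII helper: starts the CPU profiler on construction and
// stops it on destruction, including when an exception unwinds the scope.
class ScopedProfiler {
public:
    explicit ScopedProfiler(const std::string& path)
        : started_(ProfilerStart(path.c_str()) != 0) {}

    ~ScopedProfiler() {
        if (started_) ProfilerStop();
    }

    // Non-copyable, so the profiler is stopped exactly once.
    ScopedProfiler(const ScopedProfiler&) = delete;
    ScopedProfiler& operator=(const ScopedProfiler&) = delete;

    // True if ProfilerStart() reported success (a nonzero return value).
    bool started() const { return started_; }

private:
    bool started_;
};
```

Inside a C++ function called from R, declaring `ScopedProfiler prof("samples.log");` before the loop would profile until the end of the enclosing block. The code would still need to be compiled with `-DHAVE_GPERFTOOLS` and linked with `-lprofiler` for any samples to be recorded.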

Answer

I'm confused, your example is incomplete:

  • you don't show the calls to cfunction() and cxxfunction()
  • you don't show how you invoke the profiler
  • you aren't profiling the C or C++ code (!!)

Can you maybe edit the question and make it clearer?

Also, when I run this, the two examples do give identical speed results, as they are essentially identical. [ Rcpp would let you do this in a single call with its sugar random number functions. ]

R> library(Rcpp)
R> library(inline)
R> 
R> src.cpp <- "
+   RNGScope scope;
+   int n = as<int>(n_);
+   double x = 0.0;
+   for ( int i = 0; i < n; i++ )
+     x += (unif_rand()-.5);
+   return wrap(x);"
R> 
R> src.c <- "
+   int i, n = INTEGER(n_)[0];
+   double x = 0.0;
+   GetRNGstate();
+   for ( i = 0; i < n; i++ )
+     x += (unif_rand()-.5);
+   PutRNGstate();
+   return Rf_ScalarReal(x);"
R> 
R> fc   <- cfunction(signature(n_="int"), body=src.c)
R> fcpp <- cxxfunction(signature(n_="int"), body=src.c, plugin="Rcpp")
R> 
R> library(rbenchmark)
R> 
R> print(benchmark(fc(10000L), fcpp(10000L)))
         test replications elapsed relative user.self sys.self user.child sys.child
1   fc(10000)          100   0.013        1     0.012        0          0         0
2 fcpp(10000)          100   0.013        1     0.012        0          0         0
R> 
