How to optimize R performance

Question

I am trying to understand a recent performance benchmark. We have a large script that runs about 50% slower on a Red Hat Linux machine than on a Windows 7 laptop with comparable specs. The Linux machine is virtualized with KVM and has 4 cores and 16 GB of memory assigned to it. The script is not I/O intensive but has quite a few for loops. Mainly I am wondering whether there are any R compile options, or any kernel or compiler options, that might help make the two more comparable. Any pointers would be appreciated. I will also try to get another machine and test on bare metal for a better comparison.

These are the configure flags I am using to compile R on the Linux machine. I have experimented quite a bit, and this seems to cut 12 seconds off the execution time in the green for larger data sets. Essentially I went from 2.087 to 1.48 seconds with these options.

./configure CFLAGS="-O3 -g -std=gnu99" CXXFLAGS="-O3 -g" FFLAGS="-O2 -g" LDFLAGS="-Bdirect,--hash-style=both,-Wl,-O1" --enable-R-shlib --without-x --with-cairo --with-libpng --with-jpeglib

Update 1

The script has not been optimized yet. Another group is working on it, and we have asked them to use the apply functions, but I am not sure how that would explain the disparity in the times.

The top of the profile looks like this. Most of these functions will later be optimized using the apply functions, but for now the benchmark is apples to apples on both machines.

                             total.time total.pct self.time self.pct
"eval.with.vis"                    8.66    100.00      0.12     1.39
"source"                           8.66    100.00      0.00     0.00
"["                                5.38     62.12      0.16     1.85
"GenerateQCC"                      5.30     61.20      0.00     0.00
"[.data.frame"                     5.20     60.05      2.40    27.71
"ZoneCalculation"                  5.12     59.12      0.02     0.23
"cbind"                            3.22     37.18      0.34     3.93
"[["                               2.26     26.10      0.38     4.39
"[[.data.frame"                    1.88     21.71      0.82     9.47
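
A table of this shape is what R's sampling profiler reports. The sketch below shows one way such a profile could be collected with `Rprof` and `summaryRprof`. The function `slow_loop` is a hypothetical stand-in for the real script, which was not posted, so the function names and timings it produces will differ from the table above.

```r
# Hypothetical workload (an assumption, not the original script):
# repeated element-wise indexing of a data frame inside a for loop.
slow_loop <- function(n = 30000) {
  df <- data.frame(x = runif(n), y = runif(n))
  total <- 0
  for (i in seq_len(n)) {
    total <- total + df[i, "x"] * df[i, "y"]
  }
  total
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start sampling the call stack every 10 ms
result <- slow_loop()
Rprof(NULL)                        # stop profiling

prof <- summaryRprof(prof_file)
# by.total has the columns total.time, total.pct, self.time, self.pct
print(head(prof$by.total, 10))
```

Running the same profile on both machines and comparing the `self.time` column is one way to see whether the gap is spread evenly or concentrated in a few calls.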

My first suspicion, which I will be testing shortly, is that KVM virtualization is to blame. The script is very memory intensive, and with the large number of array operations and R being pass by copy (which of course has to malloc), this may be causing the problem. Since the VM does not have direct access to the memory controller and must share it with other VMs, this could very likely be the cause. I will be getting a bare-metal machine later today and will update with my findings.
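
Independently of the virtualization question, the profile attributes the largest share of self time to `[.data.frame`. As a rough illustration of why that call can dominate a loop-heavy script, the sketch below times the same sum of products three ways. All names and sizes here are made up for the example and are not taken from the original script.

```r
set.seed(42)
n <- 20000
df <- data.frame(x = runif(n), y = runif(n))

# 1. Index the data frame on every iteration: each df[i, "x"] goes through `[.data.frame`.
loop_df <- function(df) {
  total <- 0
  for (i in seq_len(nrow(df))) total <- total + df[i, "x"] * df[i, "y"]
  total
}

# 2. Extract the columns once, then loop over plain numeric vectors.
loop_vec <- function(df) {
  x <- df$x
  y <- df$y
  total <- 0
  for (i in seq_along(x)) total <- total + x[i] * y[i]
  total
}

# 3. Fully vectorised, no explicit loop.
vectorised <- function(df) sum(df$x * df$y)

t1 <- system.time(r1 <- loop_df(df))[["elapsed"]]
t2 <- system.time(r2 <- loop_vec(df))[["elapsed"]]
t3 <- system.time(r3 <- vectorised(df))[["elapsed"]]

cat(sprintf("data frame indexing: %.3f s\nvector indexing:     %.3f s\nvectorised:          %.3f s\n",
            t1, t2, t3))
```

The absolute numbers depend on the machine and the R build, so they are best compared only within one run on one machine.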

Thank you all for the quick updates.

Update 2

We originally thought the performance problem was caused by hyper-threading in the VM, but this turned out to be incorrect: performance was the same on a bare-metal machine.

We later realized that the Windows laptop was using a 32-bit version of R for the computations. This led us to try the 64-bit version of R, and the result was ~140% slower than 32-bit on the exact same script. This leads me to the question: how is it possible that the 64-bit version of R could be ~140% slower than the 32-bit version?

What we are seeing is:

Windows 32-bit execution time: 48 seconds

Windows 64-bit execution time: 2.33 seconds

Linux 64-bit execution time: 2.15 seconds

Linux 32-bit execution time: in progress (I built a 32-bit version on RHEL 6.3 x86_64 but did not see a performance improvement; I am going to reload with the 32-bit version of RHEL 6.3)
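
When comparing timings like these, it helps to confirm from inside each R session which build is actually running, since the laptop here turned out to be using 32-bit R without anyone noticing. A minimal check, using only base R:

```r
# Report the architecture and pointer width of the running R build.
arch <- R.version$arch                      # architecture string of this build
pointer_bits <- 8 * .Machine$sizeof.pointer # 32 or 64 on common platforms

cat("R version:   ", R.version.string, "\n")
cat("Architecture:", arch, "\n")
cat("Pointer size:", pointer_bits, "bits\n")
```

Recording this output next to each timing makes it harder to compare a 32-bit run with a 64-bit run by accident.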

I found this link, but it only explains a 15-20% hit on some 64-bit machines.

http://www.hep.by/gnu/r-patched/r-admin/R-admin_51.html

Sorry, I cannot legally post the script.

Recommended answer

The issue was resolved: it was caused by a non-optimized BLAS library.

This article was a great help, and so was using ATLAS.

http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html
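
To check whether a given R installation is affected, one could look at which BLAS it is linked against and time a dense matrix product, which is the kind of operation an optimized BLAS speeds up. The sketch below is only an illustration: the matrix size is arbitrary, and `extSoftVersion()` and `La_library()` report the BLAS and LAPACK libraries only on R 3.4.0 or later, so those calls are guarded.

```r
# Report the BLAS/LAPACK libraries in use, where the running R version supports it.
if (getRversion() >= "3.4.0") {
  cat("BLAS:  ", extSoftVersion()[["BLAS"]], "\n")
  cat("LAPACK:", La_library(), "\n")
}

# Time a dense matrix product; %*% on numeric matrices is handed to BLAS.
set.seed(1)
n <- 500                       # arbitrary size chosen for the example
a <- matrix(rnorm(n * n), n, n)
elapsed <- system.time(b <- a %*% a)[["elapsed"]]
cat(sprintf("%d x %d matrix product: %.3f s\n", n, n, elapsed))
```

Running the same snippet on both machines gives a quick comparison of their linear algebra performance that does not depend on the original script.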
