Is R's apply family more than syntactic sugar?

Question

...regarding execution time and/or memory.

If this is not true, prove it with a code snippet. Note that speedup by vectorization does not count. The speedup must come from apply (tapply, sapply, ...) itself.

Recommended answer

The apply functions in R don't provide improved performance over other looping constructs (e.g. for). One exception is lapply, which can be a little faster because it does more of its work in C code than in R (see this question for an example of this).

But in general, the rule is that you should use an apply function for clarity, not for performance.
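
To make the clarity point concrete, here is a minimal sketch of the same computation written both ways. The names squares_for and squares_apply are made up for illustration, and vapply is used as base R's type-checked relative of sapply; neither comes from the original answer.

```r
x <- 1:10

# for loop: the result vector is preallocated and filled in by index
squares_for <- numeric(length(x))
for (i in seq_along(x)) {
  squares_for[i] <- x[i]^2
}

# apply-style: "map this function over x" fits in a single expression
squares_apply <- vapply(x, function(v) v^2, numeric(1))

# Both approaches produce the same values
identical(squares_for, squares_apply)
```

The two versions do the same work element by element; the difference is in how much bookkeeping the reader has to follow.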

I would add that apply functions have no side effects, which is an important distinction when it comes to functional programming with R. This can be overridden by using assign or <<-, but that can be very dangerous. Side effects also make a program harder to understand, since a variable's state depends on the history.

Just to emphasize this with a trivial example that recursively calculates Fibonacci numbers; it could be run multiple times to get a more accurate measure, but the point is that none of the methods differ significantly in performance:

> fibo <- function(n) {
+   if ( n < 2 ) n
+   else fibo(n-1) + fibo(n-2)
+ }
> system.time(for(i in 0:26) fibo(i))
   user  system elapsed 
   7.48    0.00    7.52 
> system.time(sapply(0:26, fibo))
   user  system elapsed 
   7.50    0.00    7.54 
> system.time(lapply(0:26, fibo))
   user  system elapsed 
   7.48    0.04    7.54 
> library(plyr)
> system.time(ldply(0:26, fibo))
   user  system elapsed 
   7.52    0.00    7.58 
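
A single system.time call is noisy, so one way to run the comparison multiple times is sketched below. The median_elapsed helper and the n_max value are assumptions made for illustration (n_max is kept small so the sketch finishes quickly); they are not part of the original measurements, and no particular timings are claimed.

```r
fibo <- function(n) {
  if (n < 2) n else fibo(n - 1) + fibo(n - 2)
}

# Hypothetical helper: call f() `reps` times and return the median
# elapsed wall-clock time in seconds
median_elapsed <- function(f, reps = 5) {
  times <- replicate(reps, system.time(f())[["elapsed"]])
  median(times)
}

n_max <- 18  # smaller than the 26 used above, purely to keep the run short

timings <- c(
  for_loop = median_elapsed(function() for (i in 0:n_max) fibo(i)),
  sapply   = median_elapsed(function() sapply(0:n_max, fibo)),
  lapply   = median_elapsed(function() lapply(0:n_max, fibo))
)
timings
```

Because almost all of the time is spent inside fibo itself, the looping construct around it should contribute very little to any of the three numbers.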

Regarding the usage of parallel packages for R (e.g. rpvm, Rmpi, snow), these do generally provide apply family functions (even the foreach package is essentially equivalent, despite the name). Here's a simple example of the sapply function in snow:

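library(snow)
# Socket cluster with two workers on the local machine
cl <- makeSOCKcluster(c("localhost","localhost"))
parSapply(cl, 1:20, get("+"), 3)
stopCluster(cl)  # shut the workers down when finished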

This example uses a socket cluster, for which no additional software needs to be installed; otherwise you will need something like PVM or MPI (see Tierney's clustering page). snow has the following apply functions:

parLapply(cl, x, fun, ...)
parSapply(cl, X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
parApply(cl, X, MARGIN, FUN, ...)
parRapply(cl, x, fun, ...)
parCapply(cl, x, fun, ...)
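
The snow example above can also be written with the parallel package, which has shipped with R since version 2.14.0 and provides the same socket-cluster interface. This is a minimal sketch rather than part of the original answer; the worker count of 2 and the result variable are arbitrary choices.

```r
library(parallel)

# Start two worker processes on the local machine (a socket cluster by default)
cl <- makeCluster(2)

# Add 3 to each element of 1:20, with the work split across the workers
result <- parSapply(cl, 1:20, function(x, y) x + y, 3)

# Release the worker processes once the work is done
stopCluster(cl)

result
```

Extra arguments after the function (here the 3) are passed on to each call, just as with sapply.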

It makes sense that apply functions should be used for parallel execution since they have no side effects. When you change a variable's value within a for loop, the change is made in the environment where the loop runs (the global environment, for a top-level loop). On the other hand, all apply functions can safely be used in parallel because changes are local to the function call (unless you use assign or <<-, in which case you can introduce side effects). Needless to say, it's critical to be careful about local vs. global variables, especially when dealing with parallel execution.

Here's a trivial example to demonstrate the difference between for and *apply as far as side effects are concerned:

> df <- 1:10
> # *apply example
> lapply(2:3, function(i) df <- df * i)
> df
 [1]  1  2  3  4  5  6  7  8  9 10
> # for loop example
> for(i in 2:3) df <- df * i
> df
 [1]  6 12 18 24 30 36 42 48 54 60

Note how the df in the parent environment is altered by for but not by *apply.
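
For contrast, the following sketch shows how <<- reintroduces the side effect inside lapply. The unchanged and changed variables are illustrative names only, and invisible() is used just to suppress the printed list.

```r
df <- 1:10

# Plain assignment inside the function only creates a local copy of df
invisible(lapply(2:3, function(i) df <- df * i))
unchanged <- df   # still 1:10

# Superassignment modifies df in the enclosing environment instead
invisible(lapply(2:3, function(i) df <<- df * i))
changed <- df     # multiplied by 2, then by 3
```

After the second call, df holds the same values the for loop produced, which is exactly the kind of hidden state change that makes <<- risky in parallel code.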
