Benchmarking: using `expression`, `quote`, or neither


Problem description


When I run benchmarks, I generally wrap my statements in `expression`. Recently, it was suggested that I either (a) not wrap them at all or (b) use `quote` instead of `expression`.

I find two advantages to wrapping the statements:

  • Wrapped statements are easier to swap in and out than full statements typed inline.
  • I can `lapply` over a list of inputs and compare the results.

However, while exploring the different methods, I noticed a discrepancy between the three (wrapping in `expression`, wrapping in `quote`, or not wrapping at all).

The question is:
Why the discrepancy?
(It appears that wrapping in `quote` does not actually evaluate the call.)

EXAMPLE:

# SAMPLE DATA
  library(rbenchmark)  # assumption: benchmark() below is rbenchmark::benchmark
  mat <- matrix(sample(seq(1e6), 4^2*1e4, TRUE), ncol=400)

# RAW EXPRESSION TO BENCHMARK IS: 
  # apply(mat, 2, mean)

# WRAPPED EXPRESSION: 
  expr <- expression(apply(mat, 2, mean))
  quot <- quote(apply(mat, 2, mean))

# BENCHMARKS
  benchmark(raw=apply(mat, 2, mean), expr, quot)[, -(7:8)]
  #    test replications elapsed relative user.self sys.self
  #  2 expr          100   1.269       NA     1.256    0.019
  #  3 quot          100   0.000       NA     0.001    0.000
  #  1  raw          100   1.494       NA     1.286    0.021


# BENCHMARKED INDIVIDUALLY 
  benchmark(raw=apply(mat, 2, mean))[, -(7:8)]
  benchmark(expr)[, -(7:8)]
  benchmark(quot)[, -(7:8)]

  # results
  #    test replications elapsed relative user.self sys.self
  #  1  raw          100   1.274        1      1.26    0.018
  #    test replications elapsed relative user.self sys.self
  #  1 expr          100   1.476        1     1.342    0.021
  #    test replications elapsed relative user.self sys.self
  #  1 quot          100   0.006        1     0.006    0.001

Solution

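Your issue is that `quote` does not produce an expression but a call, so within the call to `benchmark` there is no expression to evaluate. Evaluating the name `quot` just returns the stored call object without running `apply`, which is consistent with the near-zero elapsed time.

If you `eval` the call, it actually gets evaluated, and the timings are reasonable.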

> class(quot)
[1] "call"
> class(expr)
[1] "expression"


 benchmark(raw=apply(mat, 2, mean), expr, eval(quot))[, -(7:8)]
        test replications elapsed relative user.self sys.self
3 eval(quot)          100    0.76    1.000      0.77        0
2       expr          100    0.83    1.092      0.83        0
1        raw          100    0.78    1.026      0.78        0
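
The difference between looking up a stored call and evaluating it can be checked in base R alone. The following is a minimal sketch, not part of the original answer: it uses a smaller stand-in matrix (the size and `runif` data are assumptions chosen only so it runs quickly), and the variable names `looked_up`, `from_call`, `from_expr`, and `direct` are illustrative.

```r
# Small stand-in data (assumed sizes; the original used a larger matrix).
set.seed(1)
mat <- matrix(runif(4e4), ncol = 400)

expr <- expression(apply(mat, 2, mean))
quot <- quote(apply(mat, 2, mean))

class(expr)  # "expression"
class(quot)  # "call"

# Evaluating the *name* `quot` only looks up the stored call object;
# apply() is never run.
looked_up <- eval(quote(quot))
is.call(looked_up)

# Evaluating the call itself runs apply() and returns the column means.
from_call <- eval(quot)

# eval() on an expression vector evaluates its elements in turn.
from_expr <- eval(expr)

direct <- apply(mat, 2, mean)

identical(from_call, direct)
identical(from_expr, direct)
```

Both `eval(quot)` and `eval(expr)` do the same work as the raw statement, which matches the similar timings in the table above.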

In general, I tend to create a function that contains the call or process I wish to benchmark. Note that it is good practice to include steps such as assigning the result to a variable.

For example:

 raw <- function() {x <- apply(mat, 2, mean)}

In that case, `eval(quote(...))` appears to be slightly faster.

benchmark(raw(), eval(quote(raw())))

                test replications elapsed relative user.self sys.self 
2 eval(quote(raw()))          100    0.76    1.000      0.75     0.01        
1              raw()          100    0.80    1.053      0.80     0.00        
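
Because a stored call has to go through `eval` before it is timed, one way to keep the convenience of swapping statements in and out is to hold them in a named list of quoted calls and time each one explicitly. The sketch below is only an illustration under stated assumptions: `time_call` is a hypothetical helper (not part of any package), it uses base R's `system.time` rather than `benchmark`, and the data size and replication count are arbitrary.

```r
# Hypothetical helper: time `n` evaluations of a quoted call in `env`
# and return the total elapsed seconds.
time_call <- function(call, n, env) {
  stopifnot(is.call(call))
  timing <- system.time(for (i in seq_len(n)) eval(call, env))
  timing[["elapsed"]]
}

# Small stand-in data (assumed sizes, chosen so the example runs quickly).
set.seed(1)
mat <- matrix(runif(4e4), ncol = 400)

# Function wrapper in the style used above.
raw <- function() { x <- apply(mat, 2, mean) }

# Named list of quoted calls; entries are easy to add, remove, or swap.
candidates <- list(
  direct  = quote(apply(mat, 2, mean)),
  wrapped = quote(raw())
)

# Evaluate every call in the environment where `mat` and `raw` are defined.
env <- environment()
elapsed <- sapply(candidates, time_call, n = 20, env = env)
elapsed
```

Passing `env` explicitly avoids relying on where `sapply` happens to call the helper from, so each call is evaluated where `mat` and `raw` live.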

But small differences like these are often due to function-call overhead and may not reflect how performance scales to larger problems. See the many questions that benchmark `data.table` solutions: using a small number of replications on large data may reflect performance better.
