Benchmarking: using `expression`, `quote`, or neither
Problem description
Generally, when I run benchmarks, I wrap my statements in `expression`. Recently, it was suggested that I either (a) not do so or (b) use `quote` instead of `expression`. I find two advantages to wrapping the statements:
- Compared to entire statements, they are more easily swapped out.
- I can `lapply` over a list of inputs and compare the results.
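As a rough sketch of the second point (base R only; the candidate names, the `colMeans` alternative, and the small matrix are assumptions made for illustration), a list of wrapped expressions can be timed in a single `lapply` call:

```r
# Illustrative data; much smaller than the matrix used in the example below.
mat <- matrix(runif(4000), ncol = 40)

# Hypothetical candidates, each wrapped in expression() so they are easy to swap.
candidates <- list(
  apply_mean = expression(apply(mat, 2, mean)),
  col_means  = expression(colMeans(mat))
)

# Evaluate each wrapped expression 50 times and keep the total elapsed seconds.
timings <- lapply(candidates, function(ex) {
  system.time(for (i in seq_len(50)) eval(ex))[["elapsed"]]
})

print(timings)
```

Swapping a candidate in or out only means editing the list, which is the convenience described in the two points above.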
However, in exploring the different methods, I noticed a discrepancy between the three methods (wrapping in `expression`, wrapping in `quote`, or not wrapping at all).
The question is:
Why the discrepancy?
(It appears that wrapping in `quote` does not actually evaluate the call.)
EXAMPLE:
library(rbenchmark)  # provides benchmark()

# SAMPLE DATA
mat <- matrix(sample(seq(1e6), 4^2*1e4, TRUE), ncol=400)
# RAW EXPRESSION TO BENCHMARK IS:
# apply(mat, 2, mean)
# WRAPPED EXPRESSION:
expr <- expression(apply(mat, 2, mean))
quot <- quote(apply(mat, 2, mean))
# BENCHMARKS
benchmark(raw=apply(mat, 2, mean), expr, quot)[, -(7:8)]
# test replications elapsed relative user.self sys.self
# 2 expr 100 1.269 NA 1.256 0.019
# 3 quot 100 0.000 NA 0.001 0.000
# 1 raw 100 1.494 NA 1.286 0.021
# BENCHMARKED INDIVIDUALLY
benchmark(raw=apply(mat, 2, mean))[, -(7:8)]
benchmark(expr)[, -(7:8)]
benchmark(quot)[, -(7:8)]
# results
# test replications elapsed relative user.self sys.self
# 1 raw 100 1.274 1 1.26 0.018
# test replications elapsed relative user.self sys.self
# 1 expr 100 1.476 1 1.342 0.021
# test replications elapsed relative user.self sys.self
# 1 quot 100 0.006 1 0.006 0.001
Your issue is that `quote` does not produce an expression but a call, so within the call to `benchmark` there is no expression to evaluate.
If you evaluate the call with `eval`, it will actually get evaluated, and the timings are reasonable.
> class(quot)
[1] "call"
> class(expr)
[1] "expression"
benchmark(raw=apply(mat, 2, mean), expr, eval(quot))[, -(7:8)]
test replications elapsed relative user.self sys.self
3 eval(quot) 100 0.76 1.000 0.77 0
2 expr 100 0.83 1.092 0.83 0
1 raw 100 0.78 1.026 0.78 0
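To see the distinction outside of `benchmark`, the following sketch uses only base R. The small matrix is a stand-in chosen for illustration, not the data from the question.

```r
# Small stand-in matrix (5 rows, 4 columns); the size is an arbitrary assumption.
mat <- matrix(seq_len(20), ncol = 4)

quot <- quote(apply(mat, 2, mean))       # a single unevaluated call
expr <- expression(apply(mat, 2, mean))  # an expression vector holding that call

class(quot)  # "call"
class(expr)  # "expression"

# The first element of the expression vector is the same call that quote() returns.
same_call <- identical(expr[[1]], quot)

# Neither object has run apply() yet; eval() is what triggers the computation.
res_quot <- eval(quot)
res_expr <- eval(expr)
same_result <- identical(res_quot, res_expr)
```

Until `eval` is applied, both objects are just unevaluated code, which is consistent with the near-zero timings reported for `quot` in the question.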
In general, I tend to create a function that contains the call or process I wish to benchmark. Note that it is good practice to include steps such as assigning the result to a variable.
e.g.
raw <- function() {x <- apply(mat, 2, mean)}
In this case, it looks like there is a slight improvement from `eval(quote(...))`.
benchmark(raw(), eval(quote(raw())))
test replications elapsed relative user.self sys.self
2 eval(quote(raw())) 100 0.76 1.000 0.75 0.01
1 raw() 100 0.80 1.053 0.80 0.00
But often these small differences are due to function-call overhead and may not reflect how performance scales to larger problems. See the many questions that benchmark `data.table` solutions: using a small number of replications on big data may better reflect performance.
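A minimal sketch of that idea, using base R's `system.time` rather than `benchmark`. Here `time_runs` is a hypothetical helper, and the matrix sizes and replication counts are arbitrary assumptions:

```r
# Hypothetical helper: total elapsed seconds for n runs of a zero-argument function.
time_runs <- function(f, n) {
  system.time(for (i in seq_len(n)) f())[["elapsed"]]
}

small <- matrix(runif(1e3), ncol = 10)   # 100 x 10
big   <- matrix(runif(1e6), ncol = 100)  # 10000 x 100

# Many replications on small data: fixed call overhead is a large share of the total.
t_small <- time_runs(function() { x <- apply(small, 2, mean) }, n = 200)

# Few replications on large data: the actual computation dominates the total.
t_big <- time_runs(function() { x <- apply(big, 2, mean) }, n = 5)

print(c(small = t_small, big = t_big))
```

The absolute numbers depend on the machine, so only the comparison between candidate solutions on the same data is meaningful.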