Why is mean() so slow?

Question

Everything is in the question! While doing a bit of optimization and tracking down bottlenecks, out of curiosity I tried this:

library(microbenchmark)

t1 <- rnorm(10)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  times = 10000)

The result is that mean() is more than 6 times slower than the computation "by hand"!

Does this stem from overhead in the R code of mean() before it calls .Internal(mean(x)), or is the C code itself slower? Why? Is there a good reason for it, and therefore a good use case?

Recommended answer

It is due to the S3 lookup for the method, followed by the argument handling in mean.default (plus the other code in mean).
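To make the dispatch step concrete, here is a small sketch using only base R. It shows that mean is an ordinary R-level function whose body does nothing but dispatch, and that calling mean.default directly skips the lookup while returning the same value. The variable names (x, direct, dispatched) are only for illustration.

```r
# mean() is a closure whose body only dispatches on the class of x
print(body(mean))        # the UseMethod("mean") call
is.primitive(mean)       # FALSE: an R-level function, not a primitive

x <- c(2, 4, 6)

# Calling the method directly skips the S3 lookup, but the argument
# checks inside mean.default() still run
direct <- mean.default(x)
dispatched <- mean(x)
identical(direct, dispatched)
```

Calling mean.default directly only makes sense when x is known to be a plain numeric vector; for classed objects (dates, data frames, and so on) the dispatch is what selects the right method.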

sum and length are both primitive functions, so they will be fast (but how are you handling NA values?).
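On the NA question, a hand-rolled mean needs its own missing-value handling to match mean(x, na.rm = TRUE). A minimal sketch follows; mean_by_hand is a hypothetical helper written for this illustration, not part of base R.

```r
# Both building blocks are primitives, so no S3 lookup is involved
is.primitive(sum)
is.primitive(length)

# Hypothetical helper: mean "by hand" with optional NA removal
mean_by_hand <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values before summing
  sum(x) / length(x)
}

x <- c(1, 2, NA, 4)
mean_by_hand(x)                  # NA, like mean(x)
mean_by_hand(x, na.rm = TRUE)    # agrees with mean(x, na.rm = TRUE)
```

Note that the subsetting step adds its own cost, so the speed advantage over mean() shrinks once NA handling is included.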

t1 <- rnorm(10)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  mean.default(t1),
  .Internal(mean(t1)),
  times = 10000)

Unit: nanoseconds
                expr   min    lq median    uq     max neval
            mean(t1) 10266 10951  11293 11635 1470714 10000
  sum(t1)/length(t1)   684  1027   1369  1711  104367 10000
    mean.default(t1)  2053  2396   2738  2739 1167195 10000
 .Internal(mean(t1))   342   343    685   685   86574 10000

The internal part of mean, .Internal(mean(t1)), is even faster than sum(t1)/length(t1).

See http://rwiki.sciviews.org/doku.php?id=packages:cran:data.table#method_dispatch_takes_time (mirror) for more details (and a data.table solution that avoids .Internal).

Note that if we increase the length of the vector, the primitive approach, sum(t1)/length(t1), becomes the fastest:

t1 <- rnorm(1e7)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  mean.default(t1),
  .Internal(mean(t1)),
  times = 100)

Unit: milliseconds
                expr      min       lq   median       uq      max neval
            mean(t1) 25.79873 26.39242 26.56608 26.85523 33.36137   100
  sum(t1)/length(t1) 15.02399 15.22948 15.31383 15.43239 19.20824   100
    mean.default(t1) 25.69402 26.21466 26.44683 26.84257 33.62896   100
 .Internal(mean(t1)) 25.70497 26.16247 26.39396 26.63982 35.21054   100

Now method dispatch accounts for only a tiny fraction of the overall time required.
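A plausible reason for the remaining gap on long vectors, offered here as an assumption rather than something the answer states, is that R's C code for the mean of doubles makes a second pass over the data to correct rounding error, while sum(t1)/length(t1) makes only one. The sketch below imitates that idea in plain R; two_pass_mean is a hypothetical helper and not R's actual implementation.

```r
# Hypothetical illustration of a two-pass mean
two_pass_mean <- function(x) {
  n <- length(x)
  m <- sum(x) / n        # first pass: plain mean
  m + sum(x - m) / n     # second pass: add back the average residual
}

x <- rnorm(1e5)
all.equal(two_pass_mean(x), mean(x))
```

If that is indeed the cause, the extra time buys some numerical accuracy, which is one reason to prefer mean() unless the call sits in a hot loop over short vectors.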
