根据经验估算大时间效率 [英] Empirically estimating big-oh time efficiency

查看:88
本文介绍了根据经验估算大时间效率的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想通过基准测试估算库中某些方法的性能。我不需要精确度 - 它足以证明某些东西是O(1),O(logn),O(n),O(nlogn),O(n ^ 2)或更糟。因为大哦意味着上限,估计O(logn)的东西是O(log logn)不是问题。

I'd like to estimate the big-oh performance of some methods in a library through benchmarks. I don't need precision -- it suffices to show that something is O(1), O(logn), O(n), O(nlogn), O(n^2) or worse than that. Since big-oh means upper-bound, estimating O(logn) for something that is O(log logn) is not a problem.

现在,我在想找到最适合每个大数据的数据的常数乘数k(但会覆盖所有结果),然后选择最适合的大哦。

Right now, I'm thinking of finding the constant multiplier k that best fits data for each big-oh (but will top all results), and then choosing the big-oh with the best fit.


  1. 有没有更好的方式来做这件事,而不是我的意思?如果是这样,它们是什么?

  2. 否则,是否有人可以指出算法来估算k以获得最佳拟合,并比较每条曲线与数据拟合的程度?



Notes&约束



鉴于到目前为止的评论,我需要做一些清楚的事情:

Notes & Constraints

Given the comments so far, I need to make a few things clear:


  • 这需要自动化。我无法查看数据并进行判断调用。

  • 我将使用多个 n 对方法进行基准测试大小。对于每个 n 的大小,我将使用经过验证的基准框架来提供可靠的统计结果。

  • 我事先知道大多数将要测试的方法。我的主要目的是为它们提供性能回归测试。

  • 代码将用Scala编写,并且可以使用任何免费的Java库。

  • This needs to be automated. I can't "look" at data and make a judgment call.
  • I'm going to benchmark the methods with multiple n sizes. For each size n, I'm going to use a proven benchmark framework that provides reliable statistical results.
  • I actually know beforehand the big-oh of most of the methods that will be tested. My main intention is to provide performance regression testing for them.
  • The code will be written in Scala, and any free Java library can be used.

这是我要测量的那种东西的一个例子。我有一个带有此签名的方法:

Here's one example of the kind of stuff I want to measure. I have a method with this signature:

def apply(n: Int): A

给定 n ,它将返回序列的第n个元素。给定现有实现,该方法可以具有O(1),O(logn)或O(n),并且小的改变可以使其错误地使用次优实现。或者,更容易的是,可以使用取决于的其他方法来使用它的次优版本。

Given an n, it will return the nth element of a sequence. This method can have O(1), O(logn) or O(n) given the existing implementations, and small changes can get it to use a suboptimal implementation by mistake. Or, more easily, could get some other method that depends on it to use a suboptimal version of it.

推荐答案

为了开始,你必须做出几个假设。

In order to get started, you have to make a couple of assumptions.


  1. n 与任何常数项相比都很大。

  2. 您可以有效地随机化输入数据

  3. 您可以以足够的密度进行采样以获得很好地处理运行时的分布

  1. n is large compared to any constant terms.
  2. You can effectively randomize your input data
  3. You can sample with sufficient density to get a good handle on the distribution of runtimes

特别是,(3)很难与(1)一致。所以你可能得到一些具有指数最坏情况的东西,但从未遇到过最坏的情况,因此认为你的算法比它平均好得多。

In particular, (3) is difficult to achieve in concert with (1). So you may get something with an exponential worst case, but never run into that worst case, and thus think your algorithm is much better than it is on average.

说,你需要的是任何标准的曲线拟合库。 Apache Commons Math 有一个完全足够的。然后,您可以创建一个包含要测试的所有常用术语的函数(例如,常量,log n,n,n log n,n n,n n * n,e ^ n),或者你获取你的数据的日志并适合指数,然后如果你得到一个不接近整数的指数,看看抛出一个log n是否更适合。

With that said, all you need is any standard curve fitting library. Apache Commons Math has a fully adequate one. You then either create a function with all the common terms that you want to test (e.g. constant, log n, n, n log n, nn, nn*n, e^n), or you take the log of your data and fit the exponent, and then if you get an exponent not close to an integer, see if throwing in a log n gives a better fit.

(更详细地说,如果你符合 C * x ^ a C a ,或者更容易 log C + a log x ,你可以得到指数 a ;在all-common-term-at-once计划中,你将获得每个术语的权重,所以如果你有 n * n + C * n * log(n)其中 C 很大,你也可以拿到那个词。)

(In more detail, if you fit C*x^a for C and a, or more easily log C + a log x, you can get the exponent a; in the all-common-terms-at-once scheme, you'll get weights for each term, so if you have n*n + C*n*log(n) where C is large, you'll pick up that term also.)

你想要的足够大小改变大小,以便你可以区分不同的情况(如果你关心那些可能很难用日志术语),并且安全地比你有参数更多不同的大小(可能3倍过量会开始没事,只要正如你所做的那样至少运行十几个)。

You'll want to vary the size by enough so that you can tell the different cases apart (might be hard with log terms, if you care about those), and safely more different sizes than you have parameters (probably 3x excess would start being okay, as long as you do at least a dozen or so runs total).

编辑:这是Scala代码,可以为您完成所有这些操作。我不会解释每一件小事,而是留给你调查;它使用C * x ^ a fit实现上面的方案,并返回((a,C),(a的上限为a的上限))。边界是相当保守的,你可以看到几次运行这个东西。 C 的单位是秒( a 是无单位的),但不要相信 很多因为有一些循环开销(以及一些噪音)。

Here is Scala code that does all this for you. Rather than explain each little piece, I'll leave it to you to investigate; it implements the scheme above using the C*x^a fit, and returns ((a,C),(lower bound for a, upper bound for a)). The bounds are quite conservative, as you can see from running the thing a few times. The units of C are seconds (a is unitless), but don't trust that too much as there is some looping overhead (and also some noise).

class TimeLord[A: ClassManifest,B: ClassManifest](setup: Int => A, static: Boolean = true)(run: A => B) {
  @annotation.tailrec final def exceed(time: Double, size: Int, step: Int => Int = _*2, first: Int = 1): (Int,Double) = {
    var i = 0
    val elapsed = 1e-9 * {
      if (static) {
        val a = setup(size)
        var b: B = null.asInstanceOf[B]
        val t0 = System.nanoTime
        var i = 0
        while (i < first) {
          b = run(a)
          i += 1
        }
        System.nanoTime - t0
      }
      else {
        val starts = if (static) { val a = setup(size); Array.fill(first)(a) } else Array.fill(first)(setup(size))
        val answers = new Array[B](first)
        val t0 = System.nanoTime
        var i = 0
        while (i < first) {
          answers(i) = run(starts(i))
          i += 1
        }
        System.nanoTime - t0
      }
    }
    if (time > elapsed) {
      val second = step(first)
      if (second <= first) throw new IllegalArgumentException("Iteration size increase failed: %d to %d".format(first,second))
      else exceed(time, size, step, second)
    }
    else (first, elapsed)
  }

  def multibench(smallest: Int, largest: Int, time: Double, n: Int, m: Int = 1) = {
    if (m < 1 || n < 1 || largest < smallest || (n>1 && largest==smallest)) throw new IllegalArgumentException("Poor choice of sizes")
    val frac = (largest.toDouble)/smallest
    (0 until n).map(x => (smallest*math.pow(frac,x/((n-1).toDouble))).toInt).map{ i => 
      val (k,dt) = exceed(time,i)
      if (m==1) i -> Array(dt/k) else {
        i -> ( (dt/k) +: (1 until m).map(_ => exceed(time,i,first=k)).map{ case (j,dt2) => dt2/j }.toArray )
      }
    }.foldLeft(Vector[(Int,Array[Double])]()){ (acc,x) =>
      if (acc.length==0 || acc.last._1 != x._1) acc :+ x
      else acc.dropRight(1) :+ (x._1, acc.last._2 ++ x._2)
    }
  }

  def alpha(data: Seq[(Int,Array[Double])]) = {
    // Use Theil-Sen estimator for calculation of straight-line fit for exponent
    // Assume timing relationship is t(n) = A*n^alpha
    val dat = data.map{ case (i,ad) => math.log(i) -> ad.map(x => math.log(i) -> math.log(x)) }
    val slopes = (for {
      i <- dat.indices
      j <- ((i+1) until dat.length)
      (pi,px) <- dat(i)._2
      (qi,qx) <- dat(j)._2
    } yield (qx - px)/(qi - pi)).sorted
    val mbest = slopes(slopes.length/2)
    val mp05 = slopes(slopes.length/20)
    val mp95 = slopes(slopes.length-(1+slopes.length/20))
    val intercepts = dat.flatMap{ case (i,a) => a.map{ case (li,lx) => lx - li*mbest } }.sorted
    val bbest = intercepts(intercepts.length/2)
    ((mbest,math.exp(bbest)),(mp05,mp95))
  }
}

请注意 multibench 方法预计需要大约sqrt(2) n m *时间来运行,假设使用静态初始化数据并且与您正在运行的任何内容相比相对便宜。以下是一些选择参数需要运行〜15秒的示例:

Note that the multibench method is expected to take about sqrt(2)nm*time to run, assuming that static initialization data is used and is relatively cheap compared to whatever you're running. Here are some examples with parameters chosen to take ~15s to run:

val tl1 = new TimeLord(x => List.range(0,x))(_.sum)  // Should be linear
// Try list sizes 100 to 10000, with each run taking at least 0.1s;
// use 10 different sizes and 10 repeats of each size
scala> tl1.alpha( tl1.multibench(100,10000,0.1,10,10) )
res0: ((Double, Double), (Double, Double)) = ((1.0075537890632216,7.061397125245351E-9),(0.8763463348353099,1.102663784225697))

val longList = List.range(0,100000)
val tl2 = new TimeLord(x=>x)(longList.apply)    // Again, should be linear
scala> tl2.alpha( tl2.multibench(100,10000,0.1,10,10) )
res1: ((Double, Double), (Double, Double)) = ((1.4534378213477026,1.1325696181862922E-10),(0.969955396265306,1.8294175293676322))

// 1.45?!  That's not linear.  Maybe the short ones are cached?
scala> tl2.alpha( tl2.multibench(9000,90000,0.1,100,1) )
res2: ((Double, Double), (Double, Double)) = ((0.9973235607566956,1.9214696731124573E-9),(0.9486294398193154,1.0365312207345019))

// Let's try some sorting
val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted)
scala> tl3.alpha( tl3.multibench(100,10000,0.1,10,10) )
res3: ((Double, Double), (Double, Double)) = ((1.1713142886974603,3.882658025586512E-8),(1.0521099621639414,1.3392622111121666))
// Note the log(n) term comes out as a fractional power
// (which will decrease as the sizes increase)

// Maybe sort some arrays?
// This may take longer to run because we have to recreate the (mutable) array each time
val tl4 = new TimeLord(x=>Array.fill(x)(util.Random.nextInt), false)(java.util.Arrays.sort)
scala> tl4.alpha( tl4.multibench(100,10000,0.1,10,10) )
res4: ((Double, Double), (Double, Double)) = ((1.1216172965292541,2.2206198821180513E-8),(1.0929414090177318,1.1543697719880128))

// Let's time something slow
def kube(n: Int) = (for (i <- 1 to n; j <- 1 to n; k <- 1 to n) yield 1).sum
val tl5 = new TimeLord(x=>x)(kube)
scala> tl5.alpha( tl5.multibench(10,100,0.1,10,10) )
res5: ((Double, Double), (Double, Double)) = ((2.8456382116915484,1.0433534274508799E-7),(2.6416659356198617,2.999094292838751))
// Okay, we're a little short of 3; there's constant overhead on the small sizes

无论如何,对于规定的用例 - 你要检查的地方确保订单不会改变 - 这可能已经足够了,因为在设置测试时你可以稍微玩一下这些值,以确保它们能给出合理的信息。人们还可以创建寻找稳定性的启发式方法,但这可能有点过分。

Anyway, for the stated use case--where you are checking to make sure the order doesn't change--this is probably adequate, since you can play with the values a bit when setting up the test to make sure they give something sensible. One could also create heuristics that search for stability, but that's probably overkill.

(顺便提一下,这里没有明确的预热步骤; Theil-Sen估算器的稳健拟合应该使它不需要合理的大基准。这也是为什么我不使用任何其他的长凳框架;它的任何统计数据只是失去了这个测试的力量。)

(Incidentally, there is no explicit warmup step here; the robust fitting of the Theil-Sen estimator should make it unnecessary for sensibly large benchmarks. This also is why I don't use any other benching framework; any statistics that it does just loses power from this test.)

再次编辑:如果您使用以下内容替换 alpha 方法:

Edit again: if you replace the alpha method with the following:

  // We'll need this math
  @inline private[this] def sq(x: Double) = x*x
  final private[this] val inv_log_of_2 = 1/math.log(2)
  @inline private[this] def log2(x: Double) = math.log(x)*inv_log_of_2
  import math.{log,exp,pow}

  // All the info you need to calculate a y value, e.g. y = x*m+b
  case class Yp(x: Double, m: Double, b: Double) {}

  // Estimators for data order
  //   fx = transformation to apply to x-data before linear fitting
  //   fy = transformation to apply to y-data before linear fitting
  //   model = given x, slope, and intercept, calculate predicted y
  case class Estimator(fx: Double => Double, invfx: Double=> Double, fy: (Double,Double) => Double, model: Yp => Double) {}
  // C*n^alpha
  val alpha = Estimator(log, exp, (x,y) => log(y), p => p.b*pow(p.x,p.m))
  // C*log(n)*n^alpha
  val logalpha = Estimator(log, exp, (x,y) =>log(y/log2(x)), p => p.b*log2(p.x)*pow(p.x,p.m))

  // Use Theil-Sen estimator for calculation of straight-line fit
  case class Fit(slope: Double, const: Double, bounds: (Double,Double), fracrms: Double) {}
  def theilsen(data: Seq[(Int,Array[Double])], est: Estimator = alpha) = {
    // Use Theil-Sen estimator for calculation of straight-line fit for exponent
    // Assume timing relationship is t(n) = A*n^alpha
    val dat = data.map{ case (i,ad) => ad.map(x => est.fx(i) -> est.fy(i,x)) }
    val slopes = (for {
      i <- dat.indices
      j <- ((i+1) until dat.length)
      (pi,px) <- dat(i)
      (qi,qx) <- dat(j)
    } yield (qx - px)/(qi - pi)).sorted
    val mbest = slopes(slopes.length/2)
    val mp05 = slopes(slopes.length/20)
    val mp95 = slopes(slopes.length-(1+slopes.length/20))
    val intercepts = dat.flatMap{ _.map{ case (li,lx) => lx - li*mbest } }.sorted
    val bbest = est.invfx(intercepts(intercepts.length/2))
    val fracrms = math.sqrt(data.map{ case (x,ys) => ys.map(y => sq(1 - y/est.model(Yp(x,mbest,bbest)))).sum }.sum / data.map(_._2.length).sum)
    Fit(mbest, bbest, (mp05,mp95), fracrms)
  }

然后你也可以在有一个日志条件的时候得到指数的估计值 - 存在错误估计来选择对数是否是正确的方式,但是你可以接听电话(即我我假设你最初会监督它并读取掉的数字:

then you can get an estimate of the exponent when there's a log term also--error estimates exist to pick whether the log term or not is the correct way to go, but it's up to you to make the call (i.e. I'm assuming you'll be supervising this initially and reading the numbers that come off):

val tl3 = new TimeLord(x=>Vector.fill(x)(util.Random.nextInt))(_.sorted)
val timings = tl3.multibench(100,10000,0.1,10,10)

// Regular n^alpha fit
scala> tl3.theilsen( timings )
res20: tl3.Fit = Fit(1.1811648421030059,3.353753446942075E-8,(1.1100382697696545,1.3204652930525234),0.05927994882343982)

// log(n)*n^alpha fit--note first value is closer to an integer
//   and last value (error) is smaller
scala> tl3.theilsen( timings, tl3.logalpha )
res21: tl3.Fit = Fit(1.0369167329732445,9.211366397621766E-9,(0.9722967182484441,1.129869067913768),0.04026308919615681)

(编辑:修正了RMS计算,所以它实际上是平均值,加上证明你只需要做一次计时,然后可以尝试两种拟合。)

( fixed the RMS computation so it's actually the mean, plus demonstrated that you only need to do timings once and can then try both fits.)

这篇关于根据经验估算大时间效率的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆