Algorithm for estimating remaining time for a time-intensive loop with heterogeneous iterations

Problem description

I have a loop of instructions, such as (pseudocode):

for i = 1 to 1000000
    // Process the ith input
    doSomething(input[i])
end

This takes a long time to complete. I would like to output some kind of progress and more importantly remaining time estimate to the user, so that they can decide whether they should just sit there twiddling their thumbs, go get some coffee, go for a walk, or go on a weeklong vacation to Europe while the algorithm crunches its numbers.

To simplify matters, you can assume the number of iterations will be large (for instance, larger than 100, so you can print progress at every percentile).

A common algorithm is to simply measure the time the last iteration took, then multiply that by the number of iterations left and give that as output. This breaks down if each iteration can vary a lot in how long it takes to execute.
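
For reference, a minimal sketch of that naive estimator might look like the following (Python; do_something and inputs are placeholder names standing in for the real work):

import time

def run_with_last_iteration_estimate(inputs, do_something):
    """Naive estimator: time of the last iteration x iterations left."""
    n = len(inputs)
    for i, item in enumerate(inputs):
        start = time.monotonic()
        do_something(item)              # the expensive per-item work
        last = time.monotonic() - start
        remaining = last * (n - i - 1)
        print(f"{i + 1}/{n} done, ~{remaining:.1f} s remaining")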

Another approach is to divide the time passed since the first iteration, by the number of iterations completed, and then multiply that by remaining iterations. This breaks down if durations of iterations are not evenly distributed. For example, if the first few inputs are "difficult" and get easier towards the end of the input array, the algorithm will overestimate remaining time until it is almost finished (at which point it will overestimate very slightly).
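
A sketch of this average-so-far variant, under the same placeholder names as above:

import time

def run_with_average_estimate(inputs, do_something):
    """Estimator: (elapsed time / iterations done) x iterations left."""
    n = len(inputs)
    t0 = time.monotonic()
    for i, item in enumerate(inputs):
        do_something(item)
        done = i + 1
        elapsed = time.monotonic() - t0
        remaining = (elapsed / done) * (n - done)
        print(f"{done}/{n} done, ~{remaining:.1f} s remaining")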

So how can one get a better estimate of remaining time when the time each iteration will take is a non-straightforward, arbitrary function (such that simply analytically deriving and implementing the time-to-complete of each iteration is impractical) of the iteration ordinate?

Two ideas I can imagine might be fruitful avenues of research, but am unable to fully explore myself at this time:

  • Multiplying an exponential average of the time taken by each past iteration by the number of iterations remaining.
  • Keeping track of the time taken to complete each iteration, then fitting a function to those times and extrapolating the remaining time (a sketch of both ideas follows this list).
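
One possible sketch of both ideas in Python. The smoothing factor alpha, the ten-iteration warm-up, and the quadratic fit are arbitrary choices for illustration, not anything prescribed by the question:

import time
import numpy as np

def run_with_combined_estimates(inputs, do_something, alpha=0.1):
    # Idea 1: exponential moving average (EMA) of per-iteration times.
    # Idea 2: fit cumulative time vs. iterations done, extrapolate to n.
    n = len(inputs)
    ema = None
    cumulative = []                    # elapsed time after each completed iteration
    t0 = time.monotonic()
    for i, item in enumerate(inputs):
        start = time.monotonic()
        do_something(item)
        dt = time.monotonic() - start
        ema = dt if ema is None else alpha * dt + (1 - alpha) * ema
        cumulative.append(time.monotonic() - t0)
        done = i + 1
        left = n - done

        est_ema = ema * left                         # idea 1
        if done >= 10:                               # idea 2 needs a few points first
            coeffs = np.polyfit(np.arange(1, done + 1), cumulative, deg=2)
            est_fit = max(np.polyval(coeffs, n) - cumulative[-1], 0.0)
        else:
            est_fit = est_ema
        print(f"{done}/{n}: EMA ~{est_ema:.1f} s, fit ~{est_fit:.1f} s remaining")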

Why computationally intensive solutions (such as fitting equations) are acceptable:

Firstly, for truly large tasks where this discussion is worthwhile, run time may be measured in hours or days. Complicated mathematical operations these days take milliseconds, so the added burden would not be great - in my above example, clearly doSomething takes so long as to dwarf the cost of doing some math, otherwise I would not care so much about precisely estimating remaining time in the first place.

Second, it is possible to, for instance, bin iterations into percentiles. Then, instead of operating on a dataset of "iterations complete vs. time taken" the estimator would operate on a dataset of "percent complete vs. time taken" which has at most 100 data points. This affords further complexity: Say your task takes a day or more to complete. Estimating remaining time only once each percent of it is complete means 100 evaluations of the estimator function. When you already take a day, an extra minute and a half for estimating remaining time isn't a big deal, but that already gives you a 1 second window for fitting equations and what not - 1 second is a lot of time for doing math on a modern system. As such, I welcome computationally intensive solutions.
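
A sketch of that binning idea, again with placeholder names; the quadratic fit is just a cheap stand-in for whatever heavier model would actually be used inside the one-second budget described above:

import time
import numpy as np

def run_with_percentile_estimates(inputs, do_something):
    """Record cumulative time only at each whole percent of progress and
    estimate from that (at most ~100-point) percent-vs-time dataset."""
    n = len(inputs)
    t0 = time.monotonic()
    percents, times = [], []
    next_mark = 1
    for i, item in enumerate(inputs):
        do_something(item)
        percent = 100 * (i + 1) // n
        if percent >= next_mark:
            percents.append(percent)
            times.append(time.monotonic() - t0)
            next_mark = percent + 1
            if len(percents) >= 3:
                coeffs = np.polyfit(percents, times, deg=2)
                remaining = max(np.polyval(coeffs, 100) - times[-1], 0.0)
                print(f"{percent}% done, ~{remaining:.0f} s remaining")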

tl;dr: How to over-engineer an accurate remaining time estimator function for very lengthy tasks.

Recommended answer

In addition to Penguino's algorithm: instead of fitting n and f(n) you might want to fit log(n) and log(f(n)). As long as your complexity is polynomial this will work.
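
A sketch of that suggestion, assuming f(n) is interpreted as the time taken by the n-th iteration: fitting a straight line through log(n) vs. log(f(n)) amounts to assuming f(n) is roughly c * n**k, and the fitted curve can then be summed over the iterations that remain. The function and argument names are made up for illustration:

import numpy as np

def estimate_remaining_loglog(iteration_times, total_iterations):
    """Fit log(f(n)) = k*log(n) + log(c), then extrapolate the per-iteration
    cost to the iterations not yet done and sum it."""
    done = len(iteration_times)
    n = np.arange(1, done + 1)
    k, log_c = np.polyfit(np.log(n), np.log(iteration_times), deg=1)
    remaining_n = np.arange(done + 1, total_iterations + 1)
    return float(np.sum(np.exp(log_c) * remaining_n ** k))

For example, estimate_remaining_loglog(durations_so_far, 1000000) would return an estimate of the remaining seconds, provided durations_so_far holds the (strictly positive) measured duration of every completed iteration.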
