Smart progress bar ETA computation


Problem description

Many applications show a progress bar for a file download, a compression task, a search, and so on. We use progress bars to let users know something is happening. And if we know some details, such as how much work has been done and how much is left to do, we can even give a time estimate, often by extrapolating from how long it has taken to reach the current progress level.

But we've also seen programs whose time-left "ETA" display is just comically bad. It claims a file copy will be done in 20 seconds, then one second later it says it's going to take 4 days, then it flickers again to 20 minutes. It's not only unhelpful, it's confusing! The reason the ETA varies so much is that the progress rate itself can vary and the programmer's math can be overly sensitive.

Apple sidesteps this by avoiding any accurate prediction and giving only vague estimates!

That's annoying too: do I have time for a quick break, or is my task going to be done in 2 more seconds? If the prediction is too fuzzy, it's pointless to make any prediction at all.

Easy but wrong methods

As a first-pass ETA computation, probably we all write the same function: if p is the fraction of the work already done and t is the time taken so far, we output t*(1-p)/p as the estimate of how long it will take to finish. This simple ratio works "OK", but it is also terrible, especially near the end of the computation. Suppose a slow download keeps a copy crawling along overnight, and finally in the morning something kicks in and the copy starts running at full speed, 100X faster. Your ETA at 90% done may say "1 hour", and 10 seconds later you're at 95% and the ETA says "30 minutes", which is clearly an embarrassingly poor guess. In this case "10 seconds" is a much, much, much better estimate.
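The overnight example can be reproduced with a few lines of Python. This is only a sketch of the ratio described above; the function name `naive_eta` and the concrete figure of 9 hours elapsed (the value that makes the 90% estimate come out at one hour) are assumptions chosen for illustration.

```python
def naive_eta(fraction_done: float, elapsed_seconds: float) -> float:
    """Remaining seconds estimated as t * (1 - p) / p."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_seconds * (1.0 - fraction_done) / fraction_done


# Assumed numbers: 9 hours of slow progress got us to 90%.
nine_hours = 9 * 3600
eta_at_90 = naive_eta(0.90, nine_hours)        # about 3600 s, i.e. "1 hour"

# Ten seconds later the copy has sped up and reached 95%.
eta_at_95 = naive_eta(0.95, nine_hours + 10)   # about 1706 s, i.e. roughly "30 minutes"

print(f"ETA at 90%: {eta_at_90 / 60:.1f} min")
print(f"ETA at 95%: {eta_at_95 / 60:.1f} min")
```

At the new speed (5% of the work per 10 seconds) the remaining 5% needs about 10 seconds, yet the whole-history average still reports close to half an hour.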

When this happens you may think of changing the computation to use recent speed, not average speed, to estimate the ETA. You take the average download rate or completion rate over the last 10 seconds and use that rate to project how long completion will take. That performs quite well in the previous overnight-download-which-sped-up-at-the-end example, since it gives very good completion estimates near the end. But this still has big problems: it causes your ETA to bounce wildly when your rate varies quickly over a short period of time, and you get the "done in 20 seconds, done in 2 hours, done in 2 seconds, done in 30 minutes" rapid display of programming shame.
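A recent-speed estimator of this kind might look like the sketch below. The class name `RecentRateEta`, the sample format (timestamp, fraction done), and the choice to return `None` when no recent progress is measurable are all assumptions made for the example.

```python
from collections import deque


class RecentRateEta:
    """ETA from the progress rate over roughly the last `window_seconds`."""

    def __init__(self, window_seconds: float = 10.0):
        self.window_seconds = window_seconds
        self.samples = deque()  # (timestamp, fraction_done), oldest first

    def update(self, timestamp: float, fraction_done: float) -> None:
        self.samples.append((timestamp, fraction_done))
        # Drop the oldest sample while the next one still spans the full window,
        # always keeping at least two points to measure a rate.
        while (len(self.samples) > 2
               and timestamp - self.samples[1][0] >= self.window_seconds):
            self.samples.popleft()

    def eta(self):
        if len(self.samples) < 2:
            return None
        (t0, p0), (t1, p1) = self.samples[0], self.samples[-1]
        if t1 <= t0 or p1 <= p0:
            return None  # no measurable recent progress
        rate = (p1 - p0) / (t1 - t0)  # fraction of the work per second
        return (1.0 - p1) / rate


estimator = RecentRateEta(window_seconds=10.0)
for t, p in [(0, 0.000), (10, 0.001), (20, 0.002), (30, 0.102)]:
    estimator.update(t, p)
recent_eta = estimator.eta()  # based only on the fast last 10 seconds
```

Because only the last window counts, one stalled or unusually fast window changes the estimate completely, which is exactly the bouncing described above.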

The actual question:

What is the best way to compute an estimated time of completion of a task, given the time history of the computation? I am not looking for links to GUI toolkits or Qt libraries. I'm asking about the algorithm that generates the most sane and accurate completion time estimates.

Have you had success with math formulas? Some kind of averaging, maybe the mean of the rate over 10 seconds, the rate over 1 minute, and the rate over 1 hour? Some kind of artificial filtering like "if my new estimate varies too much from the previous estimate, tone it down, don't let it bounce too much"? Some kind of fancy history analysis where you integrate progress versus time to find the standard deviation of the rate and give statistical error metrics on completion?
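To make the "tone it down" idea concrete, here is one possible reading of it: exponential smoothing of the displayed ETA. This is a swapped-in illustration, not something the question proposes in detail; the class name `DampedEta` and the smoothing factor `alpha` are assumptions.

```python
class DampedEta:
    """Move the displayed ETA only part of the way toward each new raw estimate."""

    def __init__(self, alpha: float = 0.2):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha      # 1.0 means no damping at all
        self.shown = None       # ETA currently displayed, in seconds
        self.last_time = None   # time of the previous update

    def update(self, now: float, raw_eta: float) -> float:
        if self.shown is None:
            self.shown = raw_eta
        else:
            # If nothing changed, the displayed ETA would simply have counted down.
            expected = max(0.0, self.shown - (now - self.last_time))
            self.shown = expected + self.alpha * (raw_eta - expected)
        self.last_time = now
        return self.shown


damper = DampedEta(alpha=0.5)
first = damper.update(now=0.0, raw_eta=100.0)
second = damper.update(now=10.0, raw_eta=10.0)   # raw estimate suddenly drops
third = damper.update(now=20.0, raw_eta=40.0)
```

A small `alpha` gives a calm display that reacts slowly to genuine speed changes; a large one reacts quickly but bounces. The filter only hides the variation, it does not say how uncertain the estimate is.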

What have you tried, and what works best?

Recommended answer

The company that created this site apparently makes a scheduling system that answers this question in the context of employees writing code. It works by Monte Carlo simulation of the future based on the past.

This is how the algorithm would work in your situation:

You model your task as a sequence of microtasks, say 1000 of them. Suppose that an hour later you have completed 100 of them. Now you run the simulation for the remaining 900 steps by randomly selecting 90 completed microtasks, adding up their times, and multiplying by 10. That gives you one estimate; repeat N times and you have N estimates of the time remaining. Note that the average of these estimates will be about 9 hours, no surprises there. But by presenting the resulting distribution to the user you honestly communicate the odds, e.g. "with probability 90% this will take another 3-15 hours".
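The simulation step can be sketched in Python as follows. The function names, the percentile-based interval, and the `seed` parameter are assumptions for illustration; the sample size of 90 and the scaling by 900/90 = 10 follow the description above.

```python
import random


def simulate_remaining(completed_times, remaining_steps, sample_size,
                       trials=1000, seed=None):
    """Return `trials` sorted estimates of the remaining time.

    Each estimate sums `sample_size` randomly chosen completed microtask
    times and scales the sum up to `remaining_steps` microtasks.
    """
    if not 0 < sample_size <= len(completed_times):
        raise ValueError("sample_size must be between 1 and len(completed_times)")
    rng = random.Random(seed)
    scale = remaining_steps / sample_size
    estimates = [sum(rng.sample(completed_times, sample_size)) * scale
                 for _ in range(trials)]
    estimates.sort()
    return estimates


def central_interval(sorted_estimates, coverage=0.90):
    """Bounds that contain roughly `coverage` of the simulated estimates."""
    cut = int(len(sorted_estimates) * (1.0 - coverage) / 2.0)
    return sorted_estimates[cut], sorted_estimates[len(sorted_estimates) - 1 - cut]


# Made-up history: 100 completed microtasks taking 1 to 100 seconds each.
history = [float(seconds) for seconds in range(1, 101)]
estimates = simulate_remaining(history, remaining_steps=900, sample_size=90,
                               trials=200, seed=1)
low, high = central_interval(estimates, coverage=0.90)
```

The pair `(low, high)` is what would be shown to the user as "with probability 90% this will take between `low` and `high` more seconds".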

This algorithm, by definition, produces a complete result if the task in question can be modeled as a bunch of independent, random microtasks. You can get a better answer only if you know how the task deviates from this model: for example, installers typically have a download/unpack/install task list, and the speed of one phase cannot predict the speed of another.

I'm not a statistics guru, but I think that if you look more closely at the simulation in this method, it will always return an approximately normal distribution, because each estimate is a sum of a large number of independent random variables. Therefore, you don't need to run it at all. In fact, you don't even need to store all the completed times, since you only need their sum and the sum of their squares.

In maybe not very standard notation,

n = elapsedSteps           // 100 in the example above
// standard deviation of the sum of the n completed step times
sigma = sqrt ( sum_of_times_squared - sum_of_times^2 / n )
scaling = 900/100          // that is (totalSteps - elapsedSteps) / elapsedSteps
lowerBound = sum_of_times*scaling - 3*sigma*sqrt(scaling)
upperBound = sum_of_times*scaling + 3*sigma*sqrt(scaling)

With this, you can output a message saying that the task will end between [lowerBound, upperBound] from now with some fixed probability. With the factor of 3 used above, that is about 99.7% if the normal approximation holds; a factor of 2 would give about 95%. I may still have missed some constant factor.
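The same bounds can be kept up to date incrementally, storing only the two running sums. The sketch below is one reading of the notation above, not a definitive implementation: the class name `RunningEtaBounds` is an assumption, the per-step variance is computed in its population form, and only the spread of the remaining steps is accounted for, not the uncertainty of the estimated mean.

```python
import math


class RunningEtaBounds:
    """ETA interval from the running sum and sum of squares of step times."""

    def __init__(self, total_steps: int):
        self.total_steps = total_steps
        self.done = 0
        self.sum_t = 0.0    # sum of completed step times
        self.sum_t2 = 0.0   # sum of squared completed step times

    def record(self, step_seconds: float) -> None:
        self.done += 1
        self.sum_t += step_seconds
        self.sum_t2 += step_seconds ** 2

    def bounds(self, z: float = 3.0):
        """Return (lower, upper) remaining seconds, or None before any step is done."""
        if self.done == 0:
            return None
        remaining = self.total_steps - self.done
        scaling = remaining / self.done
        mean = self.sum_t / self.done
        # Per-step variance, clamped at zero against rounding error.
        var_step = max(0.0, self.sum_t2 / self.done - mean ** 2)
        centre = self.sum_t * scaling
        # Standard deviation of a sum of `remaining` independent steps.
        spread = z * math.sqrt(var_step * remaining)
        return max(0.0, centre - spread), centre + spread


tracker = RunningEtaBounds(total_steps=1000)
for i in range(100):
    tracker.record(30.0 if i % 2 == 0 else 42.0)  # made-up step times, mean 36 s
lower, upper = tracker.bounds(z=3.0)
```

With these made-up step times the centre of the interval is 9 hours (32400 seconds) and the half-width is 540 seconds; passing `z=2.0` gives the narrower interval of roughly 95% coverage.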
