Is there an easy way in Python to extrapolate data points into the future?

Question

I have a simple numpy array with one data point for every date. Something like this:

>>> import numpy as np
>>> from datetime import date
>>> x = np.array( [(date(2008,3,5), 4800 ), (date(2008,3,15), 4000 ), (date(2008,3,20), 3500 ), (date(2008,4,5), 3000 ) ] )

Is there an easy way to extrapolate the data points into the future: date(2008,5,1), date(2008,5,20), etc.? I understand it can be done with mathematical algorithms, but here I am looking for low-hanging fruit. Actually, I like what numpy.linalg.solve does, but it does not look applicable to extrapolation. Maybe I am absolutely wrong.

To be more specific, I am building a burn-down chart (an XP term) with x = date and y = volume of work to be done. I have the sprints that are already done, and I want to visualise how the future sprints will go if the current situation persists. Finally, I want to predict the release date. The volume of work to be done always goes down on a burn-down chart, so I also want the extrapolated release date: the date when the volume becomes zero.

This is all for showing the dev team how things are going. Precision is not so important here :) The motivation of the dev team is the main factor, so I am absolutely fine with a very approximate extrapolation technique.

Recommended answer

It's all too easy for extrapolation to generate garbage; try this. Many different extrapolations are of course possible; some produce obvious garbage, some non-obvious garbage, and many are ill-defined.

""" extrapolate y,m,d data with scipy UnivariateSpline """
from datetime import date

import matplotlib.pyplot as plt  # explicit import instead of "from pylab import *"
import numpy as np
from scipy.interpolate import UnivariateSpline
# pydoc scipy.interpolate.UnivariateSpline -- fitpack, unclear

__version__ = "denis 23oct"


def daynumber(y, m, d):
    """ 2005,1,1 -> 0  2006,1,1 -> 365 ... """
    return date(y, m, d).toordinal() - date(2005, 1, 1).toordinal()

# known data: (day number, value) pairs, transposed into two arrays
days, values = np.array([
    (daynumber(2005, 1, 1), 1.2),
    (daynumber(2005, 4, 1), 1.8),
    (daynumber(2005, 9, 1), 5.3),
    (daynumber(2005, 10, 1), 5.3),
    ]).T
# first of every month in 2005 and 2006; 2006 lies beyond the data
dayswanted = np.array([daynumber(year, month, 1)
        for year in range(2005, 2006 + 1)
        for month in range(1, 12 + 1)])

np.set_printoptions(precision=1)  # .1f
print("days:", days)
print("values:", values)
print("dayswanted:", dayswanted)

plt.title("extrapolation with scipy.interpolate.UnivariateSpline")
plt.plot(days, values, "o")
for k in (1, 2, 3):  # line parabola cubicspline
    extrapolator = UnivariateSpline(days, values, k=k)
    y = extrapolator(dayswanted)  # evaluates outside the data range too
    label = "k=%d" % k
    print(label, y)
    plt.plot(dayswanted, y, label=label)

plt.legend(loc="lower left")
plt.grid(True)
plt.savefig("extrapolate-UnivariateSpline.png", dpi=50)
plt.show()

Added: a Scipy ticket says, "The behavior of the FITPACK classes in scipy.interpolate is much more complex than the docs would lead one to believe" -- imho true of other software docs too.
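
For the burn-down case in the question, a plain straight-line fit (the same idea as k=1 above) may be all that is needed to get a rough release date. The following is only a minimal sketch under stated assumptions: it uses numpy.polyfit rather than UnivariateSpline, it takes the four data points from the question, and the helper name estimate_release_date is hypothetical, not part of any library.

```python
from datetime import date

import numpy as np

# Data points from the question: (date, remaining work)
points = [
    (date(2008, 3, 5), 4800),
    (date(2008, 3, 15), 4000),
    (date(2008, 3, 20), 3500),
    (date(2008, 4, 5), 3000),
]


def estimate_release_date(points):
    """Hypothetical helper: least-squares line through the points.

    Returns (slope per day, first date on which the line is at or below zero).
    """
    days = np.array([d.toordinal() for d, _ in points], dtype=float)
    work = np.array([w for _, w in points], dtype=float)
    # Shift the day numbers so the fit works with small x values.
    day0 = days[0]
    slope, intercept = np.polyfit(days - day0, work, 1)
    if slope >= 0:
        # Assumption: a burn-down must decrease, otherwise there is no zero crossing ahead.
        raise ValueError("remaining work is not decreasing; no release date")
    zero_day = day0 - intercept / slope  # where slope * x + intercept == 0
    return slope, date.fromordinal(int(np.ceil(zero_day)))


slope, release = estimate_release_date(points)
print("work burned per day:", round(-slope, 1))
print("estimated release date:", release)
```

The line can also be evaluated at any future date, such as date(2008,5,1), by computing slope * (d.toordinal() - day0) + intercept. Like any extrapolation, the estimate only holds if the current pace persists.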
