Getting a simple predict from OLS: something different from .6 to .8 of StatsModels
Question
Sorry for cross-posting this, but I can't get past it: I cannot get output from the predict function.
I have an OLS model that used to work with statsmodels 0.6 and no longer works in 0.8. Pandas also went from 0.19.2 to 0.20.3, so could that be the issue?
I just don't understand what I need to feed to the predict method. My model creation looks like this:
    import statsmodels.api as sm

    def fit_line2(x, y):
        """Fit y on x by OLS with an intercept and return the fitted results."""
        # Add a column of ones to allow the calculation of the intercept
        X = sm.add_constant(x, prepend=True)
        ols_test = sm.OLS(y, X, missing='drop').fit()
        return ols_test
That works fine: I get a model out and can see the summary. To get the prediction one period ahead, I used to pass my latest value (the one I want to project forward from) to predict, which worked in statsmodels 0.6. The predict is called as follows:
yrahead=ols_test.predict(ols_input)
ols_input is created from a pandas DataFrame:
ols_input=(sm.add_constant(merged2.lastqu[-1:], prepend=True))
lastqu
2018-12-31 13209.0
type:
<class 'pandas.core.frame.DataFrame'>
The predict is then called as:
yrahead=ols_test.predict(ols_input)
This gives me an error: ValueError: shapes (1,1) and (2,) not aligned: 1 (dim 1) != 2 (dim 0)
I tried feeding in just the number by changing ols_input to:
13209.0
Type:
<class 'numpy.float64'>
That gave me a similar error: ValueError: shapes (1,1) and (2,) not aligned: 1 (dim 1) != 2 (dim 0)
I'm not sure where to go from here.
The base DataFrame (merged2) used above looks like this. The lastqu column in the last row contains the value I want to predict Units for:
                   Units   lastqu  Uperchg  lqperchg
2000-12-31  19391.000000      NaN      NaN       NaN
2001-12-31  35068.000000   5925.0    80.85       NaN
2002-12-31  39279.000000   8063.0    12.01     36.08
2003-12-31  47517.000000   9473.0    20.97     17.49
2004-12-31  51439.000000  11226.0     8.25     18.51
2005-12-31  59674.000000  11667.0    16.01      3.93
2006-12-31  58664.000000  14016.0    -1.69     20.13
2007-12-31  55698.000000  13186.0    -5.06     -5.92
2008-12-31  42235.000000  11343.0   -24.17    -13.98
2009-12-31  40478.333333   7867.0    -4.16    -30.64
2010-12-31  38721.666667   8114.0    -4.34      3.14
2011-12-31  36965.000000   8361.0    -4.54      3.04
2012-12-31  39132.000000   8608.0     5.86      2.95
2013-12-31  43160.000000   9016.0    10.29      4.74
2014-12-31  44520.000000   9785.0     3.15      8.53
2015-12-31  49966.000000  10351.0    12.23      5.78
2016-12-31  53752.000000  10884.0     7.58      5.15
2017-12-31  57571.000000  12109.0     7.10     11.26
2018-12-31           NaN  13209.0      NaN      9.08
So I'm fitting the OLS of Units against lastqu in order to project Units for 2018.
I freely confess to not really understanding why statsmodels 0.6 worked the way it did, but it did!
Recommended answer
After some discussion with the library author of Statsmodels, it seems there is a bug. See the discussion here: https://groups.google.com/d/topic/pystatsmodels/a0XsXIiP5ro/discussion
Note that my final solution for my specific issue was:
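    # 1.0 is the intercept term; .iloc[-1] takes the latest lastqu value as a
    # scalar, so the array is flat (recent NumPy rejects the nested form)
    ols_input = np.array([1.0, merged2.lastqu.iloc[-1]])
    yrahead = ols_test.predict(ols_input)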
This yields the Units for the next period.