Predicting values using an OLS model with statsmodels

Question

I calculated a model using OLS (multiple linear regression). I split my data into training and test sets (half each), and I would like to predict the labels for the second half.

model = OLS(labels[:half], data[:half])
predictions = model.predict(data[half:])

The problem is that I get an error:

File "/usr/local/lib/python2.7/dist-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/regression/linear_model.py", line 281, in predict
    return np.dot(exog, params)
ValueError: matrices are not aligned

I have the following array shapes:

data.shape: (426, 215)
labels.shape: (426,)

If I transpose the input to model.predict, I do get a result, but with a shape of (426,213), so I suppose it's wrong as well (I expect one vector of 213 numbers as label predictions):

model.predict(data[half:].T)

Any idea how to get it to work?

Answer

For statsmodels >=0.4, if I remember correctly

model.predict doesn't know about the estimated parameters and requires them in the call; see http://statsmodels.sourceforge.net/stable/generated/statsmodels.regression.linear_model.OLS.predict.html
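To see why the call in the question fails, here is a NumPy-only sketch. It assumes that predict reduces to np.dot(exog, params), as the traceback shows, and that exog falls back to the training data when it is not passed. The array contents are random placeholders with the shapes from the question, and np.linalg.lstsq merely stands in for the fitting step.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(426, 215))   # placeholder values, same shape as in the question
labels = rng.normal(size=426)
half = len(data) // 2                # 213

train_exog = data[:half]             # shape (213, 215)

# model.predict(data[half:]) puts the test data where params is expected,
# so the product becomes (213, 215) x (213, 215), which cannot be multiplied.
try:
    np.dot(train_exog, data[half:])
    mismatch = False
except ValueError:
    mismatch = True

# A parameter vector has one entry per column of the design matrix.
params, *_ = np.linalg.lstsq(train_exog, labels[:half], rcond=None)
predictions = np.dot(data[half:], params)
print(mismatch, params.shape, predictions.shape)
```

With a parameter vector of length 215, the product against the 213 held-out rows gives the one-dimensional vector of 213 predictions the question expects.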

What should work in your case is to fit the model and then use the predict method of the results instance.

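from statsmodels.api import OLS

model = OLS(labels[:half], data[:half])
results = model.fit()  # fit() returns a results instance that stores the estimated parameters
predictions = results.predict(data[half:])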

or shorter:

results = OLS(labels[:half], data[:half]).fit()
predictions = results.predict(data[half:])

http://statsmodels.sourceforge.net/stable/generated/statsmodels.regression.linear_model.RegressionResults.predict.html is missing the docstring.

Note: this has been changed in the development version (in a backwards-compatible way), which can take advantage of "formula" information in predict; see http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.RegressionResults.predict.html
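As a rough illustration of formula-aware prediction, the sketch below uses the formula interface in statsmodels.formula.api with a pandas DataFrame. The column names x1, x2, and y and the synthetic data are hypothetical, and this assumes a statsmodels version that ships the formula interface.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 * df["x1"] - 1.0 * df["x2"] + rng.normal(scale=0.1, size=100)
half = len(df) // 2
train, test = df.iloc[:half], df.iloc[half:]

# The formula adds an intercept and records how to build the design matrix.
results = smf.ols("y ~ x1 + x2", data=train).fit()

# predict rebuilds the design matrix from the raw columns of the new data.
predictions = results.predict(test[["x1", "x2"]])
print(len(predictions), list(results.params.index))
```

Here the new data only needs the explanatory columns named in the formula; the intercept column is constructed automatically.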
