Getting the regression line to plot from a Pandas regression

Question

I have tried both pd.ols (pandas) and sm.OLS (statsmodels) to get a regression scatter plot with the regression line. I can get the scatter plot, but I can't seem to get the parameters needed to plot the regression line. It is probably obvious that I am doing some cut-and-paste coding here :-( (using this as a guide: http://nbviewer.ipython.org/github/weecology/progbio/blob/master/ipynbs/statistics.ipynb).

My data is in a pandas DataFrame; the x column is merged2[:-1].lastqu and the y column is merged2[:-1].Units. My code to get the regression is now as follows:

import statsmodels.api as sm

def fit_line2(x, y):
    """Fit y on x by OLS and return the fitted model (params hold intercept and slope)."""
    X = sm.add_constant(x, prepend=True)  # add a column of ones so the intercept is estimated
    model = sm.OLS(y, X, missing='drop').fit()  # rows with missing values are dropped
    return model

model = fit_line2(merged2[:-1].lastqu, merged2[:-1].Units)
print(model.summary())

^^^^ This seems to be OK.

intercept, slope = model.params  # << I don't think this is quite right
plt.plot(merged2[:-1].lastqu, merged2[:-1].Units, 'bo')
plt.hold(True)  # only valid on older Matplotlib; plt.hold was removed in 3.0

^^^^^ This gets the scatter plot done, but the code below does not give me a regression line:

x = np.array([min(merged2[:-1].lastqu), max(merged2[:-1].lastqu)])
y = intercept + slope * x
plt.plot(x, y, 'r-')
plt.show()

A snippet of the DataFrame (the [:-1] drops the current period from the data, which will subsequently be a projection):

            Units       lastqu   Uperchg  lqperchg         fcast  errpercent  nfcast
date
2000-12-31   7177          NaN       NaN       NaN           NaN         NaN     NaN
2001-12-31  10694  2195.000000  0.490038       NaN  10658.719019    1.003310     NaN
2002-12-31  11725  2469.000000

I found that I can do:

fig = plt.figure(figsize=(12,8))
fig = sm.graphics.plot_regress_exog(model, "lastqu", fig=fig)

as described in the Statsmodels documentation, which seems to get the main thing I wanted (and more). I'd still like to know where I went wrong in the prior code!

Recommended answer

Check what values you have in your arrays and variables.
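One way to run that check, as a minimal sketch: the small `lastqu` Series here is hypothetical stand-in data that mimics the leading NaN in the question's DataFrame, not the asker's actual values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for merged2[:-1].lastqu; the first row is NaN,
# as in the DataFrame snippet from the question.
lastqu = pd.Series([np.nan, 2195.0, 2469.0], name="lastqu")

# Build the two x endpoints exactly as the question does.
x = np.array([min(lastqu), max(lastqu)])
print(x)

# Count the missing values in the column and test the endpoints for NaN.
n_missing = int(lastqu.isna().sum())
endpoints_are_nan = bool(np.isnan(x).all())
print(n_missing, endpoints_are_nan)
```

If the endpoints print as NaN, Matplotlib has nothing to draw for the line, which would match the symptom of a scatter plot with no regression line.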

My guess is that your x is just NaN, because you use Python's built-in min and max. At least that happens with the version of pandas that I currently have open.

The min and max methods of the Series should work, since they know how to handle NaNs or missing values:

>>> x = pd.Series([np.nan,2], index=['const','slope'])
>>> x
const   NaN
slope     2
dtype: float64

>>> min(x)
nan
>>> max(x)
nan

>>> x.min()
2.0
>>> x.max()
2.0
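Applied to the plotting code in the question, this suggests taking the line's endpoints with the Series methods rather than the built-ins. The following is only a sketch under assumptions: the data are made up, and `np.polyfit` on the complete rows stands in for the statsmodels fit so that the example needs only NumPy and pandas. With the asker's fitted model, `intercept` and `slope` would come from `model.params` instead.

```python
import numpy as np
import pandas as pd

# Made-up data with a leading NaN in the x column.
df = pd.DataFrame({
    "lastqu": [np.nan, 1.0, 2.0, 3.0, 4.0],
    "Units": [5.0, 3.0, 5.0, 7.0, 9.0],
})

# Fit on complete rows only (the role missing='drop' plays in sm.OLS).
clean = df.dropna(subset=["lastqu", "Units"])
slope, intercept = np.polyfit(clean["lastqu"], clean["Units"], deg=1)

# NaN-aware endpoints: Series.min() and Series.max() skip missing values by default.
x = np.array([df["lastqu"].min(), df["lastqu"].max()])
y = intercept + slope * x
print(x, y)

# With Matplotlib available, the points and the line could then be drawn with:
# plt.plot(df["lastqu"], df["Units"], "bo")
# plt.plot(x, y, "r-")
```

Only the two endpoints are needed because a straight line is fully determined by them; using the NaN-skipping methods keeps both endpoints finite even when the column contains missing values.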
