如何在数据帧中添加预测值? [英] How to add predicted values in a dataframe?

查看:0
本文介绍了如何在数据帧中添加预测值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我将预测扩展到五个值from this link。现在,我想要添加新的五个预测值(New_Interest_Rate和New_Unlosives_Rate),这样我就可以将它们与原始时间序列一起绘制成一个新的数字。

import pandas as pd
from sklearn import linear_model
import statsmodels.api as sm

Stock_Market = {'Year': [2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016],
                'Month': [12, 11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
                'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
                'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
                'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719]        
                }

df = pd.DataFrame(Stock_Market,columns=['Year','Month','Interest_Rate','Unemployment_Rate','Stock_Index_Price'])

X = df[['Interest_Rate','Unemployment_Rate']] # here we have 2 variables for multiple regression. If you just want to use one variable for simple linear regression, then use X = df['Interest_Rate'] for example.Alternatively, you may add additional variables within the brackets
Y = df['Stock_Index_Price']
 
# with sklearn
regr = linear_model.LinearRegression()
regr.fit(X, Y)

print('Intercept: 
', regr.intercept_)
print('Coefficients: 
', regr.coef_)

# prediction with sklearn
New_Interest_Rate = [2.75, 3, 4, 1, 2]
New_Unemployment_Rate = [5.3, 4, 3, 2, 1]
for i in range(len(New_Interest_Rate)):
    print (str(i+1) + ' - Predicted Stock Index Price: 
', 
           regr.predict([[New_Interest_Rate[i] ,New_Unemployment_Rate[i]]]))

# with statsmodels
X = sm.add_constant(X) # adding a constant

model = sm.OLS(Y, X).fit()
predictions = model.predict(X) 
 
print_model = model.summary()
print(print_model)

我不知道如何追加它,因为当我尝试时,出现错误。

Interest_Rate=Interest_Rate.append(New_Interest_Rate)

TypeError: cannot concatenate object of type "<class 'float'>"; only pd.Series, pd.DataFrame, and pd.Panel (deprecated) objs are valid

我的目标是绘制扩展预测值。我用的是Jupyter笔记本。原始代码来自link。谢谢!

推荐答案

运行您提供的代码似乎可以在我的计算机上运行,但有一些警告消息。我使用的版本是python3.9.7、pandas 1.3.3-1、sklearn-pandas 2.2.0-1和statsModels 0.13.0。我只是将其保存到一个文件中,并使用";python Copypastedcode.py";在终端中运行它。我得到以下输出:

Intercept:
 1798.4039776258544
Coefficients:
 [ 345.54008701 -250.14657137]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
  warnings.warn(
1 - Predicted Stock Index Price:
 [1422.86238865]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
  warnings.warn(
2 - Predicted Stock Index Price:
 [1834.43795318]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
  warnings.warn(
3 - Predicted Stock Index Price:
 [2430.12461156]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
  warnings.warn(
4 - Predicted Stock Index Price:
 [1643.6509219]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
  warnings.warn(
5 - Predicted Stock Index Price:
 [2239.33758028]
                            OLS Regression Results
==============================================================================
Dep. Variable:      Stock_Index_Price   R-squared:                       0.898
Model:                            OLS   Adj. R-squared:                  0.888
Method:                 Least Squares   F-statistic:                     92.07
Date:                Wed, 20 Oct 2021   Prob (F-statistic):           4.04e-11
Time:                        09:07:19   Log-Likelihood:                -134.61
No. Observations:                  24   AIC:                             275.2
Df Residuals:                      21   BIC:                             278.8
Df Model:                           2
Covariance Type:            nonrobust
=====================================================================================
                        coef    std err          t      P>|t|      [0.025      0.975]
-------------------------------------------------------------------------------------
const              1798.4040    899.248      2.000      0.059     -71.685    3668.493
Interest_Rate       345.5401    111.367      3.103      0.005     113.940     577.140
Unemployment_Rate  -250.1466    117.950     -2.121      0.046    -495.437      -4.856
==============================================================================
Omnibus:                        2.691   Durbin-Watson:                   0.530
Prob(Omnibus):                  0.260   Jarque-Bera (JB):                1.551
Skew:                          -0.612   Prob(JB):                        0.461
Kurtosis:                       3.226   Cond. No.                         394.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

X没有有效的功能名称...通过更改

可以修复警告
regr.fit(X,Y)

regr.fit(X.values, Y.values) 

如果要使用New_Interest_Rate和New_Unlosios_Rate创建回归,则需要Y再有5个对应的股票价格。如果你试图从利率和失业率来预测股价,我认为这不是你想要做的。以下是您将如何做到这一点:

New_Interest_Rate = [2.75, 3, 4, 1, 2]
New_Unemployment_Rate = [5.3, 4, 3, 2, 1]
New_Stock_Prices = [1,2,3,4,5]
X_new = pd.DataFrame(data={'Interest_Rate': New_Interest_Rate,'Unemployment_Rate': New_Unemployment_Rate})
Y_new = pd.DataFrame(data={'Stock_Index_Price': New_Stock_Prices})
regr = linear_model.LinearRegression()
X = X.append(X_df)
Y = Y.append(Y_df)
regr.fit(X.values, Y.values)

如果您想绘制曲线图,您可以创建一个小函数来从输入数组中获取股票预测,如下所示:

def predict_stock_price(future_interest_rate, future_unemployment_rate):
    return [regr.predict([[i ,j]])[0,0] for i,j in zip(future_interest_rate,future_unemployment_rate)]

prices = predict_stock_price(New_Interest_Rate,New_Unemployment_Rate)
print("list of predicted stock prices:",prices)

predicted_stock_market = {'Month': range(13,13+len(prices)), #just to have a time axis to plot with
                         'Interest_Rate': New_Interest_Rate,
                         'Unemployment_Rate': New_Unemployment_Rate,
                         'Stock_Index_Price': prices}
predicted_df = pd.DataFrame(predicted_stock_market)
predicted_df.plot( x="Month",y="Stock_Index_Price",kind='scatter')
plt.show()

这篇关于如何在数据帧中添加预测值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆