Statsmodels.formula.api OLS does not show statistical values of intercept
Problem description
I am running the following source code:
import numpy as np
import statsmodels.formula.api as sm

# Add one column of ones for the intercept term
X = np.append(arr=np.ones((50, 1)).astype(int), values=X, axis=1)
regressor_OLS = sm.OLS(endog=y, exog=X).fit()
print(regressor_OLS.summary())
where X is a 50x5 (before adding the intercept term) numpy array which looks like this:
[[0 1 165349.20 136897.80 471784.10]
[0 0 162597.70 151377.59 443898.53]...]
and y is a 50x1 numpy array with float values for the dependent variable.
The first two columns encode a dummy variable with three different values. The remaining columns are three different independent variables.
Although it is said that statsmodels.formula.api.OLS automatically adds an intercept term (see @stellacia's answer here: OLS using statsmodel.formula.api versus statsmodel.api), its summary does not show the statistical values of the intercept term, as is evident below in my case:
                            OLS Regression Results
==============================================================================
Dep. Variable:                 Profit   R-squared:                       0.988
Model:                            OLS   Adj. R-squared:                  0.986
Method:                 Least Squares   F-statistic:                     727.1
Date:                Sun, 01 Jul 2018   Prob (F-statistic):           7.87e-42
Time:                        21:40:23   Log-Likelihood:                -545.15
No. Observations:                  50   AIC:                             1100.
Df Residuals:                      45   BIC:                             1110.
Df Model:                           5
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1          3464.4536   4905.406      0.706      0.484   -6415.541    1.33e+04
x2          5067.8937   4668.238      1.086      0.283   -4334.419    1.45e+04
x3             0.7182      0.066     10.916      0.000       0.586       0.851
x4             0.3113      0.035      8.885      0.000       0.241       0.382
x5             0.0786      0.023      3.429      0.001       0.032       0.125
==============================================================================
Omnibus:                        1.355   Durbin-Watson:                   1.288
Prob(Omnibus):                  0.508   Jarque-Bera (JB):                1.241
Skew:                          -0.237   Prob(JB):                        0.538
Kurtosis:                       2.391   Cond. No.                     8.28e+05
==============================================================================
For this reason, I added the following line to my source code:
X = np.append(arr= np.ones((50, 1)).astype(int), values=X, axis=1)
(as you can see at the beginning of my post), and the statistical values of the intercept/constant are then shown, as below:
                            OLS Regression Results
==============================================================================
Dep. Variable:                 Profit   R-squared:                       0.951
Model:                            OLS   Adj. R-squared:                  0.945
Method:                 Least Squares   F-statistic:                     169.9
Date:                Sun, 01 Jul 2018   Prob (F-statistic):           1.34e-27
Time:                        20:25:21   Log-Likelihood:                -525.38
No. Observations:                  50   AIC:                             1063.
Df Residuals:                      44   BIC:                             1074.
Df Model:                           5
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const       5.013e+04   6884.820      7.281      0.000    3.62e+04     6.4e+04
x1           198.7888   3371.007      0.059      0.953   -6595.030    6992.607
x2           -41.8870   3256.039     -0.013      0.990   -6604.003    6520.229
x3             0.8060      0.046     17.369      0.000       0.712       0.900
x4            -0.0270      0.052     -0.517      0.608      -0.132       0.078
x5             0.0270      0.017      1.574      0.123      -0.008       0.062
==============================================================================
Omnibus:                       14.782   Durbin-Watson:                   1.283
Prob(Omnibus):                  0.001   Jarque-Bera (JB):               21.266
Skew:                          -0.948   Prob(JB):                     2.41e-05
Kurtosis:                       5.572   Cond. No.                     1.45e+06
==============================================================================
Why are the statistical values of the intercept not shown when I do not add an intercept term myself, even though it is said that statsmodels.formula.api.OLS adds one automatically?
Recommended answer
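"No constant is added by the model unless you are using formulas." Calling the uppercase OLS class with endog and exog arrays does not use a formula, so no intercept is added, even when OLS is reached through statsmodels.formula.api. Therefore, try something like the example below. Variable names should be defined according to your data set.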
Use
import statsmodels.formula.api as smf
regressor_OLS = smf.ols(formula='Y_variable ~ X_variable', data=df).fit()
instead of
regressor_OLS = sm.OLS(endog=y, exog=X).fit()