Prediction using Fixed Effects
Question
I have a simple data set to which I applied a simple linear regression model. Now I would like to use fixed effects to improve the model's predictions. I know that I could create dummy variables instead, but my real dataset covers more years and has more variables, so I would like to avoid making dummies.
My data and code are similar to this:
data <- read.table(header = TRUE,
stringsAsFactors = FALSE,
text="CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")
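library(lfe)    # felm(), getfe()
library(caret)  # postResample()

# Year fixed effects estimated by felm
fe <- getfe(felm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year, data = data))
fe

# Plain linear regression without fixed effects
lm.1 <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data = data)
prediction <- predict(lm.1, data)
prediction

# RMSE, R-squared and MAE of the predictions
check_model <- postResample(pred = prediction, obs = data$ResponseVariable)
check_model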
For my real dataset I will predict on a test set, but for simplicity I just use the training set here as well.
I would like to make a prediction with the help of the fixed effects that I found, but my attempt below does not seem to apply the fixed effects correctly. Does anyone know how to use fe$effect?
prediction_fe<- predict(lm.1, data) + fe$effect
Answer
Here are a few extra comments on your setup and the models that you are running.
The main model you fit is
lm.1<-lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data=data)
which yields
> lm.1
Call:
lm(formula = ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2,
data = data)
Coefficients:
(Intercept) ExplanatoryVariable1 ExplanatoryVariable2
0.8901 0.7857 0.1923
When you run the predict function on this model you get
> predict(lm.1)
1 2 3 4 5 6 7
2.060385 3.439410 6.164590 3.631718 1.659333 4.192205 7.752359
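These fitted values can be rebuilt by hand from the coefficients. The following is a minimal base R sketch, assuming the toy data and the lm.1 model from the question; the names X and by_hand are introduced here purely for illustration.

```r
data <- read.table(header = TRUE, stringsAsFactors = FALSE,
                   text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")

lm.1 <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2, data = data)

# Design matrix with columns (Intercept), ExplanatoryVariable1, ExplanatoryVariable2
X <- model.matrix(lm.1)

# Intercept plus each slope times its variable, for every row at once
by_hand <- drop(X %*% coef(lm.1))

# Compare with what predict() returns
all.equal(by_hand, predict(lm.1))
```

The first element of by_hand is the intercept plus one times the first slope plus two times the second slope, which is the arithmetic for observation 1.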
That corresponds to computing (for observation 1) 0.8901 + 1*0.7857 + 2*0.1923, so the prediction uses the estimated coefficients (the intercept and the two slopes). The felm model is slightly more complicated, as it "factors out" the year component. The model fit is shown here:
> felm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 | Year)
ExplanatoryVariable1 ExplanatoryVariable2
0.9726 1.3262
This corresponds to "correcting for", or conditioning on, Year, so you get the same coefficients for the explanatory variables if you fit
> lm(data = data, ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 + factor(Year))
Call:
lm(formula = ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 +
factor(Year), data = data)
Coefficients:
(Intercept) ExplanatoryVariable1 ExplanatoryVariable2 factor(Year)2001
-2.4848 0.9726 1.3262 0.9105
factor(Year)2002
-7.0286
and then throw away everything except the coefficients for the explanatory variables. Thus you cannot obtain predictions from the coefficients that felm reports alone, since they lack the intercept and the year effects; they only show the effect sizes.
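One way to get predictions that include the year effects without building dummy columns by hand is to let lm expand factor(Year) itself and then call predict. The sketch below is a minimal illustration on the toy data from the question, not a definitive recipe; the names lm.fe, pred_fe, year_coef, year_effect and pred_by_hand are hypothetical and do not appear in the original post. It also rebuilds the same predictions as slopes plus a per-year intercept, which is the part missing from the felm coefficients.

```r
data <- read.table(header = TRUE, stringsAsFactors = FALSE,
                   text = "CompanyNumber ResponseVariable Year ExplanatoryVariable1 ExplanatoryVariable2
1 2.5 2000 1 2
1 4 2001 3 1
1 3 2002 5 7
2 1 2000 3 2
2 2.4 2001 0 4
2 6 2002 2 9
3 10 2000 8 3")

# Year enters as a factor, so lm() creates the dummy columns internally
lm.fe <- lm(ResponseVariable ~ ExplanatoryVariable1 + ExplanatoryVariable2 +
              factor(Year), data = data)
pred_fe <- predict(lm.fe, newdata = data)

# Per-year intercept: (Intercept) plus the year coefficient,
# where the baseline year 2000 has coefficient 0
b <- coef(lm.fe)
year_coef <- c("2000" = 0,
               "2001" = unname(b["factor(Year)2001"]),
               "2002" = unname(b["factor(Year)2002"]))
year_effect <- b["(Intercept)"] + year_coef[as.character(data$Year)]

# Look up each row's year effect by its Year value, then add the slope terms
pred_by_hand <- unname(year_effect +
  b["ExplanatoryVariable1"] * data$ExplanatoryVariable1 +
  b["ExplanatoryVariable2"] * data$ExplanatoryVariable2)

all.equal(unname(pred_fe), pred_by_hand)
```

The year effect is matched to each row by its Year value. Adding a length-3 vector such as fe$effect directly to seven predictions from lm.1 recycles the effects by position instead, and it combines them with slopes from a model that was fitted without the year component. The same lookup could in principle be applied to the getfe output by matching data$Year against its idx column, but that is an assumption about the layout of that data frame and should be checked against the lfe documentation. Note also that predict stops with an error if the test set contains a year that was not present in the training data.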
Hope this helps.