Why is a factor not included in a first-differences model?
Question
Consider the following data:
library(plm)
data("EmplUK", package="plm")
df1<-EmplUK
df1 <- cbind(df1,"Trend" = as.numeric(as.factor(unlist(df1[, 2]))))
> head(df1)
firm year sector emp wage capital output Trend
1 1 1977 7 5.041 13.1516 0.5894 95.7072 2
2 1 1978 7 5.600 12.3018 0.6318 97.3569 3
3 1 1979 7 5.015 12.8395 0.6771 99.6083 4
4 1 1980 7 4.715 13.8039 0.6171 100.5501 5
5 1 1981 7 4.093 14.2897 0.5076 99.5581 6
6 1 1982 7 3.166 14.8681 0.4229 98.6151 7
I want to run a first-difference panel regression, so:
> plm(capital~wage+output+Trend,data=df1, model = 'fd')
Model Formula: capital ~ wage + output + Trend
Coefficients:
(Intercept) wage output
0.0111227 -0.0014415 0.0110732
My question is: why is "Trend" not included in my plm model, and is there any way to include it?
Recommended answer
To compute first differences, plm internally uses the c("firm", "year") columns as the individual and time index (by default, the first two columns of the data). This can be shown by dropping them:
plm(capital ~ wage + output + Trend,
data=df1[-which(names(df1) %in% c("firm", "year"))],
model='fd') ## throws a warning
# Model Formula: capital ~ wage + output + Trend
#
# Coefficients:
# (Intercept) wage output Trend
# 0.165677 0.076483 0.038369 0.261935
As we can see, "Trend" now appears (the result is of course wrong, since the model is no longer differenced over firm and year).
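For contrast, here is a minimal sketch that names the panel index explicitly through plm's `index` argument instead of relying on column order. It rebuilds `df1` as in the question, and it should reproduce the question's fit, in which "Trend" is dropped.

```r
library(plm)
data("EmplUK", package = "plm")

# Rebuild the data from the question: Trend is a 1, 2, 3, ... recoding of year.
df1 <- EmplUK
df1$Trend <- as.numeric(as.factor(df1$year))

# Same first-difference model, with the individual and time index stated explicitly.
fit <- plm(capital ~ wage + output + Trend, data = df1,
           index = c("firm", "year"), model = "fd")

# Names of the coefficients that were actually estimated.
names(coef(fit))
```

Stating the index explicitly makes it clear which columns define the differencing, whatever the column order of the data.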
You can see the reason by looking at the correlation matrix of your data.
round(cor(df1))
# firm year sector emp wage capital output Trend
# firm 1 0 0 0 0 0 0 0
# year 0 1 0 0 0 0 -1 1
# sector 0 0 1 0 0 0 0 0
# emp 0 0 0 1 0 1 0 0
# wage 0 0 0 0 1 0 0 0
# capital 0 0 0 1 0 1 0 0
# output 0 -1 0 0 0 0 1 -1
# Trend 0 1 0 0 0 0 -1 1
"Trend" and "year" are perfectly correlated, i.e. you are running into multicollinearity.
with(df1, cor(Trend, year))
# [1] 1
With lm, such coefficients would be displayed as NA, as in:
# Fit with year and firm dummies, then show only the non-dummy coefficients
r <- coef(lm(capital ~ wage + output + factor(year) + factor(firm) + Trend,
             data=df1))
r[-grep("year|firm", names(r))]
# (Intercept) wage output Trend
# -2.62878756 0.03206621 0.02363581 NA
whereas plm drops them.
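One way to see why this collinearity matters after differencing: a variable that rises by exactly 1 each period has a first difference of 1 in every row, which duplicates the intercept column of the differenced model. The sketch below illustrates this in base R on a made-up toy panel. The names `toy`, `fd`, `x`, `y` and `trend` are hypothetical, and the manual differencing only mimics the idea; it is not plm's internal code.

```r
# Toy balanced panel (made-up data): 3 firms observed over 5 consecutive years.
set.seed(1)
toy <- expand.grid(year = 1977:1981, firm = 1:3)  # rows come out sorted by firm, then year
toy$trend <- toy$year - min(toy$year) + 1
toy$x <- rnorm(nrow(toy))
toy$y <- 0.5 * toy$x + 0.1 * toy$trend + rnorm(nrow(toy))

# First-difference a variable within each firm.
fd <- function(v, g) unlist(tapply(v, g, diff), use.names = FALSE)

d <- data.frame(dy     = fd(toy$y, toy$firm),
                dx     = fd(toy$x, toy$firm),
                dtrend = fd(toy$trend, toy$firm))

unique(d$dtrend)  # the differenced trend takes a single value

# Regress the differenced data with an intercept, as an FD model would.
fit <- lm(dy ~ dx + dtrend, data = d)
coef(fit)         # dtrend is reported as NA because it duplicates the intercept
```

Here lm reports the redundant term as NA, while plm, as noted above, simply omits it from the output.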