面板数据回归:稳健的标准误差 [英] Panel data regression: Robust standard errors

查看:981
本文介绍了面板数据回归:稳健的标准误差的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的问题是这样的:我得到NA,在计算稳健的标准误差时应该得到一些值.

my problem is this: I get NA where I should get some values in the computation of robust standard errors.

我正在尝试对具有稳健性的标准错误进行固定效果面板回归.为此,我遵循p的 Arai(2011). 3遵循 Stock/Watson(2006)(后来发表在计量经济学,对于那些谁有访问权).我想通过(M/(M-1)*(N-1)/(N-K)来纠正自由度,以免出现向下偏差,因为我的簇数是有限的,并且我的数据不平衡.

I am trying to do a fixed effect panel regression with cluster-robust standard errors. For this, I follow Arai (2011) who on p. 3 follows Stock/ Watson (2006) (later published in Econometrica, for those who have access). I would like to correct the degrees of freedom by (M/(M-1)*(N-1)/(N-K) against downward bias as my number of clusters is finite and I have unbalanced data.

在[ 1 2 ]和相关问题[

Similar problems have been posted before [1, 2] on StackOverflow and related problems [3] on CrossValidated.

Arai(以及第一个链接中的答案)将以下代码用于函数(我在下面提供我的数据并提供一些进一步的评论):

Arai (and the answer in the 1st link) uses the following code for functions (I provide my data below with some further comment):

gcenter <- function(df1,group) {
    variables <- paste(
        rep("C", ncol(df1)), colnames(df1), sep=".")
    copydf <- df1
    for (i in 1:ncol(df1)) {
        copydf[,i] <- df1[,i] - ave(df1[,i], group,FUN=mean)}
    colnames(copydf) <- variables
    return(cbind(df1,copydf))}

# 1-way adjusting for clusters
clx <- function(fm, dfcw, cluster){
    # R-codes (www.r-project.org) for computing
    # clustered-standard errors. Mahmood Arai, Jan 26, 2008.
    # The arguments of the function are:
    # fitted model, cluster1 and cluster2
    # You need to install libraries `sandwich' and `lmtest'
    # reweighting the var-cov matrix for the within model
    library(sandwich);library(lmtest)
    M <- length(unique(cluster))   
    N <- length(cluster)           
    K <- fm$rank                        
    dfc <- (M/(M-1))*((N-1)/(N-K))  
    uj  <- apply(estfun(fm),2, function(x) tapply(x, cluster, sum));
    vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)*dfcw
    coeftest(fm, vcovCL) }

,其中gcenter计算与平均值的偏差(固定效应).然后,我继续并以DS_CODE作为集群变量(我已将数据命名为数据")进行回归.

,where the gcenter computes deviations from the mean (fixed effect). I then continue and do the regression with DS_CODEbeing my cluster variable (I have named my data 'data').

centerdata <- gcenter(data, data$DS_CODE)
datalm <- lm(C.L1.retE1M ~ C.MCAP_SEC + C.Impact_change + C.Mom + C.BM + C.PD + C.CashGen + C.NITA + C.PE + C.PEdummy + factor(DS_CODE), data=centerdata)
M <- length(unique(data$DS_CODE))
dfcw <- datalm$df / (datalm$df - (M-1))

并要计算

clx(datalm, dfcw, data$DS_CODE)

但是,当我想计算方差的 uj (请参见上面的公式clx)时,我只在开始时得到一些回归值,然后是很多零.如果将此输入 uj 用于方差,则仅NAs结果.

However, when I want to compute uj (see formula clx above) for the variance, I get only at the beginning some values for my regressors, then lots of zeros. If this input uj is used for the variance, only NAs result.

我的数据

由于我的数据可能具有特殊的结构,并且我无法解决问题,因此我将整个内容发布为

Since my data may be of special structure and I can't figure out the problem, I post the entire thing as a link from Hotmail. The reason is that with other data (taken from Arai (2011)) my problem does not occur. Sorry in advance for the mess but I'd be very grateful if you could have a look at it nevertheless. The file is a 5mb .txt file containing purely data.

推荐答案

玩了一段时间后,它对我有用并给了我

After some time playing around, it works for me and gives me:

                         Estimate  Std. Error t value  Pr(>|t|)    
(Intercept)            4.5099e-16  5.2381e-16  0.8610  0.389254    
C.MCAP_SEC            -5.9769e-07  1.2677e-07 -4.7149 2.425e-06 ***
C.Impact_change       -5.3908e-04  7.5601e-05 -7.1306 1.014e-12 ***
C.Mom                  3.7560e-04  3.3378e-03  0.1125  0.910406    
C.BM                  -1.6438e-04  1.7368e-05 -9.4645 < 2.2e-16 ***
C.PD                   6.2153e-02  3.8766e-02  1.6033  0.108885    
C.CashGen             -2.7876e-04  1.4031e-02 -0.0199  0.984149    
C.NITA                -8.1792e-02  3.2153e-02 -2.5438  0.010969 *  
C.PE                  -6.6170e-06  4.0138e-06 -1.6485  0.099248 .  
C.PEdummy              1.3143e-02  4.8864e-03  2.6897  0.007154 ** 
factor(DS_CODE)130324 -5.2497e-16  5.2683e-16 -0.9965  0.319028    
factor(DS_CODE)130409 -4.0276e-16  5.2384e-16 -0.7689  0.441986    
factor(DS_CODE)130775 -4.4113e-16  5.2424e-16 -0.8415  0.400089  
...

这给我们留下了一个问题,为什么它不适合您.我想这与您的数据格式有关.一切都是数字的吗?我转换了列类,对我来说看起来像这样:

This leaves us with the question why it doesn't for you. I guess it has something to do with the format of your data. Is everything numeric? I converted the column classes and it looks like that for me:

str(dat)
'data.frame':   48251 obs. of  12 variables:
 $ DS_CODE      : chr  "902172" "902172" "902172" "902172" ...
 $ DNEW         : num  2e+05 2e+05 2e+05 2e+05 2e+05 ...
 $ MCAP_SEC     : num  78122 71421 81907 80010 82462 ...
 $ NITA         : num  0.135 0.135 0.135 0.135 0.135 ...
 $ CashGen      : num  0.198 0.198 0.198 0.198 0.198 ...
 $ BM           : num  0.1074 0.1108 0.097 0.0968 0.0899 ...
 $ PE           : num  57 55.3 63.1 63.2 68 ...
 $ PEdummy      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ L1.retE1M    : num  -0.72492 0.13177 0.00122 0.07214 -0.07332 ...
 $ Mom          : num  0 0 0 0 0 ...
 $ PD           : num  5.41e-54 1.51e-66 3.16e-80 2.87e-79 4.39e-89 ...
 $ Impact_change: num  0 -10.59 -10.43 0.7 -6.97 ...

str(data)为您带来什么?

这篇关于面板数据回归:稳健的标准误差的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆