R的三明治包装对线性模型中的稳健标准误差产生奇怪的结果 [英] R's sandwich package producing strange results for robust standard errors in linear model

查看:173
本文介绍了R的三明治包装对线性模型中的稳健标准误差产生奇怪的结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在R中发现异方差稳健的标准错误,而我发现的大多数解决方案都是使用coeftestsandwich包.但是,当我使用这些软件包时,它们似乎产生了奇怪的结果(它们的 way 太重要了).我的教授和我都同意结果看起来不正确.有人可以告诉我我的错误在哪里吗?我使用的是正确的包裹吗?程序包中是否有错误?我应该怎么用呢?还是可以在STATA中重现相同的结果?

I am trying to find heteroskedasticity-robust standard errors in R, and most solutions I find are to use the coeftest and sandwich packages. However, when I use those packages, they seem to produce queer results (they're way too significant). Both my professor and I agree that the results don't look right. Could someone please tell me where my mistake is? Am I using the right package? Does the package have a bug in it? What should I use instead? Or can you reproduce the same results in STATA?

(数据是2010年至2014年3月的样本的CPS数据.我创建了一个MySQL数据库来保存数据,并使用survey包来帮助对其进行分析.)

(The data is CPS data from 2010 to 2014, March samples. I created a MySQL database to hold the data and am using the survey package to help analyze it.)

先谢谢您. (我对代码进行了一些删节,以使其更易于阅读;如果您需要查看更多内容,请告诉我.)

Thank you in advance. (I have abridged the code somewhat to make it easier to read; let me know if you need to see more.)

>male.nat.reg <- svyglm(log(HOURWAGE) ~ AGE + I(AGE^2) + ... + OVERWORK, subset(fwyrnat2010.design, FEMALE == 0))

>summary(male.nat.reg)

Call:
NextMethod(formula = "svyglm", design)

Survey design:
subset(fwyrnat2010.design, FEMALE == 0)

Coefficients:
           Estimate Std. Error t value Pr(>|t|)    
(Intercept)   1.599e+00  6.069e-02  26.350  < 2e-16 ***
AGE           4.030e-02  3.358e-03  12.000  < 2e-16 ***
I(AGE^2)     -4.131e-04  4.489e-05  -9.204 9.97e-16 ***
NOHSDEG      -1.730e-01  1.281e-02 -13.510  < 2e-16 ***
ASSOC         1.138e-01  1.256e-02   9.060 2.22e-15 ***
SOMECOLL      5.003e-02  9.445e-03   5.298 5.11e-07 ***
BACHELOR      2.148e-01  1.437e-02  14.948  < 2e-16 ***
GRADUATE      3.353e-01  3.405e-02   9.848  < 2e-16 ***
INMETRO       3.879e-02  9.225e-03   4.205 4.93e-05 ***
NCHILDOLD     1.374e-02  4.197e-03   3.273 0.001376 ** 
NCHILDYOUNG   2.334e-02  6.186e-03   3.774 0.000247 ***
NOTWHITE     -5.026e-02  8.583e-03  -5.856 3.92e-08 ***
MARRIED      -8.226e-03  1.531e-02  -0.537 0.592018    
NEVERMARRIED -4.644e-02  1.584e-02  -2.932 0.004009 ** 
NOTCITIZEN   -6.759e-02  1.574e-02  -4.295 3.47e-05 ***
STUDENT      -1.231e-01  1.975e-02  -6.231 6.52e-09 ***
VET           3.336e-02  1.751e-02   1.905 0.059091 .  
INUNION       2.366e-01  1.271e-02  18.614  < 2e-16 ***
PROFOCC       2.559e-01  1.661e-02  15.413  < 2e-16 ***
TSAOCC        9.997e-02  1.266e-02   7.896 1.27e-12 ***
FFFOCC        2.076e-02  2.610e-02   0.795 0.427859    
PRODOCC       2.164e-01  1.281e-02  16.890  < 2e-16 ***
LABOROCC      6.074e-02  1.253e-02   4.850 3.60e-06 ***
AFFIND        6.834e-02  2.941e-02   2.324 0.021755 *  
MININGIND     3.034e-01  3.082e-02   9.846  < 2e-16 ***
CONSTIND      1.451e-01  1.524e-02   9.524  < 2e-16 ***
MANUFIND      1.109e-01  1.393e-02   7.963 8.80e-13 ***
UTILIND       1.422e-01  1.516e-02   9.379 3.78e-16 ***
WHOLESALEIND  2.884e-02  1.766e-02   1.633 0.104910    
FININD        6.215e-02  2.084e-02   2.983 0.003436 ** 
BUSREPIND     6.588e-02  1.755e-02   3.753 0.000266 ***
SERVICEIND    5.412e-02  2.403e-02   2.252 0.026058 *  
ENTERTAININD -1.192e-01  3.060e-02  -3.896 0.000159 ***
PROFIND       1.536e-01  1.854e-02   8.285 1.55e-13 ***
OVERWORK      6.738e-02  1.007e-02   6.693 6.59e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 0.1367476)

Number of Fisher Scoring iterations: 2

>coeftest(male.nat.reg, vcov = vcovHC(male.nat.reg, type = 'HC0'))

z test of coefficients:

                Estimate  Std. Error   z value  Pr(>|z|)    
(Intercept)   1.5992e+00  9.7176e-08  16456481 < 2.2e-16 ***
AGE           4.0296e-02  5.4766e-09   7357823 < 2.2e-16 ***
I(AGE^2)     -4.1314e-04  7.3222e-11  -5642330 < 2.2e-16 ***
NOHSDEG      -1.7305e-01  1.4431e-08 -11991482 < 2.2e-16 ***
ASSOC         1.1378e-01  1.4248e-08   7985751 < 2.2e-16 ***
SOMECOLL      5.0035e-02  9.9689e-09   5019088 < 2.2e-16 ***
BACHELOR      2.1476e-01  2.0588e-08  10430993 < 2.2e-16 ***
GRADUATE      3.3533e-01  8.3327e-08   4024301 < 2.2e-16 ***
INMETRO       3.8790e-02  8.9666e-09   4326013 < 2.2e-16 ***
NCHILDOLD     1.3738e-02  5.2244e-09   2629554 < 2.2e-16 ***
NCHILDYOUNG   2.3344e-02  5.5405e-09   4213300 < 2.2e-16 ***
NOTWHITE     -5.0261e-02  1.0150e-08  -4951908 < 2.2e-16 ***
MARRIED      -8.2263e-03  1.8867e-08   -436026 < 2.2e-16 ***
NEVERMARRIED -4.6440e-02  1.7847e-08  -2602096 < 2.2e-16 ***
NOTCITIZEN   -6.7594e-02  2.4446e-08  -2765080 < 2.2e-16 ***
STUDENT      -1.2306e-01  3.2514e-08  -3785014 < 2.2e-16 ***
VET           3.3356e-02  3.0996e-08   1076125 < 2.2e-16 ***
INUNION       2.3659e-01  1.7786e-08  13301699 < 2.2e-16 ***
PROFOCC       2.5594e-01  2.2177e-08  11540563 < 2.2e-16 ***
TSAOCC        9.9971e-02  1.6707e-08   5983922 < 2.2e-16 ***
FFFOCC        2.0762e-02  2.3625e-08    878801 < 2.2e-16 ***
PRODOCC       2.1638e-01  1.3602e-08  15907683 < 2.2e-16 ***
LABOROCC      6.0741e-02  1.3445e-08   4517854 < 2.2e-16 ***
AFFIND        6.8342e-02  3.2895e-08   2077563 < 2.2e-16 ***
MININGIND     3.0343e-01  3.2948e-08   9209326 < 2.2e-16 ***
CONSTIND      1.4512e-01  2.1871e-08   6635457 < 2.2e-16 ***
MANUFIND      1.1094e-01  1.9636e-08   5649569 < 2.2e-16 ***
UTILIND       1.4216e-01  2.0930e-08   6792029 < 2.2e-16 ***
WHOLESALEIND  2.8842e-02  1.8662e-08   1545525 < 2.2e-16 ***
FININD        6.2147e-02  2.8214e-08   2202691 < 2.2e-16 ***
BUSREPIND     6.5883e-02  2.7866e-08   2364269 < 2.2e-16 ***
SERVICEIND    5.4118e-02  2.4758e-08   2185907 < 2.2e-16 ***
ENTERTAININD -1.1922e-01  2.9474e-08  -4044852 < 2.2e-16 ***
PROFIND       1.5364e-01  3.0132e-08   5098879 < 2.2e-16 ***
OVERWORK      6.7376e-02  1.0981e-08   6135525 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

推荐答案

sandwich包是面向对象的,并且基本上依赖于可用的两种方法:estfun()bread(),有关更多详细信息,请参见包vignettes. .对于svyglm类的对象,这些方法不可用,但由于svyglm对象是从glm继承的,因此找到并使用了glm方法.我怀疑这可能会导致在权重因子左右的情况下在调查环境中得出不正确的结果.我对survey软件包不太熟悉,无法提供解决方法. survey维护者也许可以说更多...希望对您有所帮助.

The sandwich package is object-oriented and essentially relies on two methods being available: estfun() and bread(), see the package vignettes for more details. For objects of class svyglm these methods are not available but as svyglm objects inherit from glm the glm methods are found and used. I suspect that this leads to incorrect results in the survey context though, possibly by a weighting factor or so. I'm not familiar enough with the survey package to provide a workaround. The survey maintainer might be able to say more... Hope that helps.

这篇关于R的三明治包装对线性模型中的稳健标准误差产生奇怪的结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆