R plm lag - what is the equivalent to L1.x in Stata?


Question

When fitting a fixed-effects model with the plm package in R, what is the correct syntax for adding a lagged variable to the model? I am looking for something similar to the 'L1.variable' operator in Stata.

Here is my attempt at adding a lagged variable (this is a test model and it might not make sense):

library(plm)
library(foreign)
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta")
pnlswork <- plm.data(nlswork, c('idcode', 'year'))
ffe <- plm(ln_wage ~ ttl_exp+lag(wks_work,1)
           , model = 'within'
           , data = nlswork)
summary(ffe)

R output:

Oneway (individual) effect Within Model

Call:
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = nlswork, 
    model = "within")

Unbalanced Panel: n=3911, T=1-14, N=19619

Residuals :
    Min.  1st Qu.   Median  3rd Qu.     Max. 
-1.77000 -0.10100  0.00293  0.11000  2.90000 

Coefficients :
                Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp       0.02341057 0.00073832 31.7078 < 2.2e-16 ***
lag(wks_work) 0.00081576 0.00010628  7.6755 1.744e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    1296.9
Residual Sum of Squares: 1126.9
R-Squared:      0.13105
Adj. R-Squared: -0.085379
F-statistic: 1184.39 on 2 and 15706 DF, p-value: < 2.22e-16

However, I get different results from what Stata produces.

In my actual model, I would like to instrument an endogenous variable with its lagged value.

Thanks!

For reference, here is the Stata code:

webuse nlswork.dta
xtset idcode year
xtreg ln_wage ttl_exp L1.wks_work, fe

Stata output:

Fixed-effects (within) regression               Number of obs     =     10,680
Group variable: idcode                          Number of groups  =      3,671

R-sq:                                           Obs per group:
     within  = 0.1492                                         min =          1
     between = 0.2063                                         avg =        2.9
     overall = 0.1483                                         max =          8

                                                F(2,7007)         =     614.60
corr(u_i, Xb)  = 0.1329                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     ttl_exp |   .0192578   .0012233    15.74   0.000     .0168597    .0216558
             |
    wks_work |
         L1. |   .0015891   .0001957     8.12   0.000     .0012054    .0019728
             |
       _cons |   1.502879   .0075431   199.24   0.000     1.488092    1.517666
-------------+----------------------------------------------------------------
     sigma_u |  .40678942
     sigma_e |  .28124886
         rho |  .67658275   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(3670, 7007) = 4.71                  Prob > F = 0.0000

Recommended answer

lag() as it is in plm lags the observations row-wise without "looking" at the time variable, i.e. it shifts the variable by one row within each individual. If there are gaps in the time dimension, the previous row is not necessarily the previous period, so you probably want to take the value of the time variable into account. There is the (as of now) unexported function plm:::lagt.pseries, which takes the time variable into account and hence handles gaps in the data as you might expect.
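To make the difference concrete, here is a minimal base-R sketch on a made-up toy panel. The data frame `toy` and the helpers `lag_row()` and `lag_time()` are hypothetical names introduced only for illustration; they are not part of plm and do not reproduce its internals. They just mimic the two behaviours described above.

```r
# Toy panel (invented data): individual 1 is observed in 2000, 2001 and 2003,
# so the year 2002 is a gap.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  year = c(2000, 2001, 2003, 2000, 2001),
  x    = c(10, 11, 13, 20, 21)
)

# Row-wise lag: take the previous row within each individual,
# regardless of how many periods lie between the two rows.
lag_row <- function(x, id) {
  ave(x, id, FUN = function(v) c(NA, v[-length(v)]))
}

# Time-aware lag: look up the value observed exactly k periods earlier
# for the same individual; NA if that period was not observed.
lag_time <- function(x, id, time, k = 1) {
  key <- paste(id, time)
  x[match(paste(id, time - k), key)]
}

toy$x_lag_row  <- lag_row(toy$x, toy$id)
toy$x_lag_time <- lag_time(toy$x, toy$id, toy$year)
print(toy)
```

For the 2003 row of individual 1, the row-wise lag carries over the 2001 value, while the time-aware lag is NA because 2002 was not observed. Rows like that are why the two approaches can end up with different estimation samples.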

Use it as follows:

library(plm)
library(foreign)
nlswork <- read.dta("http://www.stata-press.com/data/r11/nlswork.dta")
pnlswork <- pdata.frame(nlswork, c('idcode', 'year'))
ffe <- plm(ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work,1)
           , model = 'within'
           , data = pnlswork)
summary(ffe)

Oneway (individual) effect Within Model

Call:
plm(formula = ln_wage ~ ttl_exp + plm:::lagt.pseries(wks_work, 
    1), data = nlswork, model = "within")

Unbalanced Panel: n=3671, T=1-8, N=10680

Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients :
                                  Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp                         0.01925775 0.00122330 15.7425 < 2.2e-16 ***
plm:::lagt.pseries(wks_work, 1) 0.00158907 0.00019573  8.1186 5.525e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    651.49
Residual Sum of Squares: 554.26
R-Squared:      0.14924
Adj. R-Squared: -0.29659
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16

Btw1: Better use pdata.frame() instead of plm.data(). Btw2: You can check for gaps in your data with plm's is.pconsecutive():

is.pconsecutive(pnlswork)
all(is.pconsecutive(pnlswork))
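As a rough illustration of what such a gap check looks for, the following base-R sketch flags individuals whose time periods are not consecutive. The helper `is_consecutive()` and the toy data are assumptions made for this example; this is not how plm implements is.pconsecutive(), only the same idea on a small scale.

```r
# Toy index (invented data): individual 1 skips the year 2002.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  year = c(2000, 2001, 2003, 2000, 2001)
)

# An individual counts as consecutive when its sorted time periods
# always increase by exactly 1.
is_consecutive <- function(id, time) {
  tapply(time, id, function(t) all(diff(sort(t)) == 1))
}

gaps <- is_consecutive(toy$id, toy$year)
print(gaps)       # one logical value per individual
print(all(gaps))  # TRUE only if no individual has a gap
```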

You can also make the data consecutive first and then use lag(), like this:

pnlswork2 <- make.pconsecutive(pnlswork)
pnlswork2$wks_work_lag <- lag(pnlswork2$wks_work)
ffe2 <- plm(ln_wage ~ ttl_exp + wks_work_lag
           , model = 'within'
           , data = pnlswork2)
summary(ffe2)

Oneway (individual) effect Within Model

Call:
plm(formula = ln_wage ~ ttl_exp + wks_work_lag, data = pnlswork2, 
    model = "within")

Unbalanced Panel: n=3671, T=1-8, N=10680

Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients :
               Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp      0.01925775 0.00122330 15.7425 < 2.2e-16 ***
wks_work_lag 0.00158907 0.00019573  8.1186 5.525e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    651.49
Residual Sum of Squares: 554.26
R-Squared:      0.14924
Adj. R-Squared: -0.29659
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16
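The reason this works can be sketched in base R on a toy panel: once the missing periods are filled in as NA rows, the previous row is always the previous period, so a plain row-wise lag is enough. The data and the names `full_grid` and `filled` are hypothetical and only illustrate the idea; make.pconsecutive() itself may work differently internally.

```r
# Toy panel (invented data): individual 1 has no observation for 2002.
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2),
  year = c(2000, 2001, 2003, 2000, 2001),
  x    = c(10, 11, 13, 20, 21)
)

# Build every (id, year) pair between each individual's first and last year.
full_grid <- do.call(rbind, lapply(split(toy, toy$id), function(d) {
  data.frame(id = d$id[1], year = seq(min(d$year), max(d$year), by = 1))
}))

# Merge the observed data onto the grid; missing periods become NA rows.
filled <- merge(full_grid, toy, by = c("id", "year"), all.x = TRUE)
filled <- filled[order(filled$id, filled$year), ]

# With no gaps left, the previous row is the previous period,
# so a row-wise lag now behaves like a time-aware lag.
filled$x_lag <- ave(filled$x, filled$id,
                    FUN = function(v) c(NA, v[-length(v)]))
print(filled)
```

The inserted 2002 row has x = NA, so the 2003 row gets an NA lag instead of silently picking up the 2001 value.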

Or simply:

ffe3 <- plm(ln_wage ~ ttl_exp + lag(wks_work)
            , model = 'within'
            , data = pnlswork2) # note: it is the consecutive panel data set here
summary(ffe3)

Oneway (individual) effect Within Model

Call:
plm(formula = ln_wage ~ ttl_exp + lag(wks_work), data = pnlswork2, 
    model = "within")

Unbalanced Panel: n=3671, T=1-8, N=10680

Residuals :
   Min. 1st Qu.  Median 3rd Qu.    Max. 
-1.5900 -0.0859  0.0000  0.0957  2.5600 

Coefficients :
                Estimate Std. Error t-value  Pr(>|t|)    
ttl_exp       0.01925775 0.00122330 15.7425 < 2.2e-16 ***
lag(wks_work) 0.00158907 0.00019573  8.1186 5.525e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares:    651.49
Residual Sum of Squares: 554.26
R-Squared:      0.14924
Adj. R-Squared: -0.29659
F-statistic: 614.604 on 2 and 7007 DF, p-value: < 2.22e-16
