Creating a bunch of lagged variables in data.table at once


Question

I am trying to create a bunch of lagged variables all at once in data.table. I want these lagged values to be by station and by landcover. I am having some difficulty. Here is my example data.table.

require(data.table)
r <- structure(list(
    station = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
        .Label = c("A", "B"), class = "factor"),
    landcover = structure(c(2L, 2L, 2L, 4L, 4L, 4L, 1L, 1L, 1L, 3L, 3L, 3L),
        .Label = c("foam", "Mixed Forest", "other2", "Sand"), class = "factor"),
    cv = c(0.273287412020818, 0.453346217936644, 0.235088531585817,
        0.703112865400233, 0.221907230708271, 0.278459655651048,
        0.376646346809308, 0.662970017835398, 0.296458678818467,
        0.390335320625924, 0.712476246695341, 0.535612484651002)),
    .Names = c("station", "landcover", "cv"), row.names = c(NA, -12L),
    class = c("data.table", "data.frame"))

# station    landcover        cv
# 1:       A Mixed Forest 0.2732874
# 2:       A Mixed Forest 0.4533462
# 3:       A Mixed Forest 0.2350885
# 4:       A         Sand 0.7031129
# 5:       A         Sand 0.2219072
# 6:       A         Sand 0.2784597
# 7:       B         foam 0.3766463
# 8:       B         foam 0.6629700
# 9:       B         foam 0.2964587
# 10:       B       other2 0.3903353
# 11:       B       other2 0.7124762
# 12:       B       other2 0.5356125

I want to create a bunch of lagged variables. At this point I am not concerned about the NA values that will result. How do I create a data.table that looks like the one below without writing so much code? The result needs to remain a data.table.

r[, cv.lag1 :=  c(rep(NA,1), head(cv, -1)),by=c("station","landcover")]
r[, cv.lag2 :=  c(rep(NA,2), head(cv, -2)),by=c("station","landcover")]
r[, cv.lag3 :=  c(rep(NA,3), head(cv, -3)),by=c("station","landcover")]
r[, cv.lag4 :=  c(rep(NA,4), head(cv, -4)),by=c("station","landcover")]
r[, cv.lag5 :=  c(rep(NA,5), head(cv, -5)),by=c("station","landcover")]
r[, cv.lag6 :=  c(rep(NA,6), head(cv, -6)),by=c("station","landcover")]
r[, cv.lag7 :=  c(rep(NA,7), head(cv, -7)),by=c("station","landcover")]
r[, cv.lag8 :=  c(rep(NA,8), head(cv, -8)),by=c("station","landcover")]
r[, cv.lag9 :=  c(rep(NA,9), head(cv, -9)),by=c("station","landcover")]
r[, cv.lag10 := c(rep(NA,10), head(cv, -10)),by=c("station","landcover")]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA
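One thing worth noting about the pattern above: every station/landcover group in the example has only three rows, so for lags of 4 or more `c(rep(NA, i), head(cv, -i))` builds a vector longer than the group. How that mismatch is handled may depend on the data.table version. A minimal base R sketch of a padded lag that always returns exactly `length(x)` values is shown here; the helper name `lag_pad` is made up for illustration and is not part of data.table.

```r
# Hypothetical helper (not a data.table function): lag x by i positions,
# padding with NA at the front and trimming to the original length.
lag_pad <- function(x, i) {
  head(c(rep(NA, i), x), length(x))
}

x <- c(0.27, 0.45, 0.24)   # a toy group of three values
lag_pad(x, 1)              # first value becomes NA, the rest move down one slot
lag_pad(x, 10)             # lag longer than the group: all NA, still length 3
```

Used inside `r[, ... , by = ...]`, a helper like this would keep each group's result the same length as the group regardless of the lag.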

Solution

Thanks to Arun for this elegant one-line solution.

r[, c(paste("cv.lag", 1:10, sep="")) := lapply(1:10, function(i) c(rep(NA, i), head(cv, -i))), by=list(station,landcover)]

    station    landcover        cv   cv.lag1   cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
 1:       A Mixed Forest 0.2732874        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 2:       A Mixed Forest 0.4533462 0.2732874        NA      NA      NA      NA      NA      NA      NA      NA       NA
 3:       A Mixed Forest 0.2350885 0.4533462 0.2732874      NA      NA      NA      NA      NA      NA      NA       NA
 4:       A         Sand 0.7031129        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 5:       A         Sand 0.2219072 0.7031129        NA      NA      NA      NA      NA      NA      NA      NA       NA
 6:       A         Sand 0.2784597 0.2219072 0.7031129      NA      NA      NA      NA      NA      NA      NA       NA
 7:       B         foam 0.3766463        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
 8:       B         foam 0.6629700 0.3766463        NA      NA      NA      NA      NA      NA      NA      NA       NA
 9:       B         foam 0.2964587 0.6629700 0.3766463      NA      NA      NA      NA      NA      NA      NA       NA
10:       B       other2 0.3903353        NA        NA      NA      NA      NA      NA      NA      NA      NA       NA
11:       B       other2 0.7124762 0.3903353        NA      NA      NA      NA      NA      NA      NA      NA       NA
12:       B       other2 0.5356125 0.7124762 0.3903353      NA      NA      NA      NA      NA      NA      NA       NA
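As an alternative sketch (not part of Arun's answer): later data.table releases include a `shift()` function, available since version 1.9.6 as far as I know, which accepts a vector of lags and returns one lagged vector per lag, each the same length as the group. The table `dt` below is a stand-in built with rounded values rather than the exact `r` from the question, and the names `lags` and `lag_cols` are my own.

```r
library(data.table)

# Stand-in for the question's table `r` (cv values rounded for brevity)
dt <- data.table(
  station   = rep(c("A", "B"), each = 6),
  landcover = rep(c("Mixed Forest", "Sand", "foam", "other2"), each = 3),
  cv        = c(0.2732874, 0.4533462, 0.2350885, 0.7031129, 0.2219072, 0.2784597,
                0.3766463, 0.6629700, 0.2964587, 0.3903353, 0.7124762, 0.5356125)
)

lags     <- 1:10                      # which lags to create
lag_cols <- paste0("cv.lag", lags)    # cv.lag1 ... cv.lag10

# shift() with a vector n returns a list with one lagged copy of cv per lag;
# by = keeps the lags from crossing station/landcover boundaries.
dt[, (lag_cols) := shift(cv, n = lags, type = "lag"), by = .(station, landcover)]
```

Because `shift()` pads with `NA` and keeps each result at the group's length, lags larger than the group size simply come out as all `NA`.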
