Creating a bunch of lagged variables in data.table at once
Problem description
I am trying to create a bunch of lagged variables all at once in data.table. I want these lagged values to be by station and by landcover. I am having some difficulty. Here is my example data.table.
require(data.table)
r <- structure(list(station = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"),
landcover = structure(c(2L, 2L, 2L, 4L, 4L, 4L, 1L, 1L, 1L,
3L, 3L, 3L), .Label = c("foam", "Mixed Forest", "other2",
"Sand"), class = "factor"), cv = c(0.273287412020818, 0.453346217936644,
0.235088531585817, 0.703112865400233, 0.221907230708271,
0.278459655651048, 0.376646346809308, 0.662970017835398,
0.296458678818467, 0.390335320625924, 0.712476246695341,
0.535612484651002)), .Names = c("station", "landcover", "cv"
), row.names = c(NA, -12L), class = c("data.table", "data.frame"
))
# station landcover cv
# 1: A Mixed Forest 0.2732874
# 2: A Mixed Forest 0.4533462
# 3: A Mixed Forest 0.2350885
# 4: A Sand 0.7031129
# 5: A Sand 0.2219072
# 6: A Sand 0.2784597
# 7: B foam 0.3766463
# 8: B foam 0.6629700
# 9: B foam 0.2964587
# 10: B other2 0.3903353
# 11: B other2 0.7124762
# 12: B other2 0.5356125
I want to create a bunch of lagged variables. I am not even concerned about the NA values that will result at this point. How do I create a data.table that looks like the one below without writing so much code? I need this to still be in data.table.
r[, cv.lag1 := c(rep(NA,1), head(cv, -1)),by=c("station","landcover")]
r[, cv.lag2 := c(rep(NA,2), head(cv, -2)),by=c("station","landcover")]
r[, cv.lag3 := c(rep(NA,3), head(cv, -3)),by=c("station","landcover")]
r[, cv.lag4 := c(rep(NA,4), head(cv, -4)),by=c("station","landcover")]
r[, cv.lag5 := c(rep(NA,5), head(cv, -5)),by=c("station","landcover")]
r[, cv.lag6 := c(rep(NA,6), head(cv, -6)),by=c("station","landcover")]
r[, cv.lag7 := c(rep(NA,7), head(cv, -7)),by=c("station","landcover")]
r[, cv.lag8 := c(rep(NA,8), head(cv, -8)),by=c("station","landcover")]
r[, cv.lag9 := c(rep(NA,9), head(cv, -9)),by=c("station","landcover")]
r[, cv.lag10 := c(rep(NA,10), head(cv, -10)),by=c("station","landcover")]
station landcover cv cv.lag1 cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
1: A Mixed Forest 0.2732874 NA NA NA NA NA NA NA NA NA NA
2: A Mixed Forest 0.4533462 0.2732874 NA NA NA NA NA NA NA NA NA
3: A Mixed Forest 0.2350885 0.4533462 0.2732874 NA NA NA NA NA NA NA NA
4: A Sand 0.7031129 NA NA NA NA NA NA NA NA NA NA
5: A Sand 0.2219072 0.7031129 NA NA NA NA NA NA NA NA NA
6: A Sand 0.2784597 0.2219072 0.7031129 NA NA NA NA NA NA NA NA
7: B foam 0.3766463 NA NA NA NA NA NA NA NA NA NA
8: B foam 0.6629700 0.3766463 NA NA NA NA NA NA NA NA NA
9: B foam 0.2964587 0.6629700 0.3766463 NA NA NA NA NA NA NA NA
10: B other2 0.3903353 NA NA NA NA NA NA NA NA NA NA
11: B other2 0.7124762 0.3903353 NA NA NA NA NA NA NA NA NA
12: B other2 0.5356125 0.7124762 0.3903353 NA NA NA NA NA NA NA NA
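A side note on the padding expression used in the lines above: c(rep(NA, i), head(cv, -i)) returns i values whenever i is at least the group size, which is more than the three rows each group has here. Depending on the data.table version, that length mismatch may be truncated or rejected. Below is a minimal base R sketch of a length-preserving variant; lag_pad is a hypothetical helper name introduced only for illustration, not a data.table function.

```r
# Hypothetical helper (not part of data.table): lag a vector by k positions
# while always returning a vector of the same length as the input.
lag_pad <- function(x, k) {
  n <- length(x)
  # Pad with at most n NAs so the result never outgrows the group;
  # head(x, -k) is empty once k >= n.
  c(rep(NA, min(k, n)), head(x, -k))
}

x <- c(0.2732874, 0.4533462, 0.2350885)
lag_pad(x, 1)  # first value becomes NA, the rest shift down by one
lag_pad(x, 5)  # lag longer than the vector: three NAs, not five
```

Inside a grouped call, lag_pad(cv, i) could stand in for c(rep(NA, i), head(cv, -i)) so that every group gets back exactly as many values as it has rows.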
Thanks to Arun for providing an elegant one-line solution.
r[, c(paste("cv.lag", 1:10, sep="")) := lapply(1:10, function(i) c(rep(NA, i), head(cv, -i))), by=list(station,landcover)]
station landcover cv cv.lag1 cv.lag2 cv.lag3 cv.lag4 cv.lag5 cv.lag6 cv.lag7 cv.lag8 cv.lag9 cv.lag10
1: A Mixed Forest 0.2732874 NA NA NA NA NA NA NA NA NA NA
2: A Mixed Forest 0.4533462 0.2732874 NA NA NA NA NA NA NA NA NA
3: A Mixed Forest 0.2350885 0.4533462 0.2732874 NA NA NA NA NA NA NA NA
4: A Sand 0.7031129 NA NA NA NA NA NA NA NA NA NA
5: A Sand 0.2219072 0.7031129 NA NA NA NA NA NA NA NA NA
6: A Sand 0.2784597 0.2219072 0.7031129 NA NA NA NA NA NA NA NA
7: B foam 0.3766463 NA NA NA NA NA NA NA NA NA NA
8: B foam 0.6629700 0.3766463 NA NA NA NA NA NA NA NA NA
9: B foam 0.2964587 0.6629700 0.3766463 NA NA NA NA NA NA NA NA
10: B other2 0.3903353 NA NA NA NA NA NA NA NA NA NA
11: B other2 0.7124762 0.3903353 NA NA NA NA NA NA NA NA NA
12: B other2 0.5356125 0.7124762 0.3903353 NA NA NA NA NA NA NA NA
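For readers on a newer data.table, the same table can probably be built with shift(), which was added to the package after this question was asked (version 1.9.6, as far as I know). This is a sketch under that assumption, not the method from the accepted answer. The example data is rebuilt with data.table() instead of structure() purely for brevity, and lag_cols is a name introduced here.

```r
library(data.table)

# Same example data as in the question, built with the data.table() constructor.
r <- data.table(
  station   = rep(c("A", "B"), each = 6),
  landcover = rep(c("Mixed Forest", "Sand", "foam", "other2"), each = 3),
  cv = c(0.273287412020818, 0.453346217936644, 0.235088531585817,
         0.703112865400233, 0.221907230708271, 0.278459655651048,
         0.376646346809308, 0.662970017835398, 0.296458678818467,
         0.390335320625924, 0.712476246695341, 0.535612484651002)
)

# Names of the new columns (lag_cols is an illustrative name).
lag_cols <- paste0("cv.lag", 1:10)

# shift() with a vector of lags returns a list with one lagged vector per lag,
# each padded with NA to the group's length, so it can be assigned in one step.
r[, (lag_cols) := shift(cv, 1:10), by = .(station, landcover)]

print(r)
```

Because shift() pads each result to the length of its group, lags longer than a group simply come back as all NA, which matches the output shown above.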