Time to event for panel data
Problem description
I have a panel data set of country years. I would like to calculate time since event, as well as get a running total of events per country which I can decay over time. I am using the timeSinceEvent function in the doBy package, which returns a data frame that has the values that I want, but I am having trouble applying this to my main df.
data <- structure(list(ccode.a = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L,
20L, 20L, 20L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L,
31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L,
31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 31L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L, 40L,
40L, 40L, 40L, 40L, 40L, 40L, 41L, 41L, 41L, 41L, 41L, 41L, 41L,
41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L,
41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L,
41L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L,
42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L, 42L,
42L, 42L, 42L, 42L, 42L), year = c(1975, 1976, 1977, 1978, 1979,
1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990,
1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
2002, 2003, 2004, 2005, 2006, 2007, 2008, 1975, 1976, 1977, 1978,
1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989,
1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 1975, 1976, 1977,
1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988,
1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 1975, 1976,
1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987,
1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 1975,
1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986,
1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997,
1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,
1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985,
1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996,
1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004), onset.a = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), .Names = c("ccode.a", "year",
"onset.a"), row.names = c(NA, 200L), class = "data.frame")
I tried using this function:
last.step <- function(x) {
temp <- timeSinceEvent(x$onset.a, x$year)
cbind(x[,1],temp) #timeSinceEvent cuts off the country ID
}
result <- do.call("rbind", by(data, data$ccode.a, last.step))
as well as
test <- by(data, data$ccode.a, function(x) timeSinceEvent(data$onset.a, data$year))
To little avail. I stepped through the function, and it seems to be doing what I want, but I guess there is a problem in the way that I am calling it?
Recommended answer
I ended up having to modify timeSinceEvent in the doBy package a bit. Here is the final code that worked. Kudos to lselzer for pointing out rbind.fill in plyr, and to RoyalTS for pointing out that timeSinceEvent returns NULL when the yvar argument is all zeros.
# Modified version of doBy's timeSinceEvent that still returns a data frame
# when yvar contains no events
panel.tse <- function(yvar, tvar = seq_along(yvar)) {
  if (!(is.numeric(yvar) || is.logical(yvar))) {
    stop("yvar must be either numeric or logical")
  }
  yvar[is.na(yvar)] <- 0
  # Running count of events; rows sharing a count belong to the same spell
  run <- cumsum(yvar)
  un <- unique(run)
  tlist <- list()
  for (i in seq_along(un)) {
    v <- un[[i]]
    t <- tvar[run == v]
    # Time elapsed since the first row of this spell
    tlist[[i]] <- t - t[1]
  }
  timeAfterEvent <- unlist(tlist)
  # Rows before the first event have no preceding event
  timeAfterEvent[run == 0] <- NA
  run[run == 0] <- NA
  ans <- cbind(data.frame(yvar = yvar, tvar = tvar), run, tae = timeAfterEvent)
  return(ans)
}
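library(plyr)  # provides rbind.fill

last.step <- function(x) {
  temp <- panel.tse(x$onset.a, x$year)
  # panel.tse does not carry the country ID, so add it back as a named column
  cbind(ccode.a = x$ccode.a, temp)
}
result <- do.call(rbind.fill, by(data, data$ccode.a, last.step))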
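For comparison, the same two columns (run and tae) can probably also be sketched in base R with ave(), without doBy or plyr. This is only an illustration under assumptions that are not in the original post: the toy data frame, its column names (id, year, event) and the decay factor of 0.5 are made up, and rows are assumed to be sorted by year within each unit, with one row per consecutive year.

```r
# Toy panel (hypothetical values): two units observed over five years
toy <- data.frame(
  id    = rep(c("A", "B"), each = 5),
  year  = rep(2001:2005, times = 2),
  event = c(0, 1, 0, 0, 1,  0, 0, 0, 0, 0)
)

# Running count of events within each unit
toy$run <- ave(toy$event, toy$id, FUN = cumsum)

# Years since the most recent event: within each (unit, run) spell,
# subtract the year of the spell's first row
toy$tae <- ave(toy$year, toy$id, toy$run, FUN = function(y) y - y[1])

# Decayed running total (assumed decay factor 0.5 per year):
# last year's total is halved, then this year's event is added
decay <- 0.5
toy$decayed <- ave(toy$event, toy$id, FUN = function(e) {
  Reduce(function(acc, x) acc * decay + x, e, accumulate = TRUE)
})

# Rows before a unit's first event have no preceding event
toy$tae[toy$run == 0] <- NA
toy$run[toy$run == 0] <- NA

print(toy)
```

On the toy data, unit A gets run = NA, 1, 1, 1, 2 and tae = NA, 0, 1, 2, 0, while unit B, which has no events, is NA throughout. Grouping by both the unit and the running count is what resets the clock at each event.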