Generating a moving sum variable in R
I suspect this is a fairly simple question with several solutions, but I'm still something of a novice in R, and an exhaustive search didn't turn up answers that fit what I want to do.
I'm trying to create, for lack of a better term, "moving sums" for a variable in my data frame. These would be 3-year and 5-year sums, lagged one year. So a 5-year sum for an observation in 1986 would be the sum of the observations from 1981, 1982, 1983, 1984, and 1985. Here is an example of what I would like to do, where the sum variable is the sum of all x in the five years prior to the observation year.
country year x x5yrsum
A 1980 9 NA
A 1981 3 NA
A 1982 5 NA
A 1983 6 NA
A 1984 9 NA
A 1985 7 32
A 1986 9 30
A 1987 4 36
.....................
B 1990 0 NA
B 1991 4 NA
B 1992 2 NA
B 1993 6 NA
B 1994 3 NA
B 1995 7 15
B 1996 0 22
This is unbalanced panel data. I suspect ddply would be appropriate, but I don't know the exact code for it.
Any input would be appreciated.
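To make the example reproducible, the table above can be entered as a data frame. This is only a sketch: the name DF is an assumption chosen to match the code in the answer, and the elided rows between the two countries are left out.

```r
# Example data typed in from the table above; `DF` is an assumed name.
DF <- data.frame(
  country = rep(c("A", "B"), times = c(8, 7)),
  year    = c(1980:1987, 1990:1996),
  x       = c(9, 3, 5, 6, 9, 7, 9, 4, 0, 4, 2, 6, 3, 7, 0),
  # Desired result: sum of x over the five years before each observation.
  x5yrsum = c(rep(NA, 5), 32, 30, 36, rep(NA, 5), 15, 22)
)
```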
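You can use filter (the one in the stats package) inside ddply, or inside any other function implementing the "split-apply-combine" approach. The weights c(0, rep(1, 5)) give the current observation a weight of 0 and each of the five preceding rows a weight of 1, and sides = 1 restricts the filter to past values. Note that this counts rows, not calendar years, so it assumes each country's rows are sorted by year with no missing years: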
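library(plyr)
# Within each country, sum the five rows before the current one.
# stats::filter is spelled out because dplyr, if loaded, masks filter.
ddply(DF, .(country), transform,
      x5yrsum2 = as.numeric(stats::filter(x, c(0, rep(1, 5)), sides = 1)))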
# country year x x5yrsum x5yrsum2
# 1 A 1980 9 NA NA
# 2 A 1981 3 NA NA
# 3 A 1982 5 NA NA
# 4 A 1983 6 NA NA
# 5 A 1984 9 NA NA
# 6 A 1985 7 32 32
# 7 A 1986 9 30 30
# 8 A 1987 4 36 36
# 9 B 1990 0 NA NA
# 10 B 1991 4 NA NA
# 11 B 1992 2 NA NA
# 12 B 1993 6 NA NA
# 13 B 1994 3 NA NA
# 14 B 1995 7 15 15
# 15 B 1996 0 22 22
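The question also asks for 3-year sums and mentions that the panel is unbalanced. The sketch below is not part of the original answer: lagged_sum and lagged_sum_by_year are hypothetical helper names, and the toy data are made up. It wraps the filter call so the window length is a parameter, and contrasts it with a variant that matches on calendar years and returns NA unless all k prior years are present.

```r
# Row-based lagged sum over the k rows before each observation.
# Hypothetical helper; assumes rows are sorted by year with no missing years.
lagged_sum <- function(x, k) {
  as.numeric(stats::filter(x, c(0, rep(1, k)), sides = 1))
}

# Calendar-aware variant: sums x over years (year - k) to (year - 1)
# and returns NA unless all k of those years are observed.
lagged_sum_by_year <- function(year, x, k) {
  sapply(year, function(y) {
    prior <- x[year >= y - k & year <= y - 1]
    if (length(prior) == k) sum(prior) else NA_real_
  })
}

# Toy single-country panel (made-up values) with 1983 missing.
toy <- data.frame(year = c(1980, 1981, 1982, 1984, 1985, 1986, 1987),
                  x    = c(1, 2, 3, 4, 5, 6, 7))
toy$by_row  <- lagged_sum(toy$x, 3)
toy$by_year <- lagged_sum_by_year(toy$year, toy$x, 3)
toy
```

In this toy panel the row-based version gives 12 for 1986 because it reaches back to 1982, while the calendar-aware version gives NA for that year since 1983 is missing; both give 15 for 1987. To apply either helper per country, it can be passed to ddply as in the answer, or to base R's ave, for example ave(DF$x, DF$country, FUN = function(v) lagged_sum(v, 3)). As far as I know, stats::filter raises an error when the filter is longer than the series, so groups with fewer than k + 1 rows would need separate handling.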