R - How to add rows for missing values for unique group sequences?
Problem description
My problem is similar to this previous question: "Fastest way to add rows for missing values in a data.frame?"
I can't figure out how to add rows padded with NA when the min/max period differs by group.
> red <- data.frame(project = c(6, 6, 6, 6, 6, 9, 9, 9), period = c(1, 2, 5:7, 2, 4, 5), v3 = letters[1:8], v4 = rep(c("red", "yellow"), 4))
> red
  project period v3     v4
1       6      1  a    red
2       6      2  b yellow
3       6      5  c    red
4       6      6  d yellow
5       6      7  e    red
6       9      2  f yellow
7       9      4  g    red
8       9      5  h yellow
I would like it to look like:
project period   v3     v4
      6      1    a    red
      6      2    b yellow
      6      3   NA     NA
      6      4   NA     NA
      6      5    c    red
      6      6    d yellow
      6      7    e    red
      9      2    f yellow
      9      3   NA     NA
      9      4    g    red
      9      5    h yellow
When I use
library(data.table)
DT=as.data.table(red)
setkey(DT, project, period)
DT[CJ(unique(project), seq(min(period), max(period)))]
every project group ends up with 7 periods, because the sequence runs over the overall min and max. Project 6 should have periods 1-7, but project 9 should only have periods 2-5.
I've tried fiddling with .SD[ which.max(period)], by=project]
but no cigar.
I thought it would be something simple inside seq(), but I tried seq(min(period, by=project)) with no luck.
Thanks!
Recommended answer
DT[setkey(DT[, .(min(period):max(period)), by = project], project, V1)]
#     project period   v3     v4
#  1:       6      1    a    red
#  2:       6      2    b yellow
#  3:       6      3   NA     NA
#  4:       6      4   NA     NA
#  5:       6      5    c    red
#  6:       6      6    d yellow
#  7:       6      7    e    red
#  8:       9      2    f yellow
#  9:       9      3   NA     NA
# 10:       9      4    g    red
# 11:       9      5    h yellow
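The one-liner above does two things at once: it builds the full period range per project, then joins the original rows onto it, relying on the key set by setkey(DT, project, period) in the question. As a rough sketch, the same idea can be split into two steps using an explicit on= join, which should not need a key on DT (this assumes a reasonably recent data.table that supports on=). The names grid and filled are illustrative only and do not come from the original answer.

```r
library(data.table)

# Same sample data as in the question
red <- data.frame(
  project = c(6, 6, 6, 6, 6, 9, 9, 9),
  period  = c(1, 2, 5:7, 2, 4, 5),
  v3      = letters[1:8],
  v4      = rep(c("red", "yellow"), 4)
)
DT <- as.data.table(red)

# Step 1: build the complete period sequence separately for each project
# ("grid" is a hypothetical helper name)
grid <- DT[, .(period = seq(min(period), max(period))), by = project]

# Step 2: look up each (project, period) pair of the grid in DT;
# pairs with no match in DT keep their row, with NA in v3 and v4
filled <- DT[grid, on = .(project, period)]
print(filled)
```

On the sample data this should give the same 11 rows as the output above: periods 1-7 for project 6 and periods 2-5 for project 9. Naming the generated column period also avoids depending on the auto-generated V1 column name.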