Generate an ID for each group with repeated and missing observations
Problem description
I have a dataset with individuals observed over several weeks. Some individuals have no observations in some weeks, and some have several observations during the same week. I need to create a weekly ID (id_week in the code) that is individual-specific. If an individual has two or more observations in one week, id_week should be the same for all of them. If an individual has no observations in a given week, the ID for the next observed week should follow consecutively from the last observed one, so the IDs have no gaps. This would result in the following data:
dt<-data.frame(individ=c(1,1,1,2,2,2,3,3,3,3),week=c(1,2,2,1,2,4,1,3,4,4),id_week=c(1,2,2,1,2,3,1,2,3,3))
I have tried dt[, id := .GRP, by = .(individ, week)]
but it gives me just an ID for weeks and does not take individuals into account. I also tried a dplyr solution, but it does not account for repeated observations within one week: it assigns an ID to every row, which is not what I need.
library(dplyr)
dt %>%
  group_by(individ) %>%
  mutate(pp = row_number(week))
Recommended answer
An option using data.table:
setDT(dt)[, id_week := rleid(week), individ]
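To see why this works, here is a minimal sketch that rebuilds the sample data from the question and checks the result against the expected id_week values. rleid() starts a new ID each time week changes within an individual, so it relies on the rows already being sorted by week within each individual, as they are here. The second column, id_week_rank, is not part of the original answer: it is an assumed alternative using frank() with dense ranking, which should give the same IDs even if the rows are not sorted.

```r
library(data.table)

# Sample data from the question, without the expected id_week column
dt <- data.table(
  individ = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  week    = c(1, 2, 2, 1, 2, 4, 1, 3, 4, 4)
)

# Expected IDs, copied from the question
expected <- c(1, 2, 2, 1, 2, 3, 1, 2, 3, 3)

# rleid() increments whenever `week` changes from one row to the next,
# restarting at 1 for each individual (assumes rows sorted by week)
dt[, id_week := rleid(week), by = individ]

# Alternative (not from the original answer): dense rank of week
# within each individual, which does not depend on row order
dt[, id_week_rank := frank(week, ties.method = "dense"), by = individ]

# Both columns should match the expected IDs
stopifnot(all(dt$id_week == expected))
stopifnot(all(dt$id_week_rank == expected))
print(dt)
```

In dplyr, dense_rank(week) inside group_by(individ) plays the same role as the dense-rank variant, whereas row_number(week) breaks ties and therefore gives every row its own ID.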