Generate an ID for each group with repeated and missing observations

Question

I have a dataset with individuals observed over several weeks. Some individuals have no observations in some weeks, and some have several observations during the same week. I need to create a weekly ID (id_week in the code) that is individual-specific. If an individual has two or more observations in one week, id_week should be the same for all of them. If an individual has no observations in a given week, the ID for the next observed week should continue consecutively from the last observed one, leaving no gap. This would result in the following data:

dt <- data.frame(individ = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3), week = c(1, 2, 2, 1, 2, 4, 1, 3, 4, 4), id_week = c(1, 2, 2, 1, 2, 3, 1, 2, 3, 3))

I have tried dt[, id := .GRP, by = .(individ, week)], but that numbers the individual-week groups across the whole table rather than restarting the count for each individual. I also tried a dplyr solution, but it does not account for repeated observations within one week: it assigns a separate ID to every row, which is not what I need.

library(dplyr)

dt %>%
  group_by(individ) %>%
  mutate(pp = row_number(week))

Recommended answer

An option using data.table:

library(data.table)
setDT(dt)[, id_week := rleid(week), by = individ]
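To check the approach against the expected output from the question, here is a minimal sketch. It assumes the data.table package is installed; the names expected and id_week_rank are introduced only for this illustration. Because rleid() numbers consecutive runs of equal values, the sketch first sorts the rows by week within each individual. It also shows frank() with ties.method = "dense" as a possible alternative that does not depend on row order.

```r
library(data.table)

# Sample data from the question, without the hand-written id_week column
dt <- data.table(
  individ = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  week    = c(1, 2, 2, 1, 2, 4, 1, 3, 4, 4)
)

# Target values copied from the question
expected <- c(1, 2, 2, 1, 2, 3, 1, 2, 3, 3)

# rleid() gives each run of identical consecutive values the next integer,
# so rows need to be ordered by week within each individual first
setorder(dt, individ, week)
dt[, id_week := rleid(week), by = individ]

# Alternative that does not rely on row order: dense rank of week per individual
dt[, id_week_rank := frank(week, ties.method = "dense"), by = individ]

print(dt)
```

Both columns should match the id_week values given in the question. If the rows are not guaranteed to be sorted by week, the dense-rank variant may be the safer choice, since rleid() would start a new ID every time the week value changes between adjacent rows.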
