Create a new column in a data frame: index within group (not unique between groups)
Problem description
I have a data frame with two columns: the first contains the group to which each individual belongs, and the second contains the individual's ID. See below:
df <- data.frame( group=c('G1','G1','G1','G1','G2','G2','G2','G2'),
indiv=c('indiv1','indiv1','indiv2','indiv2','indiv3',
'indiv3','indiv4','indiv4'))
group indiv
1 G1 indiv1
2 G1 indiv1
3 G1 indiv2
4 G1 indiv2
5 G2 indiv3
6 G2 indiv3
7 G2 indiv4
8 G2 indiv4
I would like to create a new column in my data frame (retaining the long format) holding the index of each individual within its group, that is:
group indiv Ineed
1 G1 indiv1 1
2 G1 indiv1 1
3 G1 indiv2 2
4 G1 indiv2 2
5 G2 indiv3 1
6 G2 indiv3 1
7 G2 indiv4 2
8 G2 indiv4 2
I have tried the data.table .N and .GRP methods, without success (nice work on data.table, by the way!).
Any help is much appreciated!
Recommended answer
You could use the new rleid function here (from the development version, v >= 1.9.5):
setDT(df)[, Ineed := rleid(indiv), group][]
# group indiv Ineed
# 1: G1 indiv1 1
# 2: G1 indiv1 1
# 3: G1 indiv2 2
# 4: G1 indiv2 2
# 5: G2 indiv3 1
# 6: G2 indiv3 1
# 7: G2 indiv4 2
# 8: G2 indiv4 2
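One caveat worth illustrating: rleid() numbers runs of consecutive equal values, so the result above relies on the rows of each individual being adjacent within the group. The sketch below uses a hypothetical reordering of the question's data (the table name dt and the column run_id are made up for illustration) and shows match() against unique() as one possible alternative when rows may be interleaved.

```r
library(data.table)

# Same individuals as in the question, but with rows interleaved
# (hypothetical ordering, for illustration only).
dt <- data.table(
  group = c("G1", "G1", "G1", "G1", "G2", "G2", "G2", "G2"),
  indiv = c("indiv1", "indiv2", "indiv1", "indiv2",
            "indiv3", "indiv4", "indiv3", "indiv4")
)

# rleid() starts a new id at every change of value, so interleaved rows
# of the same individual receive different ids.
dt[, run_id := rleid(indiv), by = group]

# match() against unique() numbers individuals by first appearance
# within the group, regardless of row order.
dt[, Ineed := match(indiv, unique(indiv)), by = group]

print(dt)
```

If the data is already sorted by individual within each group, as in the question, both columns come out the same.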
Or you could convert to a factor (in order to create a unique code for each individual) and then convert it to numeric (if you are using the CRAN stable version, v <= 1.9.4):
setDT(df)[, Ineed := as.numeric(factor(indiv)), group][]
# group indiv Ineed
# 1: G1 indiv1 1
# 2: G1 indiv1 1
# 3: G1 indiv2 2
# 4: G1 indiv2 2
# 5: G2 indiv3 1
# 6: G2 indiv3 1
# 7: G2 indiv4 2
# 8: G2 indiv4 2
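Note that factor() sorts its levels, so the factor approach numbers individuals in sorted order of their IDs rather than in order of appearance. For the question's data the two coincide. The base R sketch below uses made-up IDs ("zoe", "adam") and hypothetical names (df2, by_level, by_appearance) to show the difference, with ave() as one possible way to get order-of-appearance indices without data.table.

```r
# Hypothetical data: within G1, "zoe" appears before "adam".
df2 <- data.frame(
  group = c("G1", "G1", "G1", "G1"),
  indiv = c("zoe", "zoe", "adam", "adam"),
  stringsAsFactors = FALSE
)

# factor() sorts its levels, so "adam" becomes 1 and "zoe" becomes 2.
by_level <- as.integer(factor(df2$indiv))

# ave() applies the function within each group. Working on row positions
# keeps the result an integer vector instead of coercing it to character.
by_appearance <- ave(
  seq_along(df2$indiv), df2$group,
  FUN = function(i) match(df2$indiv[i], unique(df2$indiv[i]))
)

print(cbind(df2, by_level, by_appearance))
```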