Create a new column in a data frame: index within group (not unique between groups)

Question

I have a data frame with two columns: the first contains the group to which each individual belongs, and the second contains the individual's ID. See below:

df <- data.frame( group=c('G1','G1','G1','G1','G2','G2','G2','G2'), 
      indiv=c('indiv1','indiv1','indiv2','indiv2','indiv3',
              'indiv3','indiv4','indiv4'))

   group   indiv
1     G1  indiv1
2     G1  indiv1
3     G1  indiv2
4     G1  indiv2
5     G2  indiv3
6     G2  indiv3
7     G2  indiv4
8     G2  indiv4

I would like to create a new column in my data frame (keeping the long format) that holds the index of each individual within its group, that is:

   group   indiv  Ineed
1     G1  indiv1      1
2     G1  indiv1      1
3     G1  indiv2      2
4     G1  indiv2      2
5     G2  indiv3      1
6     G2  indiv3      1
7     G2  indiv4      2
8     G2  indiv4      2

I have tried the data.table .N and .GRP methods, without success (nice work on data.table, by the way!).

Any help is much appreciated!

Recommended answer

You could use the new rleid function here (available from the development version, v >= 1.9.5), applied to indiv within each group:

setDT(df)[, Ineed := rleid(indiv), group][]
#    group  indiv Ineed
# 1:    G1 indiv1     1
# 2:    G1 indiv1     1
# 3:    G1 indiv2     2
# 4:    G1 indiv2     2
# 5:    G2 indiv3     1
# 6:    G2 indiv3     1
# 7:    G2 indiv4     2
# 8:    G2 indiv4     2
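
One caveat worth illustrating: rleid numbers runs of consecutive equal values, so it matches the desired index only when each individual's rows sit next to each other within the group, as they do in the question's data. The sketch below uses a small hypothetical table, dt2, where that is not the case, and compares rleid with a match/unique alternative (not part of the original answer) that numbers individuals by first appearance.

```r
library(data.table)

# Hypothetical data (not from the question): indiv1's rows are NOT contiguous
dt2 <- data.table(group = c('G1', 'G1', 'G1'),
                  indiv = c('indiv1', 'indiv2', 'indiv1'))

# rleid() starts a new id at every change of value, so indiv1 gets two ids
dt2[, by_run := rleid(indiv), by = group]

# match() against unique() numbers by first appearance, whatever the row order
dt2[, by_first := match(indiv, unique(indiv)), by = group]

dt2$by_run    # 1 2 3
dt2$by_first  # 1 2 1
```

If the rows may be unsorted, either sort by group and indiv first or use an order-independent approach such as the second one.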

Or you could convert indiv to a factor within each group (so that each distinct individual gets its own level) and then convert it back to numeric (if you are using the CRAN stable version, v <= 1.9.4):

setDT(df)[, Ineed := as.numeric(factor(indiv)), group][]
#    group  indiv Ineed
# 1:    G1 indiv1     1
# 2:    G1 indiv1     1
# 3:    G1 indiv2     2
# 4:    G1 indiv2     2
# 5:    G2 indiv3     1
# 6:    G2 indiv3     1
# 7:    G2 indiv4     2
# 8:    G2 indiv4     2
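
For readers without data.table, the same per-group index can be sketched in base R with ave(). This is an illustrative alternative, not part of the original answer. Note that it numbers individuals by first appearance within the group, whereas factor() numbers them by sorted level order; the two agree on the question's data.

```r
# Same data as in the question
df <- data.frame(group = c('G1', 'G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2'),
                 indiv = c('indiv1', 'indiv1', 'indiv2', 'indiv2',
                           'indiv3', 'indiv3', 'indiv4', 'indiv4'))

# ave() applies FUN to the row positions of each group; passing positions
# (rather than the character column itself) keeps the result an integer
df$Ineed <- ave(seq_along(df$indiv), df$group,
                FUN = function(i) match(df$indiv[i], unique(df$indiv[i])))

df$Ineed  # 1 1 2 2 1 1 2 2
```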
