How to assign a unique ID number to each group of identical values in a column

Problem description

I have a data frame with a number of columns. I would like to create a new column called "id" that gives a unique id number to each group of identical values in the "sample" column.

Example data:

# dput(df)
df <- structure(list(index = 1:30, val = c(14L, 22L, 1L, 25L, 3L, 34L, 
35L, 36L, 24L, 35L, 33L, 31L, 30L, 30L, 29L, 28L, 26L, 12L, 41L, 
36L, 32L, 37L, 56L, 34L, 23L, 24L, 28L, 22L, 10L, 19L), sample = c(5L, 
6L, 6L, 7L, 7L, 7L, 8L, 9L, 10L, 11L, 11L, 12L, 13L, 14L, 14L, 
15L, 15L, 15L, 16L, 17L, 18L, 18L, 19L, 19L, 19L, 20L, 21L, 22L, 
23L, 23L)), .Names = c("index", "val", "sample"), class = "data.frame", 
row.names = c(NA, -30L))

head(df)
  index val sample 
1     1  14      5  
2     2  22      6  
3     3   1      6  
4     4  25      7  
5     5   3      7  
6     6  34      7  

What I would like to end up with:

  index val sample id
1     1  14      5  1
2     2  22      6  2
3     3   1      6  2
4     4  25      7  3
5     5   3      7  3
6     6  34      7  3

I am having trouble doing this, so any advice would be much appreciated.

Thanks!

Solution

How about

df2 <- transform(df, id = as.numeric(factor(sample)))

?
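
To see why this works, it may help to look at the intermediate steps on a small vector. The following is only a sketch on a made-up vector `x` (not part of the question's data): `factor()` collects the distinct values as sorted levels, and converting the factor to a number returns each element's position among those levels.

```r
# Hypothetical toy vector, used only for illustration
x <- c(5L, 6L, 6L, 7L, 7L, 7L)

f  <- factor(x)       # levels are the sorted distinct values
lv <- levels(f)       # levels are stored as character strings
id <- as.numeric(f)   # position of each element among the levels

print(lv)             # "5" "6" "7"
print(id)             # 1 2 2 3 3 3
```

Identical values share a level, so they end up with the same id.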

I think this (cribbed from "Creating a unique ID in R") should be slightly more efficient, although perhaps a little harder to remember:

df3 <- transform(df, id = match(sample, unique(sample)))
all.equal(df2, df3)  ## TRUE
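
One caveat worth checking: the two approaches agree here because `sample` is already sorted. On unsorted data they can number the groups differently, since `factor()` orders its levels by value while `match(x, unique(x))` numbers groups by first appearance. A minimal sketch on a hypothetical vector `s` (an assumption for illustration, not from the question):

```r
# Hypothetical unsorted vector, used only for illustration
s <- c(7L, 5L, 7L, 6L, 5L)

# Numbered by sorted value: 5 -> 1, 6 -> 2, 7 -> 3
id_factor <- as.integer(factor(s))

# Numbered by first appearance: 7 -> 1, 5 -> 2, 6 -> 3
id_match <- match(s, unique(s))

print(id_factor)  # 3 1 3 2 1
print(id_match)   # 1 2 1 3 2
```

Both results identify the same groups; only the labels differ, so the choice matters mainly if the id order is meant to carry meaning.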
