Assign an ID based on two columns in R
Question
I have some data that looks like this. I want to assign an ID by email and wk_id.
row_num email wk_id
1 aaaa 1/4/15
2 aaaa 1/11/15
3 aaaa 1/25/15
4 bbbb 6/29/14
5 bbbb 9/7/14
6 cccc 11/16/14
7 cccc 11/30/14
8 cccc 12/7/14
9 cccc 12/14/14
10 cccc 12/21/14
11 cccc 12/28/14
12 cccc 1/4/15
13 cccc 1/25/15
I want the data to look like this.
row_num email wk_id ID
1 aaaa 1/4/15 1
2 aaaa 1/11/15 2
3 aaaa 1/25/15 3
4 bbbb 6/29/14 1
5 bbbb 9/7/14 2
6 cccc 11/16/14 1
7 cccc 11/30/14 2
8 cccc 12/7/14 3
9 cccc 12/14/14 4
10 cccc 12/21/14 5
11 cccc 12/28/14 6
12 cccc 1/4/15 7
13 cccc 1/25/15 8
I can't figure out how to get the counter to reset every time it hits a new email address. I've tried data.table and ddply, but still can't quite get it.
Recommended answer
You could try:
library(dplyr)
df %>%
group_by(email) %>%
mutate(ID = row_number())
Which gives:
#Source: local data frame [13 x 4]
#Groups: email
#
# row_num email wk_id ID
#1 1 aaaa 1/4/15 1
#2 2 aaaa 1/11/15 2
#3 3 aaaa 1/25/15 3
#4 4 bbbb 6/29/14 1
#5 5 bbbb 9/7/14 2
#6 6 cccc 11/16/14 1
#7 7 cccc 11/30/14 2
#8 8 cccc 12/7/14 3
#9 9 cccc 12/14/14 4
#10 10 cccc 12/21/14 5
#11 11 cccc 12/28/14 6
#12 12 cccc 1/4/15 7
#13 13 cccc 1/25/15 8
Or using data.table:
library(data.table)
setDT(df)[, ID:= 1:.N, email]
Or ave from base R:
df$ID <- with(df, ave(row_num, email, FUN=seq_along))
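All three approaches above number the rows in the order they appear, so they rely on the data already being sorted by wk_id within each email. If that might not hold, a minimal base R sketch is to parse wk_id as a date and rank it within each email. This assumes wk_id is in month/day/two-digit-year form; the wk_date column and the shuffled data frame are names introduced here for illustration only.

```r
# Rebuild the sample data from the question (wk_id kept as character).
df <- data.frame(
  row_num = 1:13,
  email = rep(c("aaaa", "bbbb", "cccc"), times = c(3, 2, 8)),
  wk_id = c("1/4/15", "1/11/15", "1/25/15",
            "6/29/14", "9/7/14",
            "11/16/14", "11/30/14", "12/7/14", "12/14/14",
            "12/21/14", "12/28/14", "1/4/15", "1/25/15"),
  stringsAsFactors = FALSE
)

# Assumption: wk_id is month/day/two-digit-year, e.g. "1/4/15" is 2015-01-04.
df$wk_date <- as.Date(df$wk_id, format = "%m/%d/%y")

# Shuffle the rows to mimic input that is not sorted (hypothetical scenario).
set.seed(1)
shuffled <- df[sample(nrow(df)), ]

# Rank the dates within each email; ave() applies the function per group,
# so the counter restarts for every email. ties.method = "first" keeps the
# IDs unique even if one email has the same date twice.
shuffled$ID <- with(
  shuffled,
  ave(as.numeric(wk_date), email,
      FUN = function(d) rank(d, ties.method = "first"))
)

# Restore the original row order for display.
result <- shuffled[order(shuffled$row_num), ]
print(result[, c("row_num", "email", "wk_id", "ID")])
```

Because the ID comes from the parsed date rather than the row position, the result does not depend on how the rows happen to be ordered.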