Add ordered ID for each group by date

Question

I want to add an ordered ID (by date) to each group in a data frame. I can do this with dplyr (see "R - add column that counts sequentially within groups but repeats for duplicates"):

# Example data
date <- rep(c("2016-10-06 11:56:00","2016-10-05 11:56:00","2016-10-05 11:56:00","2016-10-07 11:56:00"),2)
date <- as.POSIXct(date)
group <- c(rep("A",4), rep("B",4))    
df <- data.frame(group, date)

# dplyr - dense_rank
library(dplyr)
df2 <- df %>% group_by(group) %>% 
       mutate(m.test = dense_rank(date))

   group                date m.test
  <fctr>              <dttm>  <int>
1      A 2016-10-06 11:56:00      2
2      A 2016-10-05 11:56:00      1
3      A 2016-10-05 11:56:00      1
4      A 2016-10-07 11:56:00      3
5      B 2016-10-06 11:56:00      2
6      B 2016-10-05 11:56:00      1
7      B 2016-10-05 11:56:00      1
8      B 2016-10-07 11:56:00      3

So my new column m.test ranks the dates within each group. If I use rleid from data.table instead, it doesn't seem to work (05/10 is ranked after 06/10):

library(data.table)
df3 <- setDT(df)[, m.test := rleid(date), by = group]

   group                date m.test
1:     A 2016-10-06 11:56:00      1
2:     A 2016-10-05 11:56:00      2
3:     A 2016-10-05 11:56:00      2
4:     A 2016-10-07 11:56:00      3
5:     B 2016-10-06 11:56:00      1
6:     B 2016-10-05 11:56:00      2
7:     B 2016-10-05 11:56:00      2
8:     B 2016-10-07 11:56:00      3

Have I got the syntax wrong?

Recommended answer

The syntax is fine, but rleid is the wrong tool here: it numbers runs of consecutive identical values in the order the rows appear, so it does not rank the dates. Thanks to @docendo discimus, the correct way to do this with data.table is frank(..., ties.method = "dense"):

df4 <- setDT(df)[, m.test := frank(date, ties.method = "dense"), by = group]
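
To see the difference side by side, here is a minimal sketch that rebuilds the question's example data and adds both columns to one table. The names dt, run_id and dense_id are made up for illustration, and tz = "UTC" is an assumption added only to make the example reproducible.

```r
library(data.table)

# Same example data as in the question (tz fixed for reproducibility)
date <- rep(c("2016-10-06 11:56:00", "2016-10-05 11:56:00",
              "2016-10-05 11:56:00", "2016-10-07 11:56:00"), 2)
dt <- data.table(group = rep(c("A", "B"), each = 4),
                 date  = as.POSIXct(date, tz = "UTC"))

# rleid() starts a new id whenever the value changes from the previous row,
# so the result depends on row order, not on the size of the value
dt[, run_id := rleid(date), by = group]      # per group: 1 2 2 3

# frank(ties.method = "dense") ranks by value; ties share a rank and
# no ranks are skipped, which mirrors dplyr::dense_rank()
dt[, dense_id := frank(date, ties.method = "dense"), by = group]  # per group: 2 1 1 3

print(dt)
```

If the rows were first sorted by group and date (for example with setorder), rleid would number the dates in order as well, but that changes the row order of the table, which frank avoids.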
