How to get rows, by group, of data frame with earliest timestamp?
Question
df <- data.frame(group=c(1,2,4,2,1,4,2,3,3),
ts=c("2014-02-13","2014-06-01","2014-02-14","2014-02-11","2013-02-01","2014-02-02","2014-03-21","2014-12-01","2014-02-11"),
letter=letters[1:9])
df$ts <- as.Date(df$ts,format='%Y-%m-%d')
I want to find an operation that will produce the complete rows containing the minimum timestamp per group. In this case:
group ts letter
1 2013-02-01 e
4 2014-02-02 f
2 2014-02-11 d
3 2014-02-11 i
A quick and dirty (and slow) base R solution would be:
dfo <- data.frame(df[order(df$ts, decreasing = FALSE), ], index = seq_len(nrow(df)))
mins <- tapply(dfo$index, dfo$group, min)
dfo[dfo$index %in% mins, ]
Intuitively, I think that if there were a way to add an order index by group, I could just filter to the rows where that column's value is 1, but I'm not sure how to do that without lots of subsetting and rejoining.
Recommended answer
You can use dplyr:
library(dplyr)
group_by(df, group) %>% summarise(min = min(ts), letter = letter[which.min(ts)])
# group min letter
# 1 1 2013-02-01 e
# 2 2 2014-02-11 d
# 3 3 2014-02-11 i
# 4 4 2014-02-02 f
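Note that the summarise() call returns only the columns it names, so a wider data frame would need one expression per column. A possible variation, offered as a sketch rather than as part of the original answer, is to pick the whole row with slice(which.min(ts)). The name earliest is introduced here for illustration only.

```r
library(dplyr)

# Same example data as in the question.
df <- data.frame(
  group = c(1, 2, 4, 2, 1, 4, 2, 3, 3),
  ts = as.Date(c("2014-02-13", "2014-06-01", "2014-02-14", "2014-02-11",
                 "2013-02-01", "2014-02-02", "2014-03-21", "2014-12-01",
                 "2014-02-11")),
  letter = letters[1:9]
)

# Within each group, keep the single row whose ts is smallest.
# which.min() returns the first position on ties, so exactly one
# row per group survives, with all of its columns intact.
earliest <- df %>%
  group_by(group) %>%
  slice(which.min(ts)) %>%
  ungroup()

print(earliest)
```

If tied timestamps should all be kept, filter(ts == min(ts)) in place of the slice() call would be one way to do that.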
You can also slice the sorted rows:
group_by(df, group) %>%
mutate(rank = row_number(ts)) %>%
arrange(rank) %>%
slice(1)
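The "order index by group" idea from the question can also be sketched in base R without any joins. This is a minimal illustration, not part of the original answer: ave() computes a per-group rank and returns it aligned with the original rows, so filtering on rank 1 is a plain subset. The names idx and earliest are hypothetical.

```r
# Same example data as in the question.
df <- data.frame(
  group = c(1, 2, 4, 2, 1, 4, 2, 3, 3),
  ts = as.Date(c("2014-02-13", "2014-06-01", "2014-02-14", "2014-02-11",
                 "2013-02-01", "2014-02-02", "2014-03-21", "2014-12-01",
                 "2014-02-11")),
  letter = letters[1:9]
)

# Per-group order index: 1 marks the earliest timestamp in each group.
# ave() applies FUN to each group's values and puts the results back in
# the original row order; ties.method = "first" breaks ties by position,
# so every group has exactly one row with index 1.
df$idx <- ave(as.numeric(df$ts), df$group,
              FUN = function(x) rank(x, ties.method = "first"))

# Keep the rank-1 rows, drop the helper column, and sort by group.
earliest <- df[df$idx == 1, c("group", "ts", "letter")]
earliest <- earliest[order(earliest$group), ]

print(earliest)
```

Because the index is computed in place, the original row order of df is preserved until the final order() call, which is there only to make the printed result easier to read.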