How do I only keep the rows with the lowest and highest value in a certain column, by groups?


Question

In short, how do I go from this

structure(list(id = c(1, 2, 3, 4, 5, 6), user = c(1, 1, 1, 2, 
2, 2), value = c(1, 3, 5, 2, 5, 9)), .Names = c("id", "user", 
"value"), row.names = c(NA, -6L), class = "data.frame")

to this?

structure(list(id = c(1, 3, 4, 6), user = c(1, 1, 2, 2), value = c(1, 
5, 2, 9)), .Names = c("id", "user", "value"), row.names = c(NA, 
-4L), class = "data.frame")

That is, for each user, I need to keep only the two rows corresponding to the lowest and highest value.

I'd like a solution using dplyr, if possible. Otherwise, any solution is fine.

Recommended answer

We can use slice with which.min/which.max after grouping by 'user'. Each of these returns the position of the first minimum or maximum within the group, so exactly one row is kept for each.

library(dplyr)
# df1 is the input data frame shown in the question
df1 %>%
   group_by(user) %>%
   slice(c(which.min(value), which.max(value)))
#   id  user value
#  <dbl> <dbl> <dbl>
#1     1     1     1
#2     3     1     5
#3     4     2     2
#4     6     2     9
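
For a self-contained run, the snippet above can be sketched as follows. The name df1 comes from the answer; building it with data.frame() instead of the structure() call from the question is a readability choice made here, and the trailing ungroup() is an addition so that the result is a plain, ungrouped tibble.

```r
library(dplyr)

# Input data from the question, rebuilt with data.frame() for readability
df1 <- data.frame(
  id    = c(1, 2, 3, 4, 5, 6),
  user  = c(1, 1, 1, 2, 2, 2),
  value = c(1, 3, 5, 2, 5, 9)
)

# Keep, per user, the row with the smallest and the row with the largest value
result <- df1 %>%
  group_by(user) %>%
  slice(c(which.min(value), which.max(value))) %>%
  ungroup()  # added here: drop the grouping once the rows are selected

print(result)
```

The kept rows should be the ones with id 1, 3, 4 and 6, matching the target data frame in the question.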






Another option is arrange with slice. After grouping by 'user', sort by 'value' in ascending order, then slice the first and last row of each group.

df1 %>% 
     group_by(user) %>%
     arrange(value) %>% 
     slice(c(1, n()))
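
As a quick check, the arrange variant can be compared with the which.min/which.max variant on the question's data. This is a minimal sketch; the names by_position and by_sorting are hypothetical and used only for illustration. Note that arrange() ignores the grouping unless .by_group = TRUE is passed, but the rows still end up in ascending 'value' order within each user, which is all that slice(c(1, n())) needs.

```r
library(dplyr)

# Input data from the question
df1 <- data.frame(
  id    = c(1, 2, 3, 4, 5, 6),
  user  = c(1, 1, 1, 2, 2, 2),
  value = c(1, 3, 5, 2, 5, 9)
)

# Variant 1: pick the positions of the minimum and maximum directly
by_position <- df1 %>%
  group_by(user) %>%
  slice(c(which.min(value), which.max(value))) %>%
  ungroup()

# Variant 2: sort by value, then take the first and last row of each group
by_sorting <- df1 %>%
  group_by(user) %>%
  arrange(value) %>%
  slice(c(1, n())) %>%
  ungroup()

# Both variants keep the same rows for this data
print(identical(by_position$id, by_sorting$id))
```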






If there are ties for the min and/or max 'value' and you want to keep all of the min and max rows, use filter instead.

df1 %>%
     group_by(user) %>% 
     filter(value %in% c(min(value), max(value)))
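
The difference only shows up when a group contains ties. The sketch below uses a small made-up data frame, df_ties (hypothetical, not from the question), in which the minimum value occurs twice, and compares the slice and filter approaches.

```r
library(dplyr)

# Hypothetical data: one user, with the minimum value 1 appearing twice
df_ties <- data.frame(
  id    = c(1, 2, 3, 4),
  user  = c(1, 1, 1, 1),
  value = c(1, 1, 3, 5)
)

# slice + which.min/which.max keeps only the first minimum and first maximum
by_slice <- df_ties %>%
  group_by(user) %>%
  slice(c(which.min(value), which.max(value))) %>%
  ungroup()

# filter keeps every row whose value equals the group minimum or maximum
by_filter <- df_ties %>%
  group_by(user) %>%
  filter(value %in% c(min(value), max(value))) %>%
  ungroup()

print(by_slice$id)   # ids kept by slice
print(by_filter$id)  # ids kept by filter
```

Here slice keeps ids 1 and 4, while filter also keeps id 2, the second row tied for the minimum.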
