How to use dplyr's summarize and which() to look up min/max values


Question

I have the following data:

Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,12,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D") 
data <- data.frame(Name, Age, Group)

I'd like to use dplyr to:

(1) group the data by Group, (2) show the minimum and maximum Age within each Group, and (3) show the Name of the person with the minimum and maximum Age.

The following code does this:

library(dplyr)

data %>% group_by(Group) %>%
     summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))], 
               maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])

This works fine:

  Group minAge minAgeName maxAge maxAgeName
1     A     22        Sam     22        Sam
2     B     12      Sarah     58      James
3     C     17     Andrew     82      Sally
4     D     12     Mairin     67        Ray

However, I run into a problem if a group has more than one minimum or maximum value:

Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,31,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D") 
data <- data.frame(Name, Age, Group)

> data %>% group_by(Group) %>%
+   summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))], 
+             maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
Error: expecting a single value

I'm looking for two solutions:

(1) one where it doesn't matter which min or max name is shown, as long as one is shown (i.e., the first value found), and (2) one where, if there are ties, all names with the minimum and maximum values are shown.

Please let me know if this isn't clear, and thanks in advance!

Recommended answer

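You can use which.min and which.max to get the first value. Each returns a single index, the position of the first minimum or maximum, so the result always has length one even when there are ties.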

data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which.min(Age)], 
            maxAge = max(Age), maxAgeName = Name[which.max(Age)])
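To see why this avoids the error, it may help to compare the two indexing approaches on a single group. The snippet below is a minimal base R sketch using the Group B ages from the second data set; the lowercase vectors age and name are illustrative names, not part of the original code.

```r
# Ages and names for Group B in the second data set (two people tie at 31)
age  <- c(31, 31, 35, 58)
name <- c("Sarah", "Jim", "Fred", "James")

# which() on a comparison returns every matching position,
# so the result has length 2 here
all_min <- which(age == min(age))

# which.min() returns only the position of the first minimum,
# so the result always has length 1
first_min <- which.min(age)

name[all_min]    # both tied names
name[first_min]  # just the first one
```

Because name[first_min] is always a single value, it fits the one-value-per-group shape that summarize expects in the question's setup.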

To get all values, use e.g. paste with an appropriate collapse argument, which joins the tied names into a single string per group.

data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = paste(Name[which(Age == min(Age))], collapse = ", "), 
            maxAge = max(Age), maxAgeName = paste(Name[which(Age == max(Age))], collapse = ", "))
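If one row per tied person is preferable to a comma-separated string, a possible alternative (not part of the answer above, and shown only for the minimum) is to filter each group down to the rows that match its minimum. This is a sketch that assumes dplyr is installed; the name youngest is made up for illustration.

```r
library(dplyr)

# Second data set from the question (Sarah and Jim tie at 31 in Group B)
Name  <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew",
           "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age   <- c(22, 31, 31, 35, 58, 82, 17, 34, 12, 24, 44, 67, 43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data  <- data.frame(Name, Age, Group)

# Keep every row whose Age equals its group's minimum;
# a tie simply produces more than one row for that group
youngest <- data %>%
  group_by(Group) %>%
  filter(Age == min(Age)) %>%
  ungroup()

youngest
```

The same pattern with max(Age) gives the oldest person or people in each group. The trade-off is that the result is no longer one row per group, so it suits follow-up filtering or joins better than a compact summary table.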
