How to use dplyr's summarize and which() to look up min/max values
Question
I have the following data:
Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,12,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data <- data.frame(Name, Age, Group)
I'd like to use dplyr to (1) group the data by "Group", (2) show the min and max Age within each Group, and (3) show the Name of the person with the min and max Age.
The following code does this:
library(dplyr)
data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
            maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
This works well:
  Group minAge minAgeName maxAge maxAgeName
1     A     22        Sam     22        Sam
2     B     12      Sarah     58      James
3     C     17     Andrew     82      Sally
4     D     12     Mairin     67        Ray
However, I run into a problem if a group has multiple min or max values:
Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew", "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age <- c(22,31,31,35,58,82,17,34,12,24,44,67,43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data <- data.frame(Name, Age, Group)
> data %>% group_by(Group) %>%
+ summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
+ maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
Error: expecting a single value
I'm looking for two solutions:
(1) one where it doesn't matter which min or max name is shown, as long as one is shown (i.e., the first value found);
(2) one where, if there are ties, all names with the minimum value and all names with the maximum value are shown.
Please let me know if this isn't clear, and thanks in advance!
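Recommended answer
You can use which.min and which.max to get the first value. Each returns a single index (the position of the first minimum or maximum), so the result always has length one, even when there are ties.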
data %>% group_by(Group) %>%
  summarize(minAge = min(Age), minAgeName = Name[which.min(Age)],
            maxAge = max(Age), maxAgeName = Name[which.max(Age)])
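To see why the original code fails on ties while which.min does not, here is a minimal base R sketch using only the group "B" rows from the second data set, where Sarah and Jim are both 31. The variable names (all_min_idx, first_min_idx, and so on) are made up for illustration.

```r
# Group "B" from the second data set: Sarah and Jim tie for the minimum age.
Name <- c("Sarah", "Jim", "Fred", "James")
Age  <- c(31, 31, 35, 58)

# which() returns every index that matches, so the subset has length 2.
all_min_idx   <- which(Age == min(Age))
all_min_names <- Name[all_min_idx]

# which.min() returns only the index of the first minimum, so the subset has length 1.
first_min_idx  <- which.min(Age)
first_min_name <- Name[first_min_idx]

print(all_min_names)   # "Sarah" "Jim"
print(first_min_name)  # "Sarah"
```

The length-2 result is what summarize rejects with "expecting a single value" in the question, since it expects one value per group for each summary column.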
To get all values, use e.g. paste with an appropriate collapse argument, which joins the tied names into a single string per group.
data %>% group_by(Group) %>%
  summarize(minAge = min(Age),
            minAgeName = paste(Name[which(Age == min(Age))], collapse = ", "),
            maxAge = max(Age),
            maxAgeName = paste(Name[which(Age == max(Age))], collapse = ", "))
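If one row per tied person is preferable to a comma-separated string, a different technique from the answer above is to filter within each group rather than summarize. This is only a sketch: it assumes dplyr is installed, covers just the minimum, and min_rows is a hypothetical name.

```r
library(dplyr)

# Second data set from the question (Sarah and Jim tie at 31 in group "B").
Name  <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew",
           "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
Age   <- c(22, 31, 31, 35, 58, 82, 17, 34, 12, 24, 44, 67, 43)
Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
data  <- data.frame(Name, Age, Group)

# Keep every row whose Age equals its group's minimum; ties yield several rows.
min_rows <- data %>%
  group_by(Group) %>%
  filter(Age == min(Age)) %>%
  ungroup()

print(min_rows)
```

Group "B" then contributes two rows (Sarah and Jim). The same pattern with max(Age) gives the rows for the maximum.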