Aggregate with max and factors

Published: 2020/5/8 0:06:04. Tags: r, max, aggregate, min, factors


Question

I have a data.frame with factor columns on which I want to compute a max (or a min, or quantiles). These functions do not work on plain factors, but I would like to use them anyway.

Here is some example data:

set.seed(3)
df1 <- data.frame(id = rep(1:5,each=2),height=sample(c("low","medium","high"),size = 10,replace=TRUE))
df1$height <- factor(df1$height,c("low","medium","high"))
df1$height_num <- as.numeric(df1$height)
# > df1
#    id height height_num
# 1   1    low          1
# 2   1   high          3
# 3   2 medium          2
# 4   2    low          1
# 5   3 medium          2
# 6   3 medium          2
# 7   4    low          1
# 8   4    low          1
# 9   5 medium          2
# 10  5 medium          2

I can easily do this:

aggregate(height_num ~ id,df1,max)
#   id height_num
# 1  1          3
# 2  2          2
# 3  3          2
# 4  4          1
# 5  5          2

But not this:

aggregate(height ~ id,df1,max)
# Error in Summary.factor(c(2L, 2L), na.rm = FALSE) : 
#   ‘max’ not meaningful for factors

I want to take the largest "height" per id and keep the same levels in the aggregated table as in the original table. My real data has many columns, and I want my factors to stay sorted so that my plots remain clean and consistent.

I can do it this way, and reuse the same structure with other aggregating functions:

use_factors <- function(x,FUN){factor(levels(x)[FUN(as.numeric(x))],levels(x))}
aggregate(height ~ id,df1,use_factors,max)
#   id height
# 1  1   high
# 2  2 medium
# 3  3 medium
# 4  4    low
# 5  5 medium

Alternatively, I suppose I could overload the max, min, median and quantile functions, but I feel I would surely be reinventing the wheel.

Is there a simpler way?

Recommended answer

Actually, you can do the aggregation you want if you use an ordered factor.

set.seed(3)
df1 <- data.frame(id = rep(1:5,each=2),height=sample(c("low","medium","high"),size = 10,replace=TRUE))
df1$height <- factor(df1$height,c("low","medium","high"), ordered = TRUE)
df1$height_num <- as.numeric(df1$height)

aggregate(height ~ id, df1, max)
#   id height
# 1  1   high
# 2  2 medium
# 3  3 medium
# 4  4    low
# 5  5 medium
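
The question also asks about min and quantiles. As a minimal sketch (not part of the original answer), the same ordered-factor approach should cover those as well: min() is defined for ordered factors, and quantile() accepts them when type is 1 or 3. The data below are typed in from the table printed in the question rather than drawn with sample(), so the result does not depend on the R version's random number generator; the names agg_min and q_half are made up for this illustration.

```r
# Data copied from the table printed in the question (no random sampling).
df1 <- data.frame(
  id = rep(1:5, each = 2),
  height = c("low", "high", "medium", "low", "medium",
             "medium", "low", "low", "medium", "medium")
)

# Declare the level order explicitly and mark the factor as ordered.
df1$height <- factor(df1$height,
                     levels = c("low", "medium", "high"),
                     ordered = TRUE)

# min() works on ordered factors in the same way as max().
agg_min <- aggregate(height ~ id, df1, min)
as.character(agg_min$height)
# "low" "low" "medium" "low" "medium"

# quantile() handles ordered factors only with type = 1 or type = 3,
# because those types return an observed value instead of interpolating.
q_half <- quantile(df1$height, probs = 0.5, type = 1)
as.character(q_half)
# "medium"
```

median() may still refuse factors, so quantile(x, 0.5, type = 1) is used here as a stand-in for a median on ordered data.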
