Minimum (or maximum) value of each row across multiple columns

Question

I am looking for a way to get the minimum (or maximum) value across several columns for each row. For example:

# my data.frame is df:

library(tibble)
df <- tribble(
~name, ~type_1, ~type_2, ~type_3,
"a",   1,   5, 2,
"b",   8,   2, 6,
"c",   3,   8, 2
)

# and output should be result_df:

result_df <- tribble(
~name, ~type_1, ~type_2, ~type_3, ~min_val, ~min_col,
"a",   1,          5,     2,          1, "type_1",
"b",   8,          2,     6,          2, "type_2",
"c",   3,          8,     2,          2, "type_3"
)

I tried rowwise and the pmax function, but it did not work. I can use gather and grouping, but I would like to know whether there is a column-wise/row-wise solution.

This approach would also be useful for functions such as mean and median.

Thanks for your help.

Recommended answer

A fairly generalizable approach is to temporarily reshape the data to long form, which reduces the calculation to an ordinary grouped mutate.

library(tidyr)
library(dplyr)

df <- tribble(
    ~name, ~type_1, ~type_2, ~type_3,
    "a",   1,   5, 2,
    "b",   8,   2, 6,
    "c",   3,   8, 2
)

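df %>% 
    # reshape to long form: one row per (name, type) pair
    gather(type, type_val, contains('type')) %>% 
    group_by(name) %>% 
    # per-name minimum and the column it came from (assumes no ties within a row)
    mutate(min_val = min(type_val), 
           min_col = type[type_val == min_val]) %>% 
    # reshape back to wide form
    spread(type, type_val)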
#> # A tibble: 3 x 6
#> # Groups:   name [3]
#>   name  min_val min_col type_1 type_2 type_3
#>   <chr>   <dbl> <chr>    <dbl>  <dbl>  <dbl>
#> 1 a           1 type_1       1      5      2
#> 2 b           2 type_2       8      2      6
#> 3 c           2 type_3       3      8      2

In practice, it may be preferable to leave the data in long form by dropping the spread call.
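
As a rough sketch of that long-form variant: in tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider, so the same grouped mutate could be written as below. The name long_df is only illustrative, and swapping in which.min (so that a tie resolves to the first column) is an assumption about the desired tie behaviour, not part of the original answer.

```r
library(tibble)
library(dplyr)
library(tidyr)

df <- tribble(
    ~name, ~type_1, ~type_2, ~type_3,
    "a",   1,   5, 2,
    "b",   8,   2, 6,
    "c",   3,   8, 2
)

long_df <- df %>%
    # one row per (name, type) pair; requires tidyr >= 1.0.0
    pivot_longer(starts_with("type"), names_to = "type", values_to = "type_val") %>%
    group_by(name) %>%
    # which.min() returns a single index, so a tie resolves to the first column
    mutate(min_val = min(type_val),
           min_col = type[which.min(type_val)]) %>%
    ungroup()
```

If the wide layout is needed again, something like distinct(long_df, name, min_val, min_col) joined back onto df should recover it without a second reshape.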

Notes:

  • If more than one value in a row can equal the min (or max, or median, or whatever), type_val == min_val will have more than one TRUE value, so the result has to be reduced further to a single value, e.g. the way which.min returns the first minimum.
  • At scale, reshaping may be expensive, so more convoluted but optimized approaches (e.g. leveraging max.col) may be preferable.
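
One possible shape for such a max.col-based approach, as a minimal base R sketch rather than a definitive implementation: pmin gives the row-wise minimum without reshaping, and max.col applied to the negated values gives the position of that minimum. The helper names type_cols and vals are illustrative, and the "^type" pattern assumes the column naming used in the example data.

```r
df <- data.frame(
    name   = c("a", "b", "c"),
    type_1 = c(1, 8, 3),
    type_2 = c(5, 2, 8),
    type_3 = c(2, 6, 2)
)

# columns to compare (assumes they all start with "type")
type_cols <- grep("^type", names(df), value = TRUE)
vals <- as.matrix(df[type_cols])

# pmin() is vectorised across its arguments: one minimum per row
df$min_val <- do.call(pmin, df[type_cols])

# max.col() returns the column index of each row's maximum; negating the
# matrix turns that into the minimum. ties.method = "first" picks the
# first column on a tie, like which.min()
df$min_col <- type_cols[max.col(-vals, ties.method = "first")]
```

For the maximum, pmax and max.col(vals, ties.method = "first") play the same roles; for the mean or median mentioned in the question, rowMeans(vals) or apply(vals, 1, median) work on the same matrix.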
