Minimum (or maximum) value of each row across multiple columns
Question
I am looking for a way to get the min (or max) value of each row across several columns. For example:
# my data.frame is df:
library(tibble)
df <- tribble(
~name, ~type_1, ~type_2, ~type_3,
"a", 1, 5, 2,
"b", 8, 2, 6,
"c", 3, 8, 2
)
# and output should be result_df:
result_df <- tribble(
~name, ~type_1, ~type_2, ~type_3, ~min_val, ~min_col,
"a", 1, 5, 2, 1, "type_1",
"b", 8, 2, 6, 2, "type_2",
"c", 3, 8, 2, 2, "type_3"
)
I tried the rowwise and pmax functions, but that did not work. I can use gather and grouping, but I want to know whether there is a column-wise/row-wise solution.
Such an approach would also be useful for functions like mean and median.
Thanks for your help.
Recommended answer
A fairly generalizable approach is to temporarily reshape the data to long form, which reduces the calculation to an ordinary grouped mutate.
library(tidyr)
library(dplyr)
df <- tribble(
~name, ~type_1, ~type_2, ~type_3,
"a", 1, 5, 2,
"b", 8, 2, 6,
"c", 3, 8, 2
)
df %>%
  gather(type, type_val, contains('type')) %>%
  group_by(name) %>%
  mutate(min_val = min(type_val),
         min_col = type[type_val == min_val]) %>%
  spread(type, type_val)
#> # A tibble: 3 x 6
#> # Groups:   name [3]
#>   name  min_val min_col type_1 type_2 type_3
#>   <chr>   <dbl> <chr>    <dbl>  <dbl>  <dbl>
#> 1 a           1 type_1       1      5      2
#> 2 b           2 type_2       8      2      6
#> 3 c           2 type_3       3      8      2
In practice, it may be preferable to leave the data in long form by dropping the spread call.
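As a rough sketch of what staying in long form could look like, the variant below drops spread and swaps the equality test for which.min, so that a row with a tied minimum still yields a single column name. The df_ties data (row "b" has a tie) and the extra mean_val column are assumptions added for illustration and are not part of the original answer. Note also that gather is superseded by pivot_longer in current tidyr, though it still works.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical data for illustration: row "b" has a tied minimum (2 and 2).
df_ties <- tribble(
  ~name, ~type_1, ~type_2, ~type_3,
  "a",   1,       5,       2,
  "b",   2,       2,       6,
  "c",   3,       8,       2
)

long <- df_ties %>%
  gather(type, type_val, contains("type")) %>%
  group_by(name) %>%
  mutate(
    min_val  = min(type_val),
    # which.min() returns the position of the first minimum, so exactly
    # one column name comes back per group even when values tie.
    min_col  = type[which.min(type_val)],
    # Other summaries (mean, median, ...) slot in the same way.
    mean_val = mean(type_val)
  ) %>%
  ungroup()

long
```

The result keeps one row per name/type pair, with the per-name summaries repeated on each of that name's rows.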
Caveats:
- If more than one value can equal the min (or max, or median, or whatever), type_val == min_val will have more than one TRUE value, so the result has to be reduced further to a single value, e.g. the way which.min returns the first minimum.
- At scale, reshaping may be expensive, so more convoluted but optimized approaches (e.g. leveraging max.col) may be preferable.
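The max.col idea from the last caveat might look roughly like the following base-R sketch. It is one possible reading of that suggestion, not code from the original answer: the type columns are pulled into a matrix, negated so that max.col locates the minimum, and ties are resolved with ties.method = "first" to mirror which.min. The helper names type_cols, m and min_idx are made up for this example.

```r
# Same values as the answer's df, built as a plain data.frame.
df <- data.frame(
  name   = c("a", "b", "c"),
  type_1 = c(1, 8, 3),
  type_2 = c(5, 2, 8),
  type_3 = c(2, 6, 2)
)

# Select the type_* columns and work on them as a numeric matrix.
type_cols <- grep("^type_", names(df), value = TRUE)
m <- as.matrix(df[type_cols])

# max.col() gives the column index of each row's maximum, so negating
# the matrix turns it into the index of each row's minimum.
min_idx <- max.col(-m, ties.method = "first")

# Matrix indexing with (row, column) pairs extracts one value per row.
df$min_val <- m[cbind(seq_len(nrow(m)), min_idx)]
df$min_col <- type_cols[min_idx]

df
```

This avoids reshaping entirely, at the cost of being specific to min and max; summaries such as the mean or median would need something like rowMeans or apply instead.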