Does dplyr::if_else evaluate both TRUE and FALSE at the same time?


Question

Consider the following example:

library(dplyr)

# sample data 
set.seed(1)
mydf <- data.frame(value = as.logical(sample(0:1, 15, replace = TRUE)), group = rep(letters[1:3],each = 5), index = 1:5)

# finds either index of first "TRUE" value by group, or the last value. 
# works with base::ifelse
mydf %>% group_by(group) %>% mutate(max_value = ifelse(all(!value), max(index), index[min(which(value))]))
#> # A tibble: 15 x 4
#> # Groups:   group [3]
#>    value group index   max_value
#>    <lgl> <fct> <int>      <int>
#>  1 FALSE a         1          2
#>  2 TRUE  a         2          2
#>  3 FALSE a         3          2
#>  4 FALSE a         4          2
#>  5 TRUE  a         5          2
#>  6 FALSE b         1          4
#>  7 FALSE b         2          4
#>  8 FALSE b         3          4
#>  9 TRUE  b         4          4
#> 10 TRUE  b         5          4
#> 11 FALSE c         1          5
#> 12 FALSE c         2          5
#> 13 FALSE c         3          5
#> 14 FALSE c         4          5
#> 15 FALSE c         5          5

# the same gives a warning with dplyr::if_else
mydf %>% group_by(group) %>% mutate(max_value = if_else(all(!value), max(index), index[min(which(value))]))

#> Warning in min(which(value)): no non-missing arguments to min; returning Inf

#> # A tibble: 15 x 4
#> # Groups:   group [3]
#>    value group index  max_value
#>    <lgl> <fct> <int>      <int>
#>  1 FALSE a         1          2
#>  2 TRUE  a         2          2
#>  3 FALSE a         3          2
#>  4 FALSE a         4          2
#>  5 TRUE  a         5          2
#>  6 FALSE b         1          4
#>  7 FALSE b         2          4
#>  8 FALSE b         3          4
#>  9 TRUE  b         4          4
#> 10 TRUE  b         5          4
#> 11 FALSE c         1          5
#> 12 FALSE c         2          5
#> 13 FALSE c         3          5
#> 14 FALSE c         4          5
#> 15 FALSE c         5          5

As commented in the code, dplyr::if_else results in this warning:

Warning in min(which(value)): no non-missing arguments to min; returning Inf

Once group c is no longer "all FALSE" (one of its values is set to TRUE), the warning disappears:

mydf_allTRUE <- mydf
mydf_allTRUE[14, 'value'] <- TRUE

mydf_allTRUE %>% group_by(group) %>% mutate(max_value = if_else(all(!value), max(index), index[min(which(value))]))
#> # A tibble: 15 x 4
#> # Groups:   group [3]
#>    value group index max_value
#>    <lgl> <fct> <int>     <int>
#>  1 FALSE a         1         2
#>  2 TRUE  a         2         2
#>  3 FALSE a         3         2
#>  4 FALSE a         4         2
#>  5 TRUE  a         5         2
#>  6 FALSE b         1         4
#>  7 FALSE b         2         4
#>  8 FALSE b         3         4
#>  9 TRUE  b         4         4
#> 10 TRUE  b         5         4
#> 11 FALSE c         1         4
#> 12 FALSE c         2         4
#> 13 FALSE c         3         4
#> 14 TRUE  c         4         4
#> 15 FALSE c         5         4

Created on 2019-12-22 by the reprex package (v0.3.0)

What confuses me is that, as far as I can tell, I constructed the TRUE part so that the FALSE part (index[min(which(value))]) must always contain a value. Why does this still give a warning? It is a problem because my data has several thousand groups, most of them fall into the "FALSE" bit, and the warnings make the computation extremely slow.

I am happy to use base::ifelse, but I wondered how dplyr::if_else evaluates the TRUE and FALSE sides. Does it somehow evaluate both at the same time?

Recommended answer

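The warning appears because dplyr::if_else evaluates both its true and false arguments for every group, whereas base::ifelse only evaluates a branch when the test actually selects it. For a group with no TRUE values, which(value) returns an empty integer vector, and min() of an empty vector (NULL behaves the same way) returns Inf with a warning: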

min(NULL)
#[1] Inf

Warning message: In min(NULL) : no non-missing arguments to min; returning Inf
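The difference in evaluation can be checked with a small sketch that records which branch expressions actually run. The helper names yes_branch and no_branch are hypothetical and only serve this illustration; the dplyr call is guarded in case the package is not installed.

```r
# Log which branch expressions get evaluated (hypothetical helpers).
evaluated <- character(0)
yes_branch <- function() { evaluated <<- c(evaluated, "yes"); 1L }
no_branch  <- function() { evaluated <<- c(evaluated, "no");  2L }

# base::ifelse only forces `no` when some element of the test is FALSE,
# so with a single TRUE test only yes_branch() runs.
res_base <- ifelse(TRUE, yes_branch(), no_branch())
evaluated_base <- evaluated

# dplyr::if_else checks both branches before combining them,
# so both expressions are evaluated regardless of the condition.
evaluated <- character(0)
if (requireNamespace("dplyr", quietly = TRUE)) {
  res_dplyr <- dplyr::if_else(TRUE, yes_branch(), no_branch())
}
evaluated_dplyr <- evaluated
```

With a length-one test such as all(!value), this is why ifelse stays silent for the all-FALSE group while if_else still runs min(which(value)) and triggers the warning.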


One option is to index the output of which() with [1], which returns NA instead of an empty vector when there is no TRUE value:

mydf %>%
   group_by(group) %>%
   mutate(max_value = if_else(all(!value), max(index), index[which(value)[1]]))
# A tibble: 15 x 4
# Groups:   group [3]
#   value group index max_value
#   <lgl> <fct> <int>     <int>
# 1 FALSE a         1         2
# 2 TRUE  a         2         2
# 3 FALSE a         3         2
# 4 FALSE a         4         2
# 5 TRUE  a         5         2
# 6 FALSE b         1         4
# 7 FALSE b         2         4
# 8 FALSE b         3         4
# 9 TRUE  b         4         4
#10 TRUE  b         5         4
#11 FALSE c         1         5
#12 FALSE c         2         5
#13 FALSE c         3         5
#14 FALSE c         4         5
#15 FALSE c         5         5
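To see why indexing with [1] avoids the warning, the intermediate values for an all-FALSE group can be inspected on their own. This is a minimal base R sketch; the vector all_false is made up for illustration.

```r
index <- 1:5
all_false <- rep(FALSE, 5)           # stands in for a group with no TRUE value

pos_empty <- which(all_false)        # empty integer vector
first_pos <- which(all_false)[1]     # indexing past the end gives NA, no warning
picked    <- index[first_pos]        # subsetting with NA gives NA

# min() on the empty vector is what produces the warning and Inf;
# the warning is suppressed here only to keep the sketch quiet.
min_result <- suppressWarnings(min(which(all_false)))
```

The NA produced for the all-FALSE group is never used in the final result, because the condition all(!value) selects max(index) for that group.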


In this case, since we are returning a single element per group, if/else would be more appropriate:

mydf %>%
    group_by(group) %>%
    mutate(max_value = if(all(!value)) max(index) else index[which(value)[1]])
# A tibble: 15 x 4
# Groups:   group [3]
#   value group index max_value
#   <lgl> <fct> <int>     <int>
# 1 FALSE a         1         2
# 2 TRUE  a         2         2
# 3 FALSE a         3         2
# 4 FALSE a         4         2
# 5 TRUE  a         5         2
# 6 FALSE b         1         4
# 7 FALSE b         2         4
# 8 FALSE b         3         4
# 9 TRUE  b         4         4
#10 TRUE  b         5         4
#11 FALSE c         1         5
#12 FALSE c         2         5
#13 FALSE c         3         5
#14 FALSE c         4         5
#15 FALSE c         5         5
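The same per-group logic can also be sketched as a small helper in base R. The function name first_true_or_max is hypothetical, and the data frame is typed out from the printed output above rather than regenerated with set.seed, since sample() may give different values across R versions.

```r
# Data copied from the printed output above.
mydf <- data.frame(
  value = c(FALSE, TRUE, FALSE, FALSE, TRUE,
            FALSE, FALSE, FALSE, TRUE, TRUE,
            FALSE, FALSE, FALSE, FALSE, FALSE),
  group = rep(letters[1:3], each = 5),
  index = rep(1:5, times = 3)
)

# Hypothetical helper: index of the first TRUE value, or the largest index
# when the group has no TRUE value. Only one branch is ever evaluated.
first_true_or_max <- function(value, index) {
  pos <- which(value)[1]
  if (is.na(pos)) max(index) else index[pos]
}

# One result per group, then spread back over the rows of each group.
per_group <- sapply(split(mydf, mydf$group),
                    function(d) first_true_or_max(d$value, d$index))
mydf$max_value <- unname(per_group[as.character(mydf$group)])
```

Inside a dplyr pipeline the helper could be used as mutate(max_value = first_true_or_max(value, index)) after group_by(group), which keeps the single-branch evaluation of if/else.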
