Obsolete data mask. Too late to resolve `xxxxxx` after the end of `dplyr::mutate()`


Problem description

As part of my answer to this post, I suggested a completely generic mechanism by which one data frame could be filtered by conditions stored in another. The OP has called me out (damn!) and asked me for an implementation.

My solution requires me to store functions in the filter data frame. This is possible: this post shows how.

As a basic example, consider

library(tidyverse)

longFilterTable <- tribble(
  ~var,   ~value,
  "gear", list(3),
) %>% 
  mutate(
    func=pmap(
      list(value),
      ~function(x) x == ..1[[1]]
    )
  )

longFilterTable
# A tibble: 1 x 3
  var   value      func  
  <chr> <list>     <list>
1 gear  <list [1]> <fn>  

This is a very convoluted way of saying "select only those rows (of mtcars) for which gear is 3". This works:

mtcars %>% filter(longFilterTable$func[[1]](gear))
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
<11 rows deleted for brevity>

Now suppose I want more flexibility in the criterion. I might, for example, want to select either a range of values or a fixed value. This seems to be a reasonable extension of the filter dataset above:

longFilterTable <- tribble(
  ~var,   ~value,         ~condition,
  "gear", list(3),        "equal",
  "wt",   list(3.4, 3.9), "range",
) %>% 
  mutate(
    func=pmap(
      list(value, condition),
      ~function(x) {
        case_when(
          condition == "equal" ~ x == ..1[[1]],
          condition == "range" ~ x >= ..1[[1]][1] & x <= ..1[[1]][2],
          TRUE ~ x
        )
      }
    )
  )

longFilterTable
# A tibble: 2 x 4
  var   value      condition func  
  <chr> <list>     <chr>     <list>
1 gear  <list [1]> equal     <fn>  
2 wt    <list [2]> range     <fn>  

But now when I try to apply the filter, I get:

mtcars %>% filter(longFilterTable$func[[1]](gear))
 Error: Problem with `filter()` input `..1`.
x Obsolete data mask.
x Too late to resolve `condition` after the end of `dplyr::mutate()`.
ℹ Did you save an object that uses `condition` lazily in a column in the `dplyr::mutate()` expression ?
ℹ Input `..1` is `longFilterTable$func[[1]](gear)`.

I've played around with various combinations of deparse(), substitute(), expression(), force() and eval(), but to no avail. Can anyone find a solution?

Recommended answer

Your problem is that case_when always evaluates every right-hand side and checks that the results have a compatible output format, even for branches whose condition is FALSE:

x <- 1

dplyr::case_when(x < 2 ~ TRUE,
                 x < 0 ~ FALSE)
#> [1] TRUE

dplyr::case_when(x < 2 ~ TRUE,
                 x < 0 ~ stop())
#> Error in eval_tidy(pair$rhs, env = default_env):
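
For contrast, a plain if/else chain evaluates only the branch it takes. The following is a minimal sketch, reusing the same x as the example above; the variable name result is only for illustration:

```r
x <- 1

# Only the first branch runs, so stop() is never called.
result <- if (x < 2) TRUE else if (x < 0) stop("not reached")
result
#> [1] TRUE
```

Unlike case_when, if expects a single TRUE or FALSE, so it suits a scalar test such as a condition label rather than a per-row comparison.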

In your case, you want to use the first option, checking for equality. However, the range condition is also evaluated, and because no second value is stored in the value list, its outcome is a vector of NAs only, hence the error. Switching from case_when to a regular if/else clause solves this issue. Note that the version below also refers to the condition as ..2, the value that pmap passes in, rather than by the column name condition, which can only be resolved while mutate() is still running.

library(purrr)
library(dplyr)
longFilterTable <- tribble(
  ~var,   ~value,         ~condition,
  "gear", list(3),        "equal",
  "wt",   list(3.4, 3.9), "range",
) %>% 
  mutate(
    func=pmap(
      list(value, condition),
      ~function(x) {
        if(..2 == "equal") x == ..1[[1]]
        else if (..2 == "range") x >= ..1[[1]] & x <= ..1[[2]]
        else TRUE
      }
    )
  )


mtcars %>% filter(longFilterTable$func[[2]](drat))
#>                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4     21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag 21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710    22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
#> Merc 240D     24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
#> Toyota Corona 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
