Tidyeval: pass list of columns as quosure to select()

Question

I want to pass a bunch of columns to pmap() inside mutate(). Later, I want to select those same columns.

At the moment, I'm passing a list of column names to pmap() as a quosure, which works fine, although I have no idea whether this is the "right" way to do it. But I can't figure out how to use the same quosure/list for select().

I've got almost no experience with tidyeval, I've only got this far by playing around. I imagine there must be a way to use the same thing both for pmap() and select(), preferably without having to put each of my column names in quotation marks, but I haven't found it yet.

library(dplyr)
library(rlang)
library(purrr)

df <- tibble(a = 1:3,
             b = 101:103) %>% 
    print
#> # A tibble: 3 x 2
#>       a     b
#>   <int> <int>
#> 1     1   101
#> 2     2   102
#> 3     3   103

cols_quo <- quo(list(a, b))

df2 <- df %>% 
    mutate(outcome = !!cols_quo %>% 
               pmap_int(function(..., word) {
                   args <- list(...)

                   # just to be clear this isn't what I actually want to do inside pmap
                   return(args[[1]] + args[[2]])
               })) %>% 
    print()
#> # A tibble: 3 x 3
#>       a     b outcome
#>   <int> <int>   <int>
#> 1     1   101     102
#> 2     2   102     104
#> 3     3   103     106

# I get why this doesn't work, but I don't know how to do something like this that does
df2 %>% 
    select(!!cols_quo)
#> Error in .f(.x[[i]], ...): object 'a' not found

Recommended answer

This is a bit tricky because of the mix of semantics involved in this problem. pmap() takes a list and passes each element as its own argument to a function (it's kind of equivalent to !!! in that sense). Your quoting function thus needs to quote its arguments and somehow pass a list of columns to pmap().
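As a minimal sketch of that behaviour (the vectors below are made-up stand-ins for the columns `a` and `b`, outside any data frame), each element of the list becomes one argument of the mapped function, row by row:

```r
library(purrr)

# pmap_int() walks the list elements in parallel: the first call receives
# (1, 101), the second (2, 102), and so on.
cols <- list(1:3, 101:103)
sums <- pmap_int(cols, function(x, y) x + y)
sums
```

So whatever the quoting function produces, `pmap_int()` must end up receiving an ordinary list of vectors like `cols`.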

Our quoting function can go one of two ways. Either quote (i.e., delay) the list creation, or create an actual list of quoted expressions right away:

quoting_fn1 <- function(...) {
  exprs <- enquos(...)

  # For illustration purposes, return the quoted inputs instead of
  # doing something with them. Normally you'd call `mutate()` here:
  exprs
}

quoting_fn2 <- function(...) {
  expr <- quo(list(!!!enquos(...)))

  expr
}

Since our first variant does nothing but return a list of quoted inputs, it's actually equivalent to quos():

quoting_fn1(a, b)
#> <list_of<quosure>>
#>
#> [[1]]
#> <quosure>
#> expr: ^a
#> env:  global
#>
#> [[2]]
#> <quosure>
#> expr: ^b
#> env:  global
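One way to check that equivalence, as a rough sketch: capture the same inputs with `quos()` and compare the expressions stored in each quosure. The names `from_fn` and `from_quos` are illustrative only.

```r
library(rlang)

quoting_fn1 <- function(...) enquos(...)

from_fn   <- quoting_fn1(a, b)
from_quos <- quos(a, b)

# Compare the captured expressions rather than the quosure objects
# themselves, since the two sets are built by separate calls.
same_exprs <- identical(
  lapply(from_fn, quo_get_expr),
  lapply(from_quos, quo_get_expr)
)
same_exprs
```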

The second version returns a quoted expression that instructs R to create a list with quoted inputs:

quoting_fn2(a, b)
#> <quosure>
#> expr: ^list(^a, ^b)
#> env:  0x7fdb69d9bd20

There is a subtle but important difference between the two. The first version creates an actual list object:

exprs <- quoting_fn1(a, b)
typeof(exprs)
#> [1] "list"

On the other hand, the second version does not return a list, it returns an expression for creating a list:

expr <- quoting_fn2(a, b)
typeof(expr)
#> [1] "language"
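To see what "an expression for creating a list" means in practice, here is a small sketch that evaluates the delayed expression by hand with `rlang::eval_tidy()`, using the data frame as a data mask. This is only an illustration of the idea; it is roughly what happens when `mutate()` later evaluates the unquoted expression.

```r
library(dplyr)
library(rlang)

df <- tibble(a = 1:3, b = 101:103)

quoting_fn2 <- function(...) quo(list(!!!enquos(...)))
expr <- quoting_fn2(a, b)

# Nothing has been looked up yet: `a` and `b` are only resolved now,
# against the columns of `df`.
cols <- eval_tidy(expr, data = df)
str(cols)
```

The result is a plain list of column vectors, which is the kind of input `pmap()` expects.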

Let's find out which version is more appropriate for interfacing with pmap(). But first we'll give a name to the pmapped function to make the code clearer and easier to experiment with:

myfunction <- function(..., word) {
  args <- list(...)
  # just to be clear this isn't what I actually want to do inside pmap
  args[[1]] + args[[2]]
}

Understanding how tidy eval works is hard in part because we usually don't get to observe the unquoting step. We'll use rlang::qq_show() to reveal the result of unquoting expr (the delayed list) and exprs (the actual list) with !!:

rlang::qq_show(
  mutate(df, outcome = pmap_int(!!expr, myfunction))
)
#> mutate(df, outcome = pmap_int(^list(^a, ^b), myfunction))

rlang::qq_show(
  mutate(df, outcome = pmap_int(!!exprs, myfunction))
)
#> mutate(df, outcome = pmap_int(<S3: quosures>, myfunction))

When we unquote the delayed list, mutate() calls pmap_int() with list(a, b), evaluated in the data frame, which is exactly what we need:

mutate(df, outcome = pmap_int(!!expr, myfunction))
#> # A tibble: 3 x 3
#>       a     b outcome
#>   <int> <int>   <int>
#> 1     1   101     102
#> 2     2   102     104
#> 3     3   103     106

On the other hand, if we unquote an actual list of quoted expressions, we get an error:

mutate(df, outcome = pmap_int(!!exprs, myfunction))
#> Error in mutate_impl(.data, dots) :
#>   Evaluation error: Element 1 is not a vector (language).

That's because the quoted expressions inside the list are not evaluated in the data frame. In fact, they are not evaluated at all. pmap() gets the quoted expressions as is, which it doesn't understand. Recall what qq_show() has shown us:

#> mutate(df, outcome = pmap_int(<S3: quosures>, myfunction))

Anything inside angle brackets is passed as is. This is a sign that we should somehow have used !!! instead, to inline each element of the list of quosures in the surrounding expression. Let's try it:

rlang::qq_show(
  mutate(df, outcome = pmap_int(!!!exprs, myfunction))
)
#> mutate(df, outcome = pmap_int(^a, ^b, myfunction))

Hmm... Doesn't look right. We're supposed to pass a list to pmap_int(), and here it gets each quoted input as a separate argument. Indeed we get a type error:

mutate(df, outcome = pmap_int(!!!exprs, myfunction))
#> Error in mutate_impl(.data, dots) :
#>   Evaluation error: `.x` is not a list (integer).

That's easy to fix, just splice into a call to list():

rlang::qq_show(
  mutate(df, outcome = pmap_int(list(!!!exprs), myfunction))
)
#> mutate(df, outcome = pmap_int(list(^a, ^b), myfunction))

Voilà!

mutate(df, outcome = pmap_int(list(!!!exprs), myfunction))
#> # A tibble: 3 x 3
#>       a     b outcome
#>   <int> <int>   <int>
#> 1     1   101     102
#> 2     2   102     104
#> 3     3   103     106
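The walkthrough above stops at pmap_int(), but the question also asked about reusing the same columns in select(). A hedged sketch, assuming the actual list of quosures (`exprs`) is kept around: since select() takes each column as its own argument, the list can be spliced in directly with !!!, with no list() wrapper. Here `df2` is rebuilt by hand rather than through pmap_int(), purely to keep the example short.

```r
library(dplyr)
library(rlang)

# Stand-in for the data frame produced by the mutate() step.
df2 <- tibble(a = 1:3, b = 101:103, outcome = a + b)

exprs <- quos(a, b)

# Splicing turns this into select(df2, a, b).
selected <- select(df2, !!!exprs)
names(selected)
```

With this approach the same `exprs` object serves both calls: `list(!!!exprs)` for pmap_int() and `!!!exprs` for select().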
