Non-standard evaluation of subset argument with mapply in R

Problem description

I cannot use the subset argument of xtabs or aggregate (or any other function I tested, including ftable and lm) with mapply. The following calls fail with the subset argument, but they work without it:

mapply(FUN = xtabs,
       formula = list(~ wool,
                      ~ wool + tension),
       subset = list(breaks < 15,
                     breaks < 20),
       MoreArgs = list(data = warpbreaks))

# Error in mapply(FUN = xtabs, formula = list(~wool, ~wool + tension), subset = list(breaks <  : 
#   object 'breaks' not found
# 
# expected result 1/2:
# wool
# A B 
# 2 2
# 
# expected result 2/2:
#     tension
# wool L M H
#    A 0 4 3
#    B 2 2 5

mapply(FUN = aggregate,
       formula = list(breaks ~ wool,
                      breaks ~ wool + tension),
       subset = list(breaks < 15,
                     breaks < 20),
       MoreArgs = list(data = warpbreaks,
                       FUN = length))

# Error in mapply(FUN = aggregate, formula = list(breaks ~ wool, breaks ~  : 
#   object 'breaks' not found
# 
# expected result 1/2:
#   wool breaks
# 1    A      2
# 2    B      2
# 
# expected result 2/2:
#   wool tension breaks
# 1    B       L      2
# 2    A       M      4
# 3    B       M      2
# 4    A       H      3
# 5    B       H      5

The errors seem to be due to subset arguments not being evaluated in the right environment. I know I can subset in the data argument with data = warpbreaks[warpbreaks$breaks < 20, ] as a workaround, but I am looking to improve my knowledge of R.

My questions are:

  • How can I use subset arguments with mapply? I tried with match.call and eval.parent, but without success so far (more details in my previous questions).
  • Why is the formula argument evaluated in data = warpbreaks, but the subset argument is not?

Recommended answer

The short answer is that when you create a list to pass as an argument to a function, it is evaluated at the point of creation. The error you are getting is because R tries to create the list you want to pass in the calling environment.

To see this more clearly, suppose you try creating the arguments you want to pass ahead of calling mapply:

f_list <- list(~ wool, ~ wool + tension)
d_list <- list(data = warpbreaks)
mapply(FUN = xtabs, formula = f_list, MoreArgs = d_list)
#> [[1]]
#> wool
#>  A  B 
#> 27 27 
#> 
#> [[2]]
#>     tension
#> wool L M H
#>    A 9 9 9
#>    B 9 9 9

There is no problem with creating a list of formulas, because these are not evaluated until needed, and of course warpbreaks is accessible from the global environment, hence this call to mapply works.

Of course, if you try to create the following list ahead of the mapply call:

subset_list <- list(breaks < 15, breaks < 20)

Then R will tell you that breaks isn't found.
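
As a minimal sketch of the difference between eager and deferred evaluation, the conditions can be stored unevaluated with alist() and evaluated later with the data frame as the lookup scope. The names subset_exprs and subset_masks are hypothetical and used only for illustration:

```r
# alist() keeps each argument as an unevaluated expression, so R does not
# try to look up `breaks` when the list is created.
subset_exprs <- alist(breaks < 15, breaks < 20)

# Evaluate each stored expression later, with the columns of warpbreaks
# visible as variables.
subset_masks <- lapply(subset_exprs, eval, envir = warpbreaks)

# Number of rows selected by each condition.
sapply(subset_masks, sum)
```

Evaluating the expression inside the data frame in this way is roughly what a subset argument relies on, which is why the expression must reach the function unevaluated.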

However, if you create the list with warpbreaks in the search path, then you won't have a problem:

subset_list <- with(warpbreaks, list(breaks < 15, breaks < 20))
subset_list
#> [[1]]
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [14]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
#> [27] FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [40] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
#> [53] FALSE FALSE
#> 
#> [[2]]
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE
#> [14]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE
#> [27] FALSE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
#> [40]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE
#> [53]  TRUE FALSE

You might expect that passing this list to mapply would now work, but instead we get a new error:

mapply(FUN = xtabs, formula = f_list, subset = subset_list, MoreArgs = d_list)
#> Error in eval(substitute(subset), data, env) : object 'dots' not found

So why are we getting this?

The problem lies in any functions passed to mapply that call eval, or that themselves call a function that uses eval.

If you look at the source code for mapply you will see that it takes the extra arguments you have passed and puts them in a list called dots, which it will then pass to an internal mapply call:

mapply
#> function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE) 
#> {
#>     FUN <- match.fun(FUN)
#>     dots <- list(...)
#>     answer <- .Internal(mapply(FUN, dots, MoreArgs))
#> ...

If your FUN, or a function it calls, captures the unevaluated expression of one of its arguments (for example with substitute or match.call) and then calls eval on it, the captured expression refers to the object dots. That object exists only inside mapply, not in the environment in which the eval is called, so the lookup fails. This is easy to see by doing an mapply on a match.call wrapper:

mapply(function(x) match.call(), x = list(1))
#> [[1]]
#> (function(x) match.call())(x = dots[[1L]][[1L]])

So a minimal reproducible example of our error is

mapply(function(x) eval(substitute(x)), x = list(1))
#> Error in eval(substitute(x)) : object 'dots' not found
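
For contrast, here is a small sketch of three variants of the same call. The variable names eager, captured and failed are hypothetical. A function that simply uses its argument works, because the argument is evaluated where dots exists. A function that captures the expression sees a reference to dots, and evaluating that reference in its own frame fails:

```r
# Ordinary evaluation: the argument is forced in a context where `dots`
# exists, so this works.
eager <- mapply(function(x) x, x = list(1))

# Capturing the expression instead shows what mapply actually passed.
# SIMPLIFY = FALSE keeps the captured call intact in a list.
captured <- mapply(function(x) substitute(x), x = list(1), SIMPLIFY = FALSE)
deparse(captured[[1]])

# Evaluating the captured expression inside the function's own frame fails,
# because `dots` is not visible from there.
failed <- tryCatch(
  mapply(function(x) eval(substitute(x)), x = list(1)),
  error = function(e) e
)
conditionMessage(failed)
```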


So what's the solution? It seems like you have already hit on a perfectly good one, that is, manually subsetting the data frame you wish to pass. Others may suggest that you explore purrr::map to get a more elegant solution.

However, it is possible to get mapply to do what you want, and the secret is just to modify FUN to turn it into an anonymous wrapper of xtabs that subsets on the fly:

mapply(FUN = function(formula, subset, data) xtabs(formula, data[subset,]), 
       formula = list(~ wool, ~ wool + tension),
       subset = with(warpbreaks, list(breaks < 15, breaks < 20)),
       MoreArgs = list(data = warpbreaks))
#> [[1]]
#> wool
#> A B 
#> 2 2 
#> 
#> [[2]]
#>     tension
#> wool L M H
#>    A 0 4 3
#>    B 2 2 5
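
The same wrapper pattern should carry over to the aggregate call from the question. The following is a sketch rather than a definitive implementation: aggregate_subset and counts are hypothetical names, and SIMPLIFY = FALSE is an added assumption so that each data frame is returned as its own list element:

```r
# Hypothetical wrapper: subset the rows first, then call aggregate without
# its subset argument, so no unevaluated expression is passed along.
aggregate_subset <- function(formula, subset, data) {
  aggregate(formula, data = data[subset, ], FUN = length)
}

counts <- mapply(FUN = aggregate_subset,
                 formula = list(breaks ~ wool, breaks ~ wool + tension),
                 subset = with(warpbreaks, list(breaks < 15, breaks < 20)),
                 MoreArgs = list(data = warpbreaks),
                 SIMPLIFY = FALSE)  # keep each result as a data frame

counts
```

Because the logical vectors are computed up front with with(), the wrapper only performs ordinary row indexing and never needs to evaluate an expression that refers to dots.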
