Programming with dplyr and lazyeval

Views: 169 · Published: 2017/7/13 21:45:01 · Tags: r, dplyr, lazy-evaluation


Question

I am having trouble refactoring dplyr code in a way that preserves non-standard evaluation. Let's say I want to create a function that always selects a column and renames it.

library(lazyeval)
library(dplyr)

df <- data.frame(a = c(1,2,3), f = c(4,5,6), lm = c(7, 8 , 9))

select_happy<- function(df, col){
    col <- lazy(col)
    fo <- interp(~x, x=col)
    select_(df, happy=fo)
}

f <- function(){
    print('foo')
}

select_happy() is written according to the answer to the post "Refactor R code when library functions use non-standard evaluation". It works on column names that are either undefined or defined in the global environment. However, it runs into trouble when a column name is also the name of a function in another namespace.

select_happy(df, a)
#   happy
# 1     1
# 2     2
# 3     3

select_happy(df, f)
#   happy
# 1     4
# 2     5
# 3     6

select_happy(df, lm)
# Error in eval(expr, envir, enclos) (from #4) : object 'datafile' not found

environment(f)
# <environment: R_GlobalEnv>

environment(lm)
# <environment: namespace:stats>

Calling lazy() on f and lm shows a difference in the resulting lazy objects: for lm the function definition appears in the lazy object, whereas for f it is just the name of the function.

lazy(f)
# <lazy>
#   expr: f
#   env:  <environment: R_GlobalEnv>

lazy(lm)
# <lazy>
#   expr: function (formula, data, subset, weights, na.action, method = "qr",  ...
#   env:  <environment: R_GlobalEnv>

substitute() appears to work with lm.

select_happy <- function(df, col){
    col <- substitute(col) # <- substitute() instead of lazy()
    fo <- interp(~x, x=col)
    select_(df, happy=fo)
}

select_happy(df, lm)
#   happy
# 1     7 
# 2     8
# 3     9
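To see why substitute() behaves this way, here is a minimal base-R sketch (no dplyr or lazyeval involved). capture_arg() is a hypothetical helper introduced only for illustration. It shows that substitute() returns the symbol that was typed, without looking up what lm is bound to, and that the symbol can later be evaluated against the data frame's columns.

```r
# Hypothetical helper for illustration: returns the unevaluated argument.
capture_arg <- function(x) substitute(x)

# The captured expression is the bare symbol `lm`, not the stats::lm function.
expr <- capture_arg(lm)
is_symbol <- is.name(expr)

# Evaluating the symbol with the data frame as the environment finds the column first.
df <- data.frame(a = c(1, 2, 3), f = c(4, 5, 6), lm = c(7, 8, 9))
col_values <- eval(expr, df)
print(col_values)
```

Because the lookup only happens at evaluation time and the data frame is searched first, the column named lm wins over the function of the same name.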

However, after reading the lazyeval vignette, it seems that lazy() is meant to be a superior replacement for substitute(). Additionally, the regular select() function works just fine.

select(df, happy=lm)
#   happy
# 1     7
# 2     8
# 3     9

My question is: how can I write select_happy() so that it works in all the ways that select() does? I'm having a hard time wrapping my head around scoping and non-standard evaluation. More generally, what would be a solid strategy for programming with dplyr that avoids these and similar issues?

Edit

I tested docendo discimus's solution and it worked great, but I would like to know whether there is a way to use named arguments, rather than dots, for the function. I think it is also important to be able to use interp(), because you might want to feed the input into a more complicated formula, as in the post I linked to earlier. I think the core of the issue comes down to the fact that lazy_dots() captures the expression differently from lazy(). I would like to understand why they behave differently, and how to use lazy() to get the same functionality as lazy_dots().

g <- function(...){
    lazy_dots(...)
}

h <-  function(x){
    lazy(x)
}

g(lm)[[1]]
# <lazy>
#   expr: lm
#   env:  <environment: R_GlobalEnv>
h(lm)
# <lazy>
#   expr: function (formula, data, subset, weights, na.action, method = "qr",  ...
#   env:  <environment: R_GlobalEnv> 

Even changing .follow_symbols to FALSE for lazy(), so that it matches the default of lazy_dots(), does not work.

lazy
# function (expr, env = parent.frame(), .follow_symbols = TRUE) 
# {
#     .Call(make_lazy, quote(expr), environment(), .follow_symbols)
# }
# <environment: namespace:lazyeval>

lazy_dots
# function (..., .follow_symbols = FALSE) 
# {
#     if (nargs() == 0) 
#         return(structure(list(), class = "lazy_dots"))
#     .Call(make_lazy_dots, environment(), .follow_symbols)
# }
# <environment: namespace:lazyeval>


h2 <-  function(x){
    lazy(x, .follow_symbols=FALSE)
}

h2(lm)
# <lazy>
#  expr: x
#  env:  <environment: 0xe4a42a8>

I just feel really stuck as to what to do.

Recommended answer

One option may be to write select_happy() almost the same way as the standard select() function:

select_happy<- function(df, ...){
  select_(df, .dots = setNames(lazy_dots(...), "happy"))
}

f <- function(){
  print('foo')
}

> select_happy(df, a)
  happy
1     1
2     2
3     3
> 
> select_happy(df, f)
  happy
1     4
2     5
3     6
> 
> select_happy(df, lm)
  happy
1     7
2     8
3     9

Note that the function definition of the standard select() function is:

> select
function (.data, ...) 
{
    select_(.data, .dots = lazyeval::lazy_dots(...))
}
<environment: namespace:dplyr>

Also note that with this definition, select_happy() accepts multiple columns to select, but will name any additional columns "NA":

> select_happy(df, lm, a)
  happy NA
1     7  1
2     8  2
3     9  3

Of course, you could make some modifications to handle such cases, for example:

select_happy<- function(df, ...){
  dots <- lazy_dots(...)
  n <- length(dots)
  if(n == 1) newnames <- "happy" else newnames <- paste0("happy", seq_len(n))
  select_(df, .dots = setNames(dots, newnames))
}

> select_happy(df, f)
  happy
1     4
2     5
3     6

> select_happy(df, lm, a)
  happy1 happy2
1      7      1
2      8      2
3      9      3
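For readers who want to experiment with the capture-and-rename logic without dplyr or lazyeval installed, the same idea can be sketched in base R. This is only an illustrative analogue under stated assumptions, not how select() or lazy_dots() are implemented: select_happy_base() is a hypothetical name, and it assumes every argument passed through the dots is a bare column name.

```r
# Hypothetical base-R analogue of select_happy(); not the dplyr/lazyeval implementation.
select_happy_base <- function(df, ...) {
  # Capture the unevaluated arguments passed through `...` as a list of symbols.
  dots <- eval(substitute(alist(...)))
  # Turn each captured symbol into a column name, e.g. lm -> "lm".
  cols <- vapply(dots, deparse, character(1))
  n <- length(cols)
  # Same naming rule as above: "happy" for one column, "happy1", "happy2", ... otherwise.
  newnames <- if (n == 1) "happy" else paste0("happy", seq_len(n))
  stats::setNames(df[cols], newnames)
}

df <- data.frame(a = c(1, 2, 3), f = c(4, 5, 6), lm = c(7, 8, 9))
res_one <- select_happy_base(df, lm)
res_two <- select_happy_base(df, lm, a)
print(res_two)
```

Since the arguments are never evaluated, a column that shares its name with a function (such as lm) is treated like any other column.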

