Creating a function with an argument passed to dplyr::filter: what is the best way to work around NSE?


Question

Non-standard evaluation (NSE) is really handy when using dplyr's verbs, but it can be problematic when those verbs are used with function arguments. For example, say I want to create a function that gives me the number of rows for a given species.

# Load packages and prepare data
library(dplyr)
library(lazyeval)
# I prefer lowercase column names
names(iris) <- tolower(names(iris))
# Number of rows for all species
nrow(iris)
# [1] 150

Example not working

This function doesn't work as expected because, inside filter(), species is interpreted in the context of the iris data frame instead of in the context of the function argument. Both sides of the comparison refer to the species column, so the condition is always TRUE:

nrowspecies0 <- function(dtf, species){
    dtf %>%
        filter(species == species) %>%
        nrow()
}
nrowspecies0(iris, species = "versicolor")
# [1] 150
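
The lookup order behind this result can be imitated in base R. The following is only a rough analogy, not dplyr's actual implementation: eval() is given the data frame as its first lookup scope and the function environment as the fallback, so both species symbols resolve to the column. The names masked_compare and df are made up for this illustration.

```r
# Hypothetical helper (not part of dplyr): evaluate `species == species`
# with the data frame searched first and the function environment second.
masked_compare <- function(dtf, species) {
  # Both `species` symbols are found in dtf, so the column is compared
  # with itself; the function argument is never consulted.
  eval(quote(species == species), dtf, environment())
}

df <- data.frame(species = c("setosa", "versicolor", "virginica"))
masked_compare(df, species = "versicolor")
# [1] TRUE TRUE TRUE
```

Every element is TRUE, which is why the filter above keeps all 150 rows.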

Three example implementations

To work around non-standard evaluation, I usually append an underscore to the argument name:

nrowspecies1 <- function(dtf, species_){
    dtf %>%
        filter(species == species_) %>%
        nrow()
}

nrowspecies1(iris, species_ = "versicolor")
# [1] 50
# Because of partial matching of argument names, the argument
# species works too
nrowspecies1(iris, species = "versicolor")
# [1] 50

This is not completely satisfactory, since it changes the name of the function argument to something less user-friendly. Otherwise it relies on partial matching of argument names, which I'm afraid is not good programming practice. To keep a nice argument name, I could do:

nrowspecies2 <- function(dtf, species){
    species_ <- species
    dtf %>%
        filter(species == species_) %>%
        nrow()
}
nrowspecies2(iris, species = "versicolor")
# [1] 50

Another way to work around non-standard evaluation is based on this answer. interp() interprets species in the context of the function environment:

nrowspecies3 <- function(dtf, species){
    dtf %>%
        filter_(interp(~species == with_species, 
                       with_species = species)) %>%
        nrow()
}
nrowspecies3(iris, species = "versicolor")
# [1] 50

Considering the three functions above, what is the preferred (most robust) way to implement this filter function? Are there any other ways?

Recommended answer

The answer from @eddi is correct about what's going on here. I'm writing another answer that addresses the larger question of how to write functions using dplyr verbs. You'll note that, ultimately, it uses something like nrowspecies2 to avoid the species == species tautology.

To write a function that wraps a dplyr verb and works with NSE, write two functions:

First, write a version that requires quoted inputs, using lazyeval and an SE version of the dplyr verb. So in this case, filter_.

nrowspecies_robust_ <- function(data, species){ 
  species_ <- lazyeval::as.lazy(species) 
  condition <- ~ species == species_ # *
  tmp <- dplyr::filter_(data, condition) # **
  nrow(tmp)
} 
nrowspecies_robust_(iris, ~versicolor) 

Second, make a version that uses NSE:

nrowspecies_robust <- function(data, species) { 
  species <- lazyeval::lazy(species) 
  nrowspecies_robust_(data, species) 
} 
nrowspecies_robust(iris, versicolor) 

* = if you want to do something more complex, you may need to use lazyeval::interp here, as in the tips linked below

** = also, if you need to change output names, see the .dots argument
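
The code above depends on lazyeval and filter_(), which were deprecated in later dplyr releases. As a hedged sketch of the same idea on a newer version (assuming dplyr 1.0 or later), the .data and .env pronouns can state explicitly which species is the column and which is the function argument. The names nrowspecies_pronoun and iris_lc are made up for this illustration and do not come from the answer.

```r
library(dplyr)

# Hypothetical helper: .data$species is the column of the data frame,
# .env$species is the function argument, so the two cannot be confused.
nrowspecies_pronoun <- function(dtf, species) {
  dtf %>%
    filter(.data$species == .env$species) %>%
    nrow()
}

# Work on a lowercase-named copy so the built-in iris stays untouched
iris_lc <- iris
names(iris_lc) <- tolower(names(iris_lc))

nrowspecies_pronoun(iris_lc, species = "versicolor")
# [1] 50
```

This keeps the user-friendly argument name without an intermediate species_ variable.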

Another good resource is the dplyr vignette on NSE, which illustrates .dots, interp, and other functions from the lazyeval package.

For even more details on lazyeval, see its vignette.

For a thorough discussion of the base R tools for working with NSE (many of which lazyeval helps you avoid), see the chapter on NSE in Advanced R.
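
To give a flavour of those base R tools, here is a minimal sketch that accepts a bare species name without dplyr or lazyeval. It assumes the argument is always a single bare name. The names nrowspecies_base and iris_lc are made up for this illustration.

```r
# Hypothetical helper: capture a bare name with base R, then filter
# with ordinary (standard) evaluation.
nrowspecies_base <- function(data, species) {
  # substitute() returns the unevaluated argument; deparse() makes it a string
  species_chr <- deparse(substitute(species))
  # data$species is unambiguously the column, species_chr the captured name
  nrow(data[data$species == species_chr, , drop = FALSE])
}

# Work on a lowercase-named copy so the built-in iris stays untouched
iris_lc <- iris
names(iris_lc) <- tolower(names(iris_lc))

nrowspecies_base(iris_lc, versicolor)
# [1] 50
```

Capturing the argument this way is convenient at the console but harder to program with, which is one reason for also providing a version that takes quoted input.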
