Pass arguments to dplyr functions

Question

I want to parameterise the following computation using dplyr that finds which values of Sepal.Length are associated with more than one value of Sepal.Width:

library(dplyr)

iris %>%
    group_by(Sepal.Length) %>%
    summarise(n.uniq=n_distinct(Sepal.Width)) %>%
    filter(n.uniq > 1)

Normally I would write something like this:

not.uniq.per.group <- function(data, group.var, uniq.var) {
    data %>%
        group_by(group.var) %>%
        summarise(n.uniq=n_distinct(uniq.var)) %>%
        filter(n.uniq > 1)
}

However, this approach throws errors because dplyr uses non-standard evaluation. How should this function be written?

Recommended answer

You need to use the standard-evaluation versions of the dplyr functions (append '_' to the function names, i.e. group_by_ and summarise_) and pass the column names to your function as strings, which you then need to turn into symbols. To parameterise the argument of summarise_, use interp(), which is defined in the lazyeval package. Concretely:

library(dplyr)
library(lazyeval)

not.uniq.per.group <- function(df, grp.var, uniq.var) {
    df %>%
        group_by_(grp.var) %>%
        summarise_( n_uniq=interp(~n_distinct(v), v=as.name(uniq.var)) ) %>%
        filter(n_uniq > 1)
}

not.uniq.per.group(iris, "Sepal.Length", "Sepal.Width")

See the dplyr vignette on non-standard evaluation for more details.
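Note that the underscored verbs (group_by_, summarise_) have been deprecated since dplyr 0.7.0, so on a recent installation the code above is likely to emit deprecation warnings. Below is a minimal sketch of the same function written with tidy evaluation instead, assuming dplyr 1.0 or later. The function names not_uniq_per_group and not_uniq_per_group_chr are hypothetical, chosen only for this illustration. The first variant takes bare column names and forwards them with the embrace operator; the second takes strings and looks the columns up through the .data pronoun.

```r
library(dplyr)

# Variant 1 (assumed name): callers pass bare column names.
# {{ }} forwards the caller's column expression into the dplyr verb.
not_uniq_per_group <- function(df, grp_var, uniq_var) {
    df %>%
        group_by({{ grp_var }}) %>%
        summarise(n_uniq = n_distinct({{ uniq_var }})) %>%
        filter(n_uniq > 1)
}

# Variant 2 (assumed name): callers pass column names as strings.
# .data[[name]] looks the column up by its string name.
not_uniq_per_group_chr <- function(df, grp_var, uniq_var) {
    df %>%
        group_by(.data[[grp_var]]) %>%
        summarise(n_uniq = n_distinct(.data[[uniq_var]])) %>%
        filter(n_uniq > 1)
}

res_bare <- not_uniq_per_group(iris, Sepal.Length, Sepal.Width)
res_chr  <- not_uniq_per_group_chr(iris, "Sepal.Length", "Sepal.Width")
```

The string variant is the closer match to the call in the answer above, since it keeps the "Sepal.Length", "Sepal.Width" calling convention; the bare-name variant reads more like ordinary dplyr code.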
