How to pass dynamic column names in dplyr into a custom function?
Problem description
I have a dataset with the following structure:
Classes ‘tbl_df’ and 'data.frame': 10 obs. of 7 variables:
$ GdeName : chr "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" ...
$ Partei : chr "BDP" "CSP" "CVP" "EDU" ...
$ Stand1971: num NA NA 4.91 NA 3.21 ...
$ Stand1975: num NA NA 5.389 0.438 4.536 ...
$ Stand1979: num NA NA 6.2774 0.0195 3.4355 ...
$ Stand1983: num NA NA 4.66 1.41 3.76 ...
$ Stand1987: num NA NA 3.48 1.65 5.75 ...
I want to provide a function that computes the difference between any two of these columns, and I would like to do this using dplyr's mutate function like so (assume from and to are passed as arguments):
from <- "Stand1971"
to <- "Stand1987"
data %>%
mutate(diff = from - to)
Of course, this doesn't work, because dplyr uses non-standard evaluation. I know there's now an elegant solution to the problem using mutate_, and I've read this vignette, but I still can't get my head around it.
How can I do this?
Here are the first few rows of the dataset for a reproducible example:
structure(list(GdeName = c("Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis"
), Partei = c("BDP", "CSP", "CVP", "EDU", "EVP", "FDP", "FGA",
"FPS", "GLP", "GPS"), Stand1971 = c(NA, NA, 4.907306434, NA,
3.2109535926, 18.272143463, NA, NA, NA, NA), Stand1975 = c(NA,
NA, 5.389079711, 0.4382328556, 4.5363022622, 18.749259742, NA,
NA, NA, NA), Stand1979 = c(NA, NA, 6.2773722628, 0.0194647202,
3.4355231144, 25.294403893, NA, NA, NA, 2.7055961071), Stand1983 = c(NA,
NA, 4.6609804428, 1.412940467, 3.7563539244, 26.277246489, 0.8529335746,
NA, NA, 2.601878177), Stand1987 = c(NA, NA, 3.4767860929, 1.6535933856,
5.7451770193, 22.146844746, NA, 3.7453183521, NA, 13.702211858
)), .Names = c("GdeName", "Partei", "Stand1971", "Stand1975",
"Stand1979", "Stand1983", "Stand1987"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -10L))
Recommended answer
With dplyr 0.7 or later, you can use rlang's !! (bang-bang) operator.
library(tidyverse)
from <- "Stand1971"
to <- "Stand1987"
# as.name() turns each string into a symbol; !! unquotes it inside mutate()
data %>%
  mutate(diff = (!!as.name(from)) - (!!as.name(to)))
You just need to convert the strings to names with as.name and then insert them into the expression. Unfortunately, I seem to need a few more parentheses than I would like, because the !! operator seems to have unusual precedence (R parses it as two applications of the logical-not operator !, which binds more loosely than the arithmetic operators).
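Since the question asks for a custom function, here is a minimal sketch of how the same approach might be wrapped in one. The helper name add_diff and the small example tibble (with made-up values) are assumptions for illustration only; they are not part of the original answer or dataset.

```r
library(dplyr)

# Hypothetical helper (not from the original answer): adds a column `diff`
# holding data[[from]] - data[[to]], with both column names given as strings.
add_diff <- function(data, from, to) {
  data %>%
    mutate(diff = (!!as.name(from)) - (!!as.name(to)))
}

# Made-up example data that reuses the question's column names
example <- tibble(
  Partei    = c("CVP", "EVP", "FDP"),
  Stand1971 = c(4.9, 3.2, 18.3),
  Stand1987 = c(3.5, 5.7, 22.1)
)

result <- add_diff(example, "Stand1971", "Stand1987")
print(result)
```

In newer dplyr versions, writing .data[[from]] - .data[[to]] inside mutate is another way to refer to columns by string, and it avoids the extra parentheses.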
Original answer, dplyr (0.3 to <0.7):
From that vignette (vignette("nse","dplyr")), use lazyeval's interp() function:
library(lazyeval)
from <- "Stand1971"
to <- "Stand1987"
data %>%
mutate_(diff=interp(~from - to, from=as.name(from), to=as.name(to)))