How to pass dynamic column names in dplyr into a custom function?
Question
I have a dataset with the following structure:
Classes ‘tbl_df’ and 'data.frame': 10 obs. of 7 variables:
$ GdeName : chr "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" ...
$ Partei : chr "BDP" "CSP" "CVP" "EDU" ...
$ Stand1971: num NA NA 4.91 NA 3.21 ...
$ Stand1975: num NA NA 5.389 0.438 4.536 ...
$ Stand1979: num NA NA 6.2774 0.0195 3.4355 ...
$ Stand1983: num NA NA 4.66 1.41 3.76 ...
$ Stand1987: num NA NA 3.48 1.65 5.75 ...
I want to write a function that computes the difference between any two of these columns, and I would like to do this with dplyr's mutate function, like so (assume that from and to are passed in as arguments):
from <- "Stand1971"
to <- "Stand1987"
data %>%
  mutate(diff = from - to)
Of course, this doesn't work, because dplyr uses non-standard evaluation: from and to are not columns of the data, so they are taken as the character strings themselves rather than as the columns they name. I know there is now an elegant solution to this problem using mutate_, and I've read the vignette, but I still can't get my head around it.
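To make the failure concrete, here is a minimal base R sketch of what the attempt above boils down to. It assumes that mutate ends up evaluating from - to on the two strings; dplyr wraps the resulting error in its own message, so the exact wording inside mutate will differ.

```r
from <- "Stand1971"
to <- "Stand1987"

# Neither name is a column, so both evaluate to plain character strings
# and R attempts to subtract one string from the other.
msg <- tryCatch(from - to, error = function(e) conditionMessage(e))

# In an English locale this reports a non-numeric argument to a binary operator.
print(msg)
```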
What should I do?
Here are the first few rows of the dataset for a reproducible example:
structure(list(GdeName = c("Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
"Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis"
), Partei = c("BDP", "CSP", "CVP", "EDU", "EVP", "FDP", "FGA",
"FPS", "GLP", "GPS"), Stand1971 = c(NA, NA, 4.907306434, NA,
3.2109535926, 18.272143463, NA, NA, NA, NA), Stand1975 = c(NA,
NA, 5.389079711, 0.4382328556, 4.5363022622, 18.749259742, NA,
NA, NA, NA), Stand1979 = c(NA, NA, 6.2773722628, 0.0194647202,
3.4355231144, 25.294403893, NA, NA, NA, 2.7055961071), Stand1983 = c(NA,
NA, 4.6609804428, 1.412940467, 3.7563539244, 26.277246489, 0.8529335746,
NA, NA, 2.601878177), Stand1987 = c(NA, NA, 3.4767860929, 1.6535933856,
5.7451770193, 22.146844746, NA, 3.7453183521, NA, 13.702211858
)), .Names = c("GdeName", "Partei", "Stand1971", "Stand1975",
"Stand1979", "Stand1983", "Stand1987"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -10L))
Recommended answer
With recent versions of dplyr (>= 0.7), you can use rlang's !! (bang-bang) operator.
library(tidyverse)
from <- "Stand1971"
to <- "Stand1987"
data %>%
  mutate(diff = (!!as.name(from)) - (!!as.name(to)))
You just need to convert the strings to names with as.name and then unquote them into the expression with !!. Unfortunately, this takes a few more parentheses than I would like: R's parser gives ! lower precedence than arithmetic operators such as -, so the parentheses make sure each !! applies to a single name rather than to the whole subtraction.
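Since the question asks for a custom function, the same idea can be wrapped up as in the sketch below. The names add_diff and add_diff_pronoun and the small example data frame are hypothetical and not part of the original answer. The second variant uses the .data pronoun, which is an alternative to the !! approach rather than the method described above; it avoids the extra parentheses.

```r
library(dplyr)

# Hypothetical helper: adds a `diff` column holding the difference between
# two columns whose names are passed as strings.
add_diff <- function(data, from, to) {
  data %>%
    mutate(diff = (!!as.name(from)) - (!!as.name(to)))
}

# Alternative sketch using the .data pronoun instead of !! and as.name
add_diff_pronoun <- function(data, from, to) {
  data %>%
    mutate(diff = .data[[from]] - .data[[to]])
}

# Made-up values, reusing two column names from the question
example <- data.frame(
  Stand1971 = c(4.9, 3.2, NA),
  Stand1987 = c(3.5, 5.7, 3.7)
)

result <- add_diff(example, "Stand1971", "Stand1987")
result_pronoun <- add_diff_pronoun(example, "Stand1971", "Stand1987")
print(result)
```

Both helpers compute from minus to, matching the order used in the answer, and rows with a missing value in either column produce NA.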
Original answer, for dplyr 0.3 to < 0.7:
From that vignette (vignette("nse", "dplyr")), use lazyeval's interp() function:
library(lazyeval)
from <- "Stand1971"
to <- "Stand1987"
data %>%
  mutate_(diff = interp(~from - to, from = as.name(from), to = as.name(to)))