How to pass dynamic column names in dplyr into a custom function?
Problem description
I have a dataset with the following structure:
    Classes ‘tbl_df’ and 'data.frame': 10 obs. of 7 variables:
     $ GdeName  : chr "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" "Aeugst am Albis" ...
     $ Partei   : chr "BDP" "CSP" "CVP" "EDU" ...
     $ Stand1971: num NA NA 4.91 NA 3.21 ...
     $ Stand1975: num NA NA 5.389 0.438 4.536 ...
     $ Stand1979: num NA NA 6.2774 0.0195 3.4355 ...
     $ Stand1983: num NA NA 4.66 1.41 3.76 ...
     $ Stand1987: num NA NA 3.48 1.65 5.75 ...
I want to write a function that computes the difference between any two of the value columns, and I would like to do this using dplyr's mutate function like so (assume the parameters from and to are passed as arguments):

    from <- "Stand1971"
    to <- "Stand1987"

    data %>%
      mutate(diff = from - to)
Of course, this doesn't work, as dplyr uses non-standard evaluation. I know there's now an elegant solution to the problem using mutate_, and I've read this vignette, but I still can't get my head around it.

What to do?
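To make the failure concrete, here is a minimal sketch of what happens in the naive call. The data frame toy and its values are made up for illustration; only the column names come from the question. Because from and to are not columns of the data, mutate() falls back to the character variables of the same name and attempts to subtract one string from another, which R rejects.

```r
library(dplyr)

from <- "Stand1971"
to <- "Stand1987"

# Made-up toy data; only the column names are taken from the question
toy <- data.frame(Stand1971 = c(5, 3), Stand1987 = c(3, 6))

# `from` and `to` are not columns of `toy`, so mutate() picks up the
# character variables defined above and evaluates "Stand1971" - "Stand1987",
# which raises an error. tryCatch() records that an error occurred.
outcome <- tryCatch(
  toy %>% mutate(diff = from - to),
  error = function(e) "error"
)
```

The column names therefore have to be turned into references to the columns themselves before the subtraction is evaluated.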
Here are the first few rows of the dataset for a reproducible example:

    structure(list(
      GdeName = c("Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
                  "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
                  "Aeugst am Albis", "Aeugst am Albis", "Aeugst am Albis",
                  "Aeugst am Albis"),
      Partei = c("BDP", "CSP", "CVP", "EDU", "EVP", "FDP", "FGA", "FPS",
                 "GLP", "GPS"),
      Stand1971 = c(NA, NA, 4.907306434, NA, 3.2109535926, 18.272143463,
                    NA, NA, NA, NA),
      Stand1975 = c(NA, NA, 5.389079711, 0.4382328556, 4.5363022622,
                    18.749259742, NA, NA, NA, NA),
      Stand1979 = c(NA, NA, 6.2773722628, 0.0194647202, 3.4355231144,
                    25.294403893, NA, NA, NA, 2.7055961071),
      Stand1983 = c(NA, NA, 4.6609804428, 1.412940467, 3.7563539244,
                    26.277246489, 0.8529335746, NA, NA, 2.601878177),
      Stand1987 = c(NA, NA, 3.4767860929, 1.6535933856, 5.7451770193,
                    22.146844746, NA, 3.7453183521, NA, 13.702211858)),
      .Names = c("GdeName", "Partei", "Stand1971", "Stand1975", "Stand1979",
                 "Stand1983", "Stand1987"),
      class = c("tbl_df", "data.frame"),
      row.names = c(NA, -10L))
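Solution

From that vignette (vignette("nse", "dplyr")), use lazyeval's interp() function:

    library(dplyr)
    library(lazyeval)

    from <- "Stand1971"
    to <- "Stand1987"

    # as.name() turns each string into a symbol, and interp() substitutes
    # those symbols into the formula before mutate_() evaluates it
    data %>%
      mutate_(diff = interp(~from - to, from = as.name(from), to = as.name(to)))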
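Since the question asks for a custom function, the following sketch wraps the idea in one. It is not the approach from the answer above: mutate_() has been deprecated since dplyr 0.7.0, so this version assumes a current dplyr release and uses the .data pronoun instead. The function name add_diff and the example values are hypothetical; only the column names come from the question.

```r
library(dplyr)

# Hypothetical helper (not part of the original answer): `from` and `to`
# are column names supplied as strings.
add_diff <- function(data, from, to) {
  data %>%
    # .data[[name]] looks up the column whose name is stored in `name`
    mutate(diff = .data[[from]] - .data[[to]])
}

# Made-up example values; only the column names are taken from the question
example <- data.frame(
  Stand1971 = c(5, 3, NA),
  Stand1987 = c(3, 6, 1)
)

result <- add_diff(example, "Stand1971", "Stand1987")
```

Subsetting with .data[[from]] keeps the column names as ordinary strings, so the function can be called with any pair of columns without building expressions by hand.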