dplyr and Non-standard evaluation (NSE)


Question

I'm trying to write a function that takes in the name of a data frame and a column to summarize by using dplyr, then returns the summarized data frame. I've tried a bunch of permutations of interp() from the lazyeval package, but I've spent way too much time trying to get it to work. So, I wrote a "static" version of the function I want here:

library(dplyr)

summarize.df.static <- function() {
  temp_df <- mtcars %>%
    group_by(cyl) %>%
    summarize(qsec = mean(qsec),
              mpg = mean(mpg))
  return(temp_df)
}

new_df <- summarize.df.static()
head(new_df)

Here is the start of the dynamic version I'm stuck on:

summarize.df.dynamic <- function(df_in,sum_metric_in){
  temp_df <- df_in %>%
    group_by(cyl) %>%
    summarize_(qsec = mean(qsec),
              sum_metric_in=mean(sum_metric_in)) # some mix of interp()
  return(temp_df)
}

new_df <- summarize.df.dynamic(mtcars,"mpg")
head(new_df)

Note that I want the name of the output column to come from the passed-in parameter as well ("mpg" in this case). Also note that the qsec column is static, i.e. it is not passed in.

Below is the correct answer posted by "docendo discimus":

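library(dplyr)
library(lazyeval)

summarize.df.dynamic <- function(df_in, sum_metric_in) {
  temp_df <- df_in %>%
    group_by(cyl) %>%
    summarize_(qsec = ~mean(qsec),
               # build ~mean(<column>) from the column name given as a string
               xyz = interp(~mean(var), var = as.name(sum_metric_in)))

  # replace the placeholder name "xyz" with the requested column name
  names(temp_df)[names(temp_df) == "xyz"] <- sum_metric_in
  return(temp_df)
}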

new_df <- summarize.df.dynamic(mtcars,"mpg")
head(new_df)

#  cyl     qsec      mpg
#1   4 19.13727 26.66364
#2   6 17.97714 19.74286
#3   8 16.77214 15.10000

new_df <- summarize.df.dynamic(mtcars,"disp")
head(new_df)

#  cyl     qsec     disp
#1   4 19.13727 105.1364
#2   6 17.97714 183.3143
#3   8 16.77214 353.1000

Solution

For the specific example (with static "qsec" etc) you could do:

library(dplyr)
library(lazyeval)
summarize.df <- function(data, sum_metric_in) {
  data <- data %>%
    group_by(cyl) %>%
    summarize_(qsec = ~mean(qsec),
               xyz = interp(~mean(var), var = as.name(sum_metric_in)))

  names(data)[names(data) == "xyz"] <- sum_metric_in
  data
}

summarize.df(mtcars, "mpg")
#Source: local data frame [3 x 3]
#
#  cyl     qsec      mpg
#1   4 19.13727 26.66364
#2   6 17.97714 19.74286
#3   8 16.77214 15.10000
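
To see what interp() contributes here, it can help to run it on its own. The snippet below is a minimal sketch, assuming the lazyeval package is installed; the names col, f, rhs and overall_mean are illustrative and do not appear in the original answer.

```r
library(lazyeval)

# The column name arrives as a string, as in summarize.df(mtcars, "mpg")
col <- "mpg"

# interp() substitutes the placeholder `var` in the formula with the
# symbol built by as.name(), so the formula now refers to the mpg column
f <- interp(~mean(var), var = as.name(col))

# For a one-sided formula, the second element is the right-hand side,
# which here is the call mean(mpg)
rhs <- f[[2]]
print(rhs)

# Evaluating that call with mtcars as the data gives the overall mean of mpg
overall_mean <- eval(rhs, mtcars)
```

This is the same formula that summarize_() receives for the xyz column, except that summarize_() evaluates it once per group of cyl rather than over the whole data frame.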

AFAIK you cannot (yet?) supply the input "sum_metric_in" to dplyr::rename, which you would typically use to rename the column. That is why I renamed it differently in the example.
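
Since this answer was written, summarize_() and the other underscore verbs have been deprecated (as of dplyr 0.7.0) in favor of tidy evaluation. The following is a hedged sketch of how the same function might look with that newer interface, assuming dplyr 0.7.0 or later is installed. The .data pronoun looks up a column by its string name, and `!!name :=` takes the output column name from a variable, so no separate renaming step is needed. The names summarize_df, res and renamed are illustrative only.

```r
library(dplyr)

summarize_df <- function(data, sum_metric_in) {
  data %>%
    group_by(cyl) %>%
    summarize(
      qsec = mean(qsec),
      # the left side of := is unquoted with !!, so the new column is
      # named after the string held in sum_metric_in
      !!sum_metric_in := mean(.data[[sum_metric_in]])
    )
}

res <- summarize_df(mtcars, "mpg")
print(res)

# The same unquoting also lets rename() take the new name from a variable
new_name <- "mpg"
renamed <- mtcars %>%
  group_by(cyl) %>%
  summarize(xyz = mean(mpg)) %>%
  rename(!!new_name := xyz)
```

The result has the same three columns as the lazyeval version (cyl, qsec and the requested metric), returned as a tibble.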
