Specify dplyr column names

Question

How can I pass a column name to dplyr when I do not know the name in advance and want to specify it through a variable?

This works:

library(dplyr)
df <- as.data.frame(matrix(1:9, ncol = 3, nrow = 3))
df$group <- c("A", "B", "A")
# %.% was dplyr's early pipe operator (later replaced by %>%)
gdf <- df %.% group_by(group) %.% summarise(m1 = mean(V1), m2 = mean(V2), m3 = mean(V3))

But this does not:

library(dplyr)
# someColumn holds the name of the grouping column as a string
someColumn <- "group"
df <- as.data.frame(matrix(1:9, ncol = 3, nrow = 3))
df$group <- c("A", "B", "A")
gdf <- df %.% group_by(someColumn) %.% summarise(m1 = mean(V1), m2 = mean(V2), m3 = mean(V3))


Recommended answer

I just gave a similar answer to the question "Group by multiple columns in dplyr, using string vector input", but for good measure: functions that let you operate on columns using strings have been added to dplyr. They have the same names as the regular dplyr functions but end in an underscore (for example, group_by_ instead of group_by). These functions are described in detail in a dplyr vignette.

Given df and someColumn from the question, this now works a treat:

gdf <- df %>% group_by_(someColumn) %>% summarise(m1=mean(V1),m2=mean(V2),m3=mean(V3))

Note that the function is group_by_, not group_by, and that the %>% operator is used because %.% is deprecated.
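
For readers on a recent dplyr: the underscore-suffixed verbs such as group_by_ have themselves been deprecated since dplyr 0.7.0. The sketch below shows one possible way to get the same grouping from a string, assuming dplyr 1.0.0 or later, where across() and all_of() are available. It reuses df and someColumn from the question; it is an illustration of the idea, not the original answerer's code.

```r
library(dplyr)

# Same data and variable as in the question
someColumn <- "group"
df <- as.data.frame(matrix(1:9, ncol = 3, nrow = 3))
df$group <- c("A", "B", "A")

# all_of() selects the column whose name is stored in someColumn;
# across() lets group_by() accept that selection (assumes dplyr >= 1.0.0)
gdf <- df %>%
  group_by(across(all_of(someColumn))) %>%
  summarise(m1 = mean(V1), m2 = mean(V2), m3 = mean(V3))

print(gdf)
```

Because all_of() takes a character vector, the same pattern should also cover grouping by several columns whose names are stored in one vector.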
