Column names with brackets or other punctuation in dplyr group_by
Question
I have an imported data frame whose column names contain various punctuation characters, including parentheses, e.g. BILLNG.STATUS.(COMPLETED./.INCOMPLTE).
I was trying to use group_by from dplyr to do some summarizing, something like
df <- df %>% group_by(ORDER.NO, BILLNG.STATUS.(COMPLETED./.INCOMPLTE))
This results in the error: Error in mutate_impl(.data, dots) : could not find function "BILLNG.STATUS".
Short of changing the column names, is there a way to handle such column names directly in group_by?
Recommended answer
I think you can make this work if you enclose the "illegal" (non-syntactic) column names in backticks. For example, let's say I start with this data frame (called df):
  BILLING.STATUS.(COMPLETED./.INCOMPLETE) ORDER.VALUE.(USD)
1                                       A        0.01544196
2                                       A        0.95522706
3                                       B        1.13479303
4                                       B        1.22848285
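For reference, a data frame with names like these can be built in base R by passing check.names = FALSE, which stops data.frame() from rewriting names it considers syntactically invalid. This is only a minimal sketch using the values printed above; how the original data was actually imported is not known.

```r
# Sketch: rebuild the example data frame shown above.
# check.names = FALSE keeps the non-syntactic names exactly as written.
df <- data.frame(
  `BILLING.STATUS.(COMPLETED./.INCOMPLETE)` = c("A", "A", "B", "B"),
  `ORDER.VALUE.(USD)` = c(0.01544196, 0.95522706, 1.13479303, 1.22848285),
  check.names = FALSE
)

# Backticks are needed whenever such a name is used as a symbol.
names(df)
df$`ORDER.VALUE.(USD)`
```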
Then I can summarise like this:
df %>%
  group_by(`BILLING.STATUS.(COMPLETED./.INCOMPLETE)`) %>%
  summarise(count = n(),
            mean = mean(`ORDER.VALUE.(USD)`))
which gives:
  BILLING.STATUS.(COMPLETED./.INCOMPLETE) count      mean
1                                       A     2 0.4853345
2                                       B     2 1.1816379
Backticks also come in handy for referring to or creating variable names with whitespace. You can find a number of questions related to dplyr and backticks on SO, and there's also some discussion of backticks in the help for Quotes (see ?Quotes).
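As a small illustration of the whitespace case, here is a sketch in base R only. The names `order value` and orders are made up for the example and do not come from the original question.

```r
# Sketch: backticks with names that contain whitespace (base R only).
`order value` <- c(10, 20, 30)   # create a variable with a space in its name
total <- sum(`order value`)      # refer to it again with backticks

# The same applies to data frame columns.
orders <- data.frame(`order value` = `order value`, check.names = FALSE)
orders$`order value doubled` <- orders$`order value` * 2
names(orders)
```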