Name columns within aggregate in R

Question

I know I can *re*name columns after I aggregate the data:

blubb <- aggregate(dat$two ~ dat$one, ...)
colnames(blubb) <- c("One", "Two")

Nothing wrong with that. But is there a way to aggregate and name the columns in one go? Sort of like:

blubb <- aggregate( ... , cols = c("One", "Two"))

It would be especially nice (and typo-proof) to somehow capture the original column names and do something like:

blubb <- aggregate( ... , cols = c(name_of_dat$one, name_of_dat$two."_Mean"))


Recommended answer

You can use setNames as follows:

blubb <- setNames(aggregate(dat$two ~ dat$one, ...), c("One", "Two"))
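As a concrete, runnable illustration of the same idea, here is a minimal sketch that uses the built-in chickwts data set in place of dat. The data set, the choice of mean, and the new column names are assumptions made for the example. The second call builds the new names from the result's own names, which avoids retyping them:

```r
# Aggregate and rename in one expression (chickwts stands in for `dat`)
blubb <- setNames(
  aggregate(weight ~ feed, data = chickwts, FUN = mean),
  c("Feed", "Weight_Mean")
)

# Typo-proof variant: derive the new names from the aggregate's own output
res <- aggregate(weight ~ feed, data = chickwts, FUN = mean)
blubb2 <- setNames(res, c(names(res)[1], paste0(names(res)[2], "_Mean")))
names(blubb2)
```

The second form still needs two steps, but no column name is typed by hand.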

Alternatively, you can bypass the slick formula method, and use syntax like:

blubb <- aggregate(list(Two = dat$two), by = list(One = dat$one), ...)
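A runnable sketch of this non-formula form, again assuming the built-in chickwts data and mean as the summary function (both are choices made for the example). In the data-frame method of aggregate, the first argument holds the values to summarise and by holds the grouping variables; the names given inside each list become the output column names:

```r
out <- aggregate(
  list(Weight_Mean = chickwts$weight),  # values to aggregate
  by = list(Feed = chickwts$feed),      # grouping variable(s)
  FUN = mean
)
names(out)
```

The grouping columns come first in the result, followed by the aggregated columns.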

Update

This update is just to help get you started on deriving a solution on your own.

If you inspect the code for stats:::aggregate.formula, you'll see the following lines towards the end:

if (is.matrix(mf[[1L]])) {
    lhs <- as.data.frame(mf[[1L]])
    names(lhs) <- as.character(m[[2L]][[2L]])[-1L]
    aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...)
}
else aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...)

If all that you want to do is append the function name to the variable that was aggregated, perhaps you can change that to something like:

if (is.matrix(mf[[1L]])) {
  lhs <- as.data.frame(mf[[1L]])
  names(lhs) <- as.character(m[[2L]][[2L]])[-1L]
  myOut <- aggregate.data.frame(lhs, mf[-1L], FUN = FUN, ...)
  colnames(myOut) <- c(names(mf[-1L]), 
                       paste(names(lhs), deparse(substitute(FUN)), sep = "."))
}
else {
  myOut <- aggregate.data.frame(mf[1L], mf[-1L], FUN = FUN, ...)
  colnames(myOut) <- c(names(mf[-1L]), 
                       paste(strsplit(gsub("cbind\\(|\\)|\\s", "", 
                                           names(mf[1L])), ",")[[1]],
                             deparse(substitute(FUN)), sep = "."))
} 
myOut

This basically captures the expression entered for FUN by using deparse(substitute(FUN)), so you can probably modify the function to accept a custom suffix, or perhaps even a vector of suffixes. This can probably be improved a bit with some work, but I'm not going to do it!
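One way the custom-suffix idea could look, without editing the internals of aggregate.formula, is a thin wrapper that renames the columns afterwards. This is only a sketch: agg_suffix and its suffix argument are hypothetical names, not part of stats or of the Gist mentioned in this answer, and it assumes the right-hand side of the formula lists plain variable names (not `.`).

```r
# Hypothetical wrapper: aggregate via the formula interface, then append a
# suffix to every column that is not a grouping variable.
agg_suffix <- function(formula, data, FUN, ..., suffix = NULL) {
  # Capture the expression typed for FUN (e.g. "mean") before anything else
  fun_label <- paste(deparse(substitute(FUN)), collapse = " ")
  if (is.null(suffix)) suffix <- fun_label

  out <- aggregate(formula, data, FUN, ...)

  # Variables on the right-hand side of the formula are the grouping columns
  groups <- all.vars(formula[[3L]])
  rename <- !(names(out) %in% groups)
  names(out)[rename] <- paste(names(out)[rename], suffix, sep = ".")
  out
}

names(agg_suffix(weight ~ feed, chickwts, mean))
names(agg_suffix(breaks ~ wool + tension, warpbreaks, sum, suffix = "total"))
```

When suffix is left as NULL the wrapper falls back to the deparsed FUN expression, mirroring the behaviour described above.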

Here is a Gist with this concept applied, creating a function named "myAgg".

Here is some sample output of just the resulting column names:

> names(myAgg(weight ~ feed, data = chickwts, mean))
[1] "feed"        "weight.mean"
> names(myAgg(breaks ~ wool + tension, data = warpbreaks, sum))
[1] "wool"       "tension"    "breaks.sum"
> names(myAgg(weight ~ feed, data = chickwts, FUN = function(x) mean(x^2)))
[1] "feed"                         "weight.function(x) mean(x^2)"

Notice that only the aggregated variable name changes. But notice also that if you use a custom function, you'll end up with a really strange column name!
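The strange name comes from deparse(substitute(FUN)) returning whatever expression was typed for FUN. The small sketch below isolates that behaviour; fun_label and mean_sq are hypothetical names used only for illustration. Binding the custom function to a name first should give a cleaner label:

```r
# Hypothetical helper that mimics how the label is captured
fun_label <- function(FUN) deparse(substitute(FUN))

fun_label(mean)                   # "mean"
fun_label(function(x) mean(x^2))  # "function(x) mean(x^2)"

# Workaround: give the custom function a name before passing it in
mean_sq <- function(x) mean(x^2)
fun_label(mean_sq)                # "mean_sq"
```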
