Does aggregate() guarantee that the result will be ordered by the grouping columns?
Question
I've noticed that aggregate() appears to return its result ordered by the grouping column(s). Is this guaranteed? Can surrounding logic rely on it?
A couple of examples:
set.seed(1)
df <- data.frame(group = sample(letters[1:3], 10, replace = TRUE), value = 1:10)
aggregate(value ~ group, df, sum)
##   group value
## 1     a    16
## 2     b    22
## 3     c    17
And with two grouping columns (notice that the result is ordered by the second grouping column first, with the first grouping column breaking ties):
set.seed(1)
df <- data.frame(
  group1 = sample(letters[1:3], 10, replace = TRUE),
  group2 = sample(letters[4:6], 10, replace = TRUE),
  value = 1:10
)
aggregate(value ~ group1 + group2, df, sum)
##   group1 group2 value
## 1      a      d     1
## 2      b      d     2
## 3      b      e     9
## 4      c      e    10
## 5      a      f    15
## 6      b      f    11
## 7      c      f     7
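One way to check this ordering without depending on the random number generator is sketched below. The data frame `two_keys` and its values are made up for illustration; the check compares the row order returned by aggregate() with the order that order() would produce when sorting by the second grouping column and then the first.

```r
# Hand-written data (hypothetical), deliberately not sorted by either key.
two_keys <- data.frame(
  group1 = c("c", "a", "b", "b", "a", "c", "b", "a"),
  group2 = c("f", "f", "d", "e", "d", "e", "f", "f"),
  value  = 1:8
)

res_two <- aggregate(value ~ group1 + group2, two_keys, sum)

# as.character() so the check works whether the keys come back as
# character vectors or as factors.
g1 <- as.character(res_two$group1)
g2 <- as.character(res_two$group2)

# TRUE if the rows are already in (group2, group1) order.
already_sorted <- identical(order(g2, g1), seq_len(nrow(res_two)))
print(res_two)
print(already_sorted)
```

If surrounding code depends on a particular row order, sorting the result explicitly with order() makes that dependency visible instead of implicit.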
Note: I'm asking because I just came up with an answer for "Aggregating while merging two dataframes in R" which, at least in its current form at the time of writing, depends on aggregate() returning its result ordered by the grouping column.
Recommended answer
Yes, as long as you understand the natural ordering of factors to be by their integer codes (that is, by the order of their levels, which is not necessarily alphabetical). You can see this in the source of aggregate.data.frame:
y <- as.data.frame(by, stringsAsFactors = FALSE)
... # y becomes the "integerized" dataframe of index vectors
grp <- rank(do.call(paste, c(lapply(rev(y), ident), list(sep = "."))),
            ties.method = "min")
y <- y[match(sort(unique(grp)), grp, 0L), , drop = FALSE]
...
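To illustrate what ordering by integer codes means in practice, here is a minimal sketch with made-up data. The factor `group` is given a non-alphabetical level order (an assumption chosen for the demonstration), and the result is expected to follow that level order rather than the alphabet.

```r
# Hypothetical data: levels are deliberately declared as c, b, a.
by_level <- data.frame(
  group = factor(c("b", "a", "c", "a", "b"), levels = c("c", "b", "a")),
  value = 1:5
)

res_level <- aggregate(value ~ group, by_level, sum)

# Row order follows the factor's level order (c, b, a), not a, b, c.
group_order <- as.character(res_level$group)
print(res_level)
```

So "ordered by the grouping columns" holds in the sense of level order. For character columns the levels are derived by sorting the unique values, which is why the results in the question look alphabetical.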