How to sum a variable by group
Question
I have a data frame with two columns. The first column contains categories such as "First", "Second", and "Third", and the second column holds the number of times I saw each of those categories.
For example:
Category Frequency
First 10
First 15
First 5
Second 2
Third 14
Third 20
Second 3
I want to sort the data by Category and sum all the Frequencies:
Category Frequency
First 30
Second 5
Third 34
How would I do this in R?
Recommended answer
Using aggregate:
aggregate(x$Frequency, by=list(Category=x$Category), FUN=sum)
Category x
1 First 30
2 Second 5
3 Third 34
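The sum column is labelled x in the output above because a bare vector was passed in. A minimal sketch, using the sample data from the question, that keeps the original column name by passing one-column data frames instead (the variable name res is only for illustration):

```r
# Sample data from the question
x <- data.frame(Category = factor(c("First", "First", "First", "Second",
                                    "Third", "Third", "Second")),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))

# x["Frequency"] and x["Category"] are one-column data frames (named lists),
# so aggregate() carries their column names through to the result
res <- aggregate(x["Frequency"], by = x["Category"], FUN = sum)
print(res)
```

The rows come back ordered by the levels of Category, which also covers the sorting part of the question.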
In the example above, multiple grouping dimensions can be specified in the list passed to by. Multiple aggregated metrics of the same data type can be combined via cbind:
aggregate(cbind(x$Frequency, x$Metric2, x$Metric3) ...
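As a rough sketch of both ideas, assuming the sample data from the question plus two made-up columns: Metric2 and Region are hypothetical and exist only for this illustration, as do the result names by_cat and by_cat_region.

```r
# Sample data from the question
x <- data.frame(Category = factor(c("First", "First", "First", "Second",
                                    "Third", "Third", "Second")),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))
x$Metric2 <- c(1, 2, 3, 4, 5, 6, 7)                 # hypothetical second metric
x$Region  <- c("A", "A", "B", "A", "A", "B", "A")   # hypothetical second dimension

# Two metrics summed per Category; naming the cbind() columns
# gives the output columns the same names
by_cat <- aggregate(cbind(Frequency = x$Frequency, Metric2 = x$Metric2),
                    by = list(Category = x$Category), FUN = sum)
print(by_cat)

# Two grouping dimensions: one row per observed Category/Region combination
by_cat_region <- aggregate(x$Frequency,
                           by = list(Category = x$Category, Region = x$Region),
                           FUN = sum)
print(by_cat_region)
```

Without names inside cbind, the metric columns in the result would get generic names (V1, V2), so naming them up front tends to be easier to read.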
aggregate also has a formula interface (incorporating @thelatemail's comment):
aggregate(Frequency ~ Category, x, sum)
Or, if you want to aggregate multiple columns, you could use the . notation (it works for one column too):
aggregate(. ~ Category, x, sum)
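To see what the . notation picks up, here is a small sketch using the question's data plus a hypothetical Metric2 column (not part of the original question); single and all_cols are illustrative names.

```r
# Sample data from the question
x <- data.frame(Category = factor(c("First", "First", "First", "Second",
                                    "Third", "Third", "Second")),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))
x$Metric2 <- c(1, 2, 3, 4, 5, 6, 7)   # hypothetical extra numeric column

# One named column on the left-hand side: only Frequency is summed
single <- aggregate(Frequency ~ Category, data = x, FUN = sum)
print(single)

# "." stands for every column not on the right-hand side,
# so both Frequency and Metric2 are summed per Category
all_cols <- aggregate(. ~ Category, data = x, FUN = sum)
print(all_cols)
```

One thing to keep in mind: the formula method defaults to na.action = na.omit, so rows containing missing values are dropped before summing. Also, every column covered by . has to be something sum can handle.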
Or tapply:
tapply(x$Frequency, x$Category, FUN=sum)
First Second Third
30 5 34
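tapply returns a named array rather than a data frame. If the two-column layout from the question is wanted, one possible conversion looks like this (sums and res are illustrative names):

```r
# Sample data from the question
x <- data.frame(Category = factor(c("First", "First", "First", "Second",
                                    "Third", "Third", "Second")),
                Frequency = c(10, 15, 5, 2, 14, 20, 3))

# Named array of per-category sums
sums <- tapply(x$Frequency, x$Category, FUN = sum)

# Rebuild a data frame: the names hold the categories, the values hold the sums
res <- data.frame(Category = names(sums), Frequency = as.vector(sums))
print(res)
```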
Using this data:
x <- data.frame(Category=factor(c("First", "First", "First", "Second",
                                  "Third", "Third", "Second")),
                Frequency=c(10,15,5,2,14,20,3))