Aggregate data in one column based on values in another column
Question
I know there is an easy way to do this, but I can't figure it out.
I have a data frame in my R script that looks something like this:
A B C
1.2 4 8
2.3 4 9
2.3 6 0
1.2 3 3
3.4 2 1
1.2 5 1
Note that A, B, and C are column names. I'm trying to get variables like this:
sum1 <- [the sum of all B values such that A is 1.2]
num1 <- [the number of times A is 1.2]
Is there an easy way to do this? I basically want to end up with a data frame that looks like this:
A num totalB
1.2 3 12
etc etc etc
Here "num" is the number of times a particular A value appears, and "totalB" is the sum of the B values for that A value.
Recommended answer
I'd use aggregate to compute the two summaries (the count and the sum of B for each value of A) and then merge them into a single data frame:
> df
A B C
1 1.2 4 8
2 2.3 4 9
3 2.3 6 0
4 1.2 3 3
5 3.4 2 1
6 1.2 5 1
> num <- aggregate(B~A,df,length)
> names(num)[2] <- 'num'
> totalB <- aggregate(B~A,df,sum)
> names(totalB)[2] <- 'totalB'
> merge(num,totalB)
A num totalB
1 1.2 3 12
2 2.3 2 10
3 3.4 1 2
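As a possible variation on the same idea, both summaries can be computed in one aggregate call by passing a function that returns a named vector. This is only a sketch under the assumption that the data frame is called df and has the columns shown in the question; the names res, sum1, and num1 are illustrative. The last two lines show one way to get the single-value variables the question asks about for A equal to 1.2.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  A = c(1.2, 2.3, 2.3, 1.2, 3.4, 1.2),
  B = c(4, 4, 6, 3, 2, 5),
  C = c(8, 9, 0, 3, 1, 1)
)

# One aggregate() call: for each value of A, return the count and the sum of B.
res <- aggregate(B ~ A, data = df,
                 FUN = function(x) c(num = length(x), totalB = sum(x)))

# aggregate() stores the two statistics as a matrix column named B;
# do.call(data.frame, ...) flattens it into ordinary columns.
res <- do.call(data.frame, res)
names(res) <- c("A", "num", "totalB")
print(res)

# Single values for one particular A, as in the question's pseudocode.
sum1 <- sum(df$B[df$A == 1.2])  # sum of B where A is 1.2
num1 <- sum(df$A == 1.2)        # number of rows where A is 1.2
```

The resulting res should contain the same three rows as the merged table above. Comparing doubles with == works here because both sides come from the same literal 1.2; for computed values, a tolerance-based comparison would be safer.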