How to perform sum and count on a data frame in R


Question

I have a data frame with ID, Type, and Area columns, and I would like to perform two operations on it at once (a count and a sum):

ID         Type         Area     
1           Aa          0.02    
2           Ag          0.12    
2           Ag          0.14    
2           Ag          0.80    
2           Bm          0.20    
2           Xm          0.13    

The expected result is:

ID          Type       count      area     
1           Aa           1        0.02    
2           Ag           3        1.06 (sum)    
2           Bm           1        0.20    
2           Xm           1        0.13 

I have up to 100-150 IDs, and the types, counts, and basal areas vary within the same ID. What would be the best approach to compute the sum and the count at the same time while keeping Type and ID in the data frame?

Thanks

Recommended answer

1) Base R -- aggregate. A count is just the sum of a constant column of ones, so using DF (shown reproducibly in the Note at the end) we add such a column and aggregate both columns with sum. No packages are used.

aggregate(cbind(Count, Area) ~ ID + Type, transform(DF, Count = 1), sum)

giving:

  ID Type Count Area
1  1   Aa     1 0.02
2  2   Ag     3 1.06
3  2   Bm     1 0.20
4  2   Xm     1 0.13
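To make the column-of-ones trick concrete, here is a minimal sketch that splits approach 1 into its two steps. It assumes a hand-built copy of the sample data (so the block runs on its own); the names DF1 and res are illustrative only and do not come from the original answer.

```r
# Hand-built copy of the sample data from the Note (an assumption made
# so that this block is self-contained)
DF <- data.frame(
  ID   = c(1, 2, 2, 2, 2, 2),
  Type = c("Aa", "Ag", "Ag", "Ag", "Bm", "Xm"),
  Area = c(0.02, 0.12, 0.14, 0.80, 0.20, 0.13)
)

# Step 1: add a constant column of ones, one per row
DF1 <- transform(DF, Count = 1)

# Step 2: summing within each ID/Type group turns the ones into a row
# count, while Area is summed in the same pass
res <- aggregate(cbind(Count, Area) ~ ID + Type, DF1, sum)
res
```

Because both columns go through the same summary function, this pattern only works when the count can be expressed as a sum; the later approaches compute the count directly.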

2) Base R -- by. An approach that uses only base R and does not rely on the trick of adding a column of ones is to use by. The by call produces a list of class by, and do.call("rbind", ...) converts that list to a data frame.

do.call("rbind", by(DF, DF[1:2], with, 
  data.frame(ID = ID[1], Type = Type[1], Count = length(ID), Area = sum(Area))))

giving:

  ID Type Count Area
1  1   Aa     1 0.02
2  2   Ag     3 1.06
3  2   Bm     1 0.20
4  2   Xm     1 0.13
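The same idea can be written with an explicit function in place of with, which may be easier to read. This is a sketch under the same assumptions as the original (a hand-built copy of the sample data is used here so the block runs on its own); the names summarise_group, pieces, and res are hypothetical.

```r
# Hand-built copy of the sample data from the Note (an assumption made
# so that this block is self-contained)
DF <- data.frame(
  ID   = c(1, 2, 2, 2, 2, 2),
  Type = c("Aa", "Ag", "Ag", "Ag", "Bm", "Xm"),
  Area = c(0.02, 0.12, 0.14, 0.80, 0.20, 0.13)
)

# Summarise one ID/Type group into a one-row data frame
summarise_group <- function(d) {
  data.frame(ID = d$ID[1], Type = d$Type[1],
             Count = nrow(d), Area = sum(d$Area))
}

# by() applies the function to each non-empty ID/Type combination;
# combinations with no rows come back as NULL and are dropped by rbind
pieces <- by(DF, DF[c("ID", "Type")], summarise_group)
res <- do.call("rbind", pieces)
res
```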

3) sqldf. SQL allows count and sum to be applied separately and simultaneously in a single query.

library(sqldf)
sqldf("select ID, Type, count(*) as Count, sum(Area) as Area
  from DF
  group by 1, 2")

giving:

  ID Type Count Area
1  1   Aa     1 0.02
2  2   Ag     3 1.06
3  2   Bm     1 0.20
4  2   Xm     1 0.13

4) data.table. The data.table package can also be used.

library(data.table)

DT <- as.data.table(DF)
DT[, .(Count = .N, Area = sum(Area)), by = "ID,Type"]

giving:

   ID Type Count Area
1:  1   Aa     1 0.02
2:  2   Ag     3 1.06
3:  2   Bm     1 0.20
4:  2   Xm     1 0.13

Note

Lines <- "ID         Type         Area     
1           Aa          0.02    
2           Ag          0.12    
2           Ag          0.14    
2           Ag          0.80    
2           Bm          0.20    
2           Xm          0.13 "

DF <- read.table(text = Lines, header = TRUE)
