Remove duplicates and sum values in R
Question
I have a dataset, df, that looks like this but has a few million instances:
Date       AD Runway MTOW nr.flights
2008-01-01 A  18     376  2
2008-01-01 A  18     376  2
2008-01-01 D  36     190  1
2008-01-02 D  09     150  2
2008-01-02 A  36     280  1
2008-01-02 A  36     280  1
And I want it to look like this:
Date       AD Runway MTOW nr.flights
2008-01-01 A  18     752  4
2008-01-01 D  36     190  1
2008-01-02 D  9      150  2
2008-01-02 A  36     560  2
Basically I want to group together all the rows that share the same Date, AD and Runway, so that all the duplicates are removed. At the same time, I want MTOW and nr.flights to be summed up for that particular combination of Date, AD and Runway.
I have tried:
vals <- expand.grid(Date = unique(df$Date),
                    Runway = unique(df$Runway),
                    AD = unique(df$AD))
So I could merge this with the original dataset, df, but that didn't work. I have also tried a few combinations of group_by but that also didn't give me the result that I wanted.
To reproduce:
df <- data.frame(Date=c("2008-01-01","2008-01-01","2008-01-01","2008-01-02","2008-01-02","2008-01-02"),
AD = c("A", "A", "D", "D", "A", "A"), Runway = c(18, 18, 36, 09, 36,36),
MTOW = c(376, 376, 190, 150, 280, 280), nr.flights = c(2,2,1,2,1,1))
Any help would be much appreciated!
Recommended answer
Here is one way, using the plyr package:
library(plyr)
ddply(df,~Date + AD + Runway,summarise,MTOW=sum(MTOW),nr.flights=sum(nr.flights))
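For comparison, the same grouping and summing can probably be done without any extra package. The following is only a minimal sketch using base R's aggregate() on the reproducible data from the question; the name res is introduced here for illustration, and this is not part of the original answer.

```r
# Reproducible data from the question (Runway 09 is stored as the number 9)
df <- data.frame(
  Date = c("2008-01-01", "2008-01-01", "2008-01-01",
           "2008-01-02", "2008-01-02", "2008-01-02"),
  AD = c("A", "A", "D", "D", "A", "A"),
  Runway = c(18, 18, 36, 9, 36, 36),
  MTOW = c(376, 376, 190, 150, 280, 280),
  nr.flights = c(2, 2, 1, 2, 1, 1)
)

# Sum MTOW and nr.flights within each Date / AD / Runway combination;
# cbind() on the left-hand side aggregates both columns in one call
res <- aggregate(cbind(MTOW, nr.flights) ~ Date + AD + Runway,
                 data = df, FUN = sum)

# Sort by the grouping columns so the result is easier to read
res <- res[order(res$Date, res$AD, res$Runway), ]
print(res)
```

Each Date, AD and Runway combination ends up as a single row, so the two identical A/18 rows on 2008-01-01 collapse into one row with MTOW 752 and 4 flights. With a few million rows, base aggregate() may be slow; packages such as dplyr or data.table offer grouped summaries that are likely faster, though that is not measured here.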