Remove duplicates and sum values in R

Problem description

I have a dataset, df, that looks like this but has a few million rows:


Date       AD Runway MTOW nr.flights
2008-01-01 A  18     376  2
2008-01-01 A  18     376  2
2008-01-01 D  36     190  1
2008-01-02 D  09     150  2
2008-01-02 A  36     280  1
2008-01-02 A  36     280  1

And I want it to look like this:


Date       AD Runway MTOW nr.flights
2008-01-01 A  18     752  4
2008-01-01 D  36     190  1
2008-01-02 D  9      150  2
2008-01-02 A  36     560  2

Basically, I want to group together all rows that share the same Date, AD and Runway, so that all duplicates are removed. At the same time, I want MTOW and nr.flights to be summed for each particular combination of Date, AD and Runway.

I have tried:

vals <- expand.grid(Date = unique(df$Date),
                    Runway = unique(df$Runway),
                    AD = unique(df$AD))

The idea was to merge this with the original dataset, df, but that didn't work. I have also tried a few combinations of group_by, but those didn't give me the result I wanted either.

To reproduce:

df <- data.frame(Date=c("2008-01-01","2008-01-01","2008-01-01","2008-01-02","2008-01-02","2008-01-02"),
              AD = c("A", "A", "D", "D", "A", "A"), Runway = c(18, 18, 36, 09, 36,36), 
              MTOW = c(376, 376, 190, 150, 280, 280), nr.flights = c(2,2,1,2,1,1))

Any help would be much appreciated!

Recommended answer

Here is one approach using the plyr package:

library(plyr)
# Split df by Date, AD and Runway, then sum MTOW and nr.flights within each group
ddply(df, ~ Date + AD + Runway, summarise,
      MTOW = sum(MTOW), nr.flights = sum(nr.flights))
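
If installing plyr is not an option, the same grouping and summing can probably be done with base R's aggregate(). The following is a minimal sketch rather than part of the original answer; it rebuilds the sample data from the question, and the name result is only an illustrative choice.

```r
# Sample data from the question (Runway 09 is stored as the number 9)
df <- data.frame(
  Date = c("2008-01-01", "2008-01-01", "2008-01-01",
           "2008-01-02", "2008-01-02", "2008-01-02"),
  AD = c("A", "A", "D", "D", "A", "A"),
  Runway = c(18, 18, 36, 9, 36, 36),
  MTOW = c(376, 376, 190, 150, 280, 280),
  nr.flights = c(2, 2, 1, 2, 1, 1)
)

# cbind() on the left-hand side names the columns to sum;
# the right-hand side names the grouping columns.
result <- aggregate(cbind(MTOW, nr.flights) ~ Date + AD + Runway,
                    data = df, FUN = sum)

# Sort for readability; aggregate() does not order rows by Date first
result <- result[order(result$Date, result$AD, result$Runway), ]
print(result)
```

On the sample data this should return one row per distinct Date, AD and Runway combination, with the duplicated rows collapsed and their MTOW and nr.flights values added together. For a dataset with a few million rows, a grouped-summary approach in a package such as dplyr or data.table may be faster, but that has not been measured here.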
