Removing duplicate dates based on another column in R

Problem description

I have a timeseries with multiple entries for some hours.

                 date  wd  ws temp sol octa pg  mh daterep
1 2007-01-01 00:00:00 100 1.5  9.0   0    8  D 100   FALSE
2 2007-01-01 01:00:00  90 2.6  9.0   0    7  E  50    TRUE
3 2007-01-01 01:00:00  90 2.6  9.0   0    8  D 100    TRUE
4 2007-01-01 02:00:00  40 1.0  8.8   0    7  F  50   FALSE
5 2007-01-01 03:00:00  20 2.1  8.0   0    8  D 100   FALSE
6 2007-01-01 04:00:00  30 1.0  8.0   0    8  D 100   FALSE

I need to get to a time series with one entry per hour, taking the entry with the minimum mh value where there are multiple entries. (So in the data above my second entry should be row 2 and row 3 should be removed.) I've been working on both approaches: picking out what I want into a new dataframe, and removing what I don't want in the existing, but not getting anywhere. Thanks for your help.

Recommended answer

You could sort your data by date and mh using plyr::arrange, then remove duplicates:

df <- read.table(textConnection("

               date    wd  ws temp sol octa pg  mh daterep
'2007-01-01 00:00:00' 100 1.5  9.0   0    8  D 100   FALSE
'2007-01-01 01:00:00'  90 2.6  9.0   0    7  E  50    TRUE
'2007-01-01 01:00:00'  90 2.6  9.0   0    8  D 100    TRUE
'2007-01-01 02:00:00'  40 1.0  8.8   0    7  F  50   FALSE
'2007-01-01 03:00:00'  20 2.1  8.0   0    8  D 100   FALSE
'2007-01-01 04:00:00'  30 1.0  8.0   0    8  D 100   FALSE

"), header = TRUE)

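library(plyr)
# Sort by date, then by mh ascending, so the smallest mh comes first within each hour
df <- arrange(df, date, mh)
# duplicated() flags every repeat after the first, so only the minimum-mh row per date survives
df <- df[!duplicated(df$date), ]
df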
#                  date  wd  ws temp sol octa pg  mh daterep
# 1 2007-01-01 00:00:00 100 1.5  9.0   0    8  D 100   FALSE
# 2 2007-01-01 01:00:00  90 2.6  9.0   0    7  E  50    TRUE
# 4 2007-01-01 02:00:00  40 1.0  8.8   0    7  F  50   FALSE
# 5 2007-01-01 03:00:00  20 2.1  8.0   0    8  D 100   FALSE
# 6 2007-01-01 04:00:00  30 1.0  8.0   0    8  D 100   FALSE
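
If you would rather not load plyr, the same sort-then-deduplicate idea can probably be written in base R with order(). This is only a sketch: the sample rows are retyped from the question so the block runs on its own, and the names sorted and hourly are illustrative, not part of the original answer.

```r
# Sample data retyped from the question (dates kept as character strings).
df <- read.table(text = "
date wd ws temp sol octa pg mh daterep
'2007-01-01 00:00:00' 100 1.5 9.0 0 8 D 100 FALSE
'2007-01-01 01:00:00'  90 2.6 9.0 0 7 E  50 TRUE
'2007-01-01 01:00:00'  90 2.6 9.0 0 8 D 100 TRUE
'2007-01-01 02:00:00'  40 1.0 8.8 0 7 F  50 FALSE
'2007-01-01 03:00:00'  20 2.1 8.0 0 8 D 100 FALSE
'2007-01-01 04:00:00'  30 1.0 8.0 0 8 D 100 FALSE
", header = TRUE, stringsAsFactors = FALSE)

# order() sorts by date first and breaks ties with mh (ascending),
# so the smallest mh is the first row within each hour.
sorted <- df[order(df$date, df$mh), ]

# Keep only the first row seen for each date (hypothetical name: hourly).
hourly <- sorted[!duplicated(sorted$date), ]
rownames(hourly) <- NULL  # renumber rows after dropping duplicates
hourly
```

The explicit sort matters when the duplicate rows are not already in ascending mh order; without it, duplicated() would simply keep whichever row happens to appear first.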
