Subset dataframe where date is within x days of a vector of dates in R


Problem description


I have a vector of dates e.g.

dates <- c('2013-01-01', '2013-04-02', '2013-06-10', '2013-09-30')

And a dataframe which contains a date column e.g.

df <- data.frame(
                'date' = c('2013-01-04', '2013-01-22', '2013-10-01', '2013-10-10'),
                'a'    = c(1,2,3,4),
                'b'    = c('a', 'b', 'c', 'd')
                )

And I would like to subset the dataframe so that it only contains rows where the date is less than 5 days after any of the dates in the 'dates' vector.

i.e. the initial dataframe looks like this:

date       a b 
2013-01-04 1 a
2013-01-22 2 b
2013-10-01 3 c
2013-10-10 4 d

After the query I would be left with only the first and third rows (since 2013-01-04 is within 5 days of 2013-01-01, and 2013-10-01 is within 5 days of 2013-09-30).

Does anyone know of the best way to do this?

Thanks in advance

Solution

This is easy (and very fast) to do with a data.table rolling join:

library(data.table)
dt = data.table(df)

# convert to Date (or IDate) to have numbers instead of strings for dates
# also set the key for dates for the join
dt[, date := as.Date(date)]
dates = data.table(date = as.Date(dates), key = 'date')

# join with a roll of 5 days, throwing out dates that don't match
dates[dt, roll = 5, nomatch = 0]
#         date a b
#1: 2013-01-04 1 a
#2: 2013-10-01 3 c
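
For comparison, the same filter can be sketched in base R without a join. This is only an illustration, not part of the answer above: the names gap, window, keep and result are made up here, and the sketch assumes a row qualifies when it falls 0 to 5 days (inclusive) after at least one reference date. Change <= to < if "less than 5 days" should be read strictly.

```r
# Same example data as in the question, converted to Date up front
dates <- as.Date(c('2013-01-01', '2013-04-02', '2013-06-10', '2013-09-30'))
df <- data.frame(
  date = as.Date(c('2013-01-04', '2013-01-22', '2013-10-01', '2013-10-10')),
  a    = c(1, 2, 3, 4),
  b    = c('a', 'b', 'c', 'd')
)

# gap[i, j] = days from reference date j to row i's date
# (negative when the row's date comes before the reference date)
gap <- outer(as.numeric(df$date), as.numeric(dates), `-`)

# Assumed window: 0 to 5 days after a reference date, boundary included
window <- 5
keep <- rowSums(gap >= 0 & gap <= window) > 0

result <- df[keep, ]
```

This builds a full nrow(df) by length(dates) matrix, so it is fine for small inputs but will use far more memory than the rolling join on large tables.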
