Join tables by date range
Question
I am looking for a simple method to join two tables by date range. One table contains an exact date; the other contains two variables identifying the beginning and end of a time period. I need to join the tables when the date in the first table falls within the range from the second table.
library(data.table)
# Input tables
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1','id2','id3','id4'))
data2 <- data.table(beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                    ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                    class = c(1,2,3,4))
# Desired result
result <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                     beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                     ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                     name = c('id1','id2','id3','id4'),
                     class = c(1,2,3,4))
Any help, please? I found a few complicated examples, but they don't even work on my data because of the date formats. I need something like:
select * from data1
left join
select * from data2
where data2.beginning <= data1.date <= data2.ending
Thanks
Recommended answer
I know the following looks horrible in base R, but here's what I came up with. It's better to use the 'sqldf' package (see below).
library(data.table)
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1','id2','id3','id4'))
data2 <- data.table(beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                    ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                    class = c(1,2,3,4))
# The dates are YYYY-MM-DD strings, so they compare correctly as plain strings.
# For each date in data1, find the first row of data2 whose range contains it
# (endpoints included); the index is NA when no range matches.
idx <- vapply(data1$date,
              function(d) which(data2$beginning <= d & d <= data2$ending)[1],
              integer(1), USE.NAMES = FALSE)
# An NA index yields a row of NAs, so unmatched dates are kept (left join).
result <- cbind(data1, data2[idx])
Using the sqldf package:
library(sqldf)
result = sqldf("select * from data1
left join data2
on data1.date between data2.beginning and data2.ending")
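Since both tables are already data.table objects, the same left join can probably also be written as a data.table non-equi join, without sqldf. The following is only a sketch, not part of the original answer. It assumes the date columns are first converted with `as.IDate`, and the names `d1`, `d2`, and `res` are made up for illustration.

```r
library(data.table)

# Same data as in the question, with the date columns converted to IDate
d1 <- data.table(date = as.IDate(c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09')),
                 name = c('id1', 'id2', 'id3', 'id4'))
d2 <- data.table(beginning = as.IDate(c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05')),
                 ending = as.IDate(c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13')),
                 class = c(1, 2, 3, 4))

# d2[d1, on = ...]: for each row of d1, look up the rows of d2 whose
# [beginning, ending] range contains d1$date. Rows of d1 without a match
# are kept with NA in the d2 columns (default nomatch = NA), i.e. a left join.
# The x. and i. prefixes select d2's and d1's columns explicitly, because
# after a non-equi join the join columns would otherwise carry d1's date values.
res <- d2[d1, on = .(beginning <= date, ending >= date),
          .(date = i.date, name = i.name,
            beginning = x.beginning, ending = x.ending, class = x.class)]
```

If one date can fall inside several ranges, this returns one row per matching range, as the SQL version does.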