Join tables by date range


Question

I am looking for a simple method to join two tables by date range. One table contains exact dates; the other contains two variables identifying the beginning and end of a time period. I need to join the tables where the date in the first table falls within the range from the second table.

data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                name = c('id1','id2','id3','id4'))


data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                class = c(1,2,3,4))

result <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                 beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                 ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                 name = c('id1','id2','id3','id4'),
                 class = c(1,2,3,4))

Any help, please? I found a few complicated examples, but they don't even work on my data because of the date formats. I need something like:

select *
from data1
left join data2
  on data1.date between data2.beginning and data2.ending

Thanks

Recommended answer

I know the following looks clumsy in base R, but here's what I came up with. It's better to use the sqldf package (see below).

library(data.table)
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1','id2','id3','id4'))


data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                    ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                    class = c(1,2,3,4))

# For each date, find the first data2 row whose range contains it (ISO date strings sort as text).
idx <- vapply(data1$date, function(d) which(data2$beginning <= d & data2$ending >= d)[1],
              integer(1), USE.NAMES = FALSE)
# Unmatched dates give NA in idx, hence NA in the data2 columns.
result <- cbind(data1, data2[idx])

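For the base R route, a vectorised lookup with `findInterval` is one possible alternative to looping over rows. The sketch below is not part of the original answer. It assumes `data2` is sorted by `beginning` and that its ranges do not overlap, and it uses plain data frames so that it runs without any packages. The helper names `d`, `b`, `e` and `idx` are made up for illustration.

```r
# Base R only; data frames stand in for the data.tables used above.
data1 <- data.frame(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1', 'id2', 'id3', 'id4'),
                    stringsAsFactors = FALSE)
data2 <- data.frame(beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                    ending = c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13'),
                    class = c(1, 2, 3, 4),
                    stringsAsFactors = FALSE)

d <- as.Date(data1$date)
b <- as.Date(data2$beginning)
e <- as.Date(data2$ending)

# Assumption: ranges are sorted by their start date and do not overlap.
stopifnot(!is.unsorted(as.numeric(b)))

# Index of the last range whose beginning is <= each date (0 if there is none).
idx <- findInterval(as.numeric(d), as.numeric(b))
idx[idx == 0L] <- NA

# Drop candidates whose range ended before the date.
idx[!is.na(idx) & d > e[idx]] <- NA

# Left join: dates without a matching range get NA in the data2 columns.
result <- cbind(data1, data2[idx, ])
```

If the ranges can overlap, this lookup returns at most one match per date, so one of the join-based approaches is the safer choice.
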
Using the sqldf package:

library(sqldf)
result = sqldf("select * from data1
                left join data2
                on data1.date between data2.beginning and data2.ending")

Using data.table (with the date columns converted from character to Date class first) this is simply

data1[data2, on = .(date >= beginning, date <= ending)]
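A fuller sketch of the non-equi join may help here, because two details are easy to miss. First, as far as I know, data.table's non-equi joins do not accept character join columns, so the date strings are converted with `as.IDate` before joining. Second, the join columns in the result take their names from one table and their values from the other, so the sketch selects the output columns explicitly with the `x.` and `i.` prefixes. Putting `data1` in the `i` position to get a left join on `data1`, and the output column order, are my own choices rather than part of the original answer.

```r
library(data.table)

data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1', 'id2', 'id3', 'id4'))
data2 <- data.table(beginning = c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'),
                    ending = c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13'),
                    class = c(1, 2, 3, 4))

# Convert the character columns to data.table's integer-backed date class.
data1[, date := as.IDate(date)]
data2[, c("beginning", "ending") := lapply(.SD, as.IDate),
      .SDcols = c("beginning", "ending")]

# data2[data1, ...] keeps every row of data1 (a left join on data1).
# x.* refers to columns of data2, i.* to columns of data1.
result <- data2[data1,
                on = .(beginning <= date, ending >= date),
                .(date = i.date,
                  beginning = x.beginning,
                  ending = x.ending,
                  name = i.name,
                  class = x.class)]
```

Rows of `data1` whose date falls in no range are kept, with `NA` in `beginning`, `ending` and `class`.
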
