根据最近的时间戳在R中联接两个数据帧 [英] Join two data frames in R based on closest timestamp

查看:74
本文介绍了根据最近的时间戳在R中联接两个数据帧的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个表(下面的表1和表2),并希望根据最接近的时间戳将它们连接起来以形成Expected_output。如果可能的话,某种涉及dplyr的解决方案将是很好的选择,但如果它使事情进一步复杂化,则不会如此。

  table1 = 
structure(list(date = structure(c(1437051300,1434773700,1431457200
),class = c( POSIXct, POSIXt),tzone =),val1 = c(94L,
33L,53L)),.Names = c( date, val1),row.names = c(NA,-3L
),class = data.frame)

table2 =
structure(list(date = structure(c(1430248288,1435690482,1434050843
),class = c( POSIXct, POSIXt),tzone =),val2 = c(67L,
90L,18L)),.names = c( date, val2),row.names = c(NA,-3L
),class = data.frame )

Expected_output =
structure(list(date = structure(c(1437051300,1434773700,1431457200
),class = c( POSIXct, POSIXt), tzone =),val1 = c(94L,
33L,53L),val2 = c(90L,18L,67L)),.Names = c( date, val1,
val2),row.names = c(NA,-3L),class = data.frame)


解决方案

使用 data.table 的滚动联接功能和 roll = nearest

  require(data.table)#v1.9.6 + 
setDT(table1)[,val2:= setDT(table2)[table1 ,val2,on = date,roll = nearest]]

此处, val2 列是通过对 roll的 date 列执行 join 创建的=最近 选项。对于 table1 $ date 的每一行,都会计算与 table2 $ date 最接近的匹配行,而<$ c $提取对应行的c> val2 。


Hi I have two tables (table1 and table2 below) and would like to join them based on the closest timestamp to form expected_output. Some kind of solution involving dplyr would be great if possible, but not if it further complicates things.

table1 = 
structure(list(date = structure(c(1437051300, 1434773700, 1431457200
), class = c("POSIXct", "POSIXt"), tzone = ""), val1 = c(94L, 
33L, 53L)), .Names = c("date", "val1"), row.names = c(NA, -3L
), class = "data.frame")

table2 = 
structure(list(date = structure(c(1430248288, 1435690482, 1434050843
), class = c("POSIXct", "POSIXt"), tzone = ""), val2 = c(67L, 
90L, 18L)), .Names = c("date", "val2"), row.names = c(NA, -3L
), class = "data.frame")

expected_output = 
structure(list(date = structure(c(1437051300, 1434773700, 1431457200
), class = c("POSIXct", "POSIXt"), tzone = ""), val1 = c(94L,
33L, 53L), val2 = c(90L, 18L, 67L)), .Names = c("date", "val1", 
"val2"), row.names = c(NA, -3L), class = "data.frame")

解决方案

Using rolling joins feature of data.table with roll = "nearest":

require(data.table) # v1.9.6+
setDT(table1)[, val2 := setDT(table2)[table1, val2, on = "date", roll = "nearest"]]

Here, val2 column is created by performing a join on the column date with roll = "nearest" option. For each row of table1$date, the closest matching row from table2$date is computed, and val2 for corresponding row is extracted.

这篇关于根据最近的时间戳在R中联接两个数据帧的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆