Merging two pandas DataFrames on the nearest timestamp

Question

I have two DataFrames, df1 and df2.

df1 is:

time                  status
2/2/2015 8.00 am      on time
2/2/2015 9.00 am      canceled
2/2/2015 10.30 am     on time
2/2/2015 12.45 pm     on time

df2 is:

 w_time                 temp
 2/2/2015 8.00 am      45
 2/2/2015 8.50 am      46
 2/2/2015 9.40 am      47
 2/2/2015 10.15 am     47
 2/2/2015 10.35 am     48
 2/2/2015 12.00 pm     48
 2/2/2015 1.00 pm      49

Now I want to merge the two DataFrames so that each row of df1 is paired with the row of df2 whose timestamp (w_time) is closest to, or equal to, its own timestamp (time).

The result should be:

time                status     w_time              temp
2/2/2015 8.00 am    on time    2/2/2015 8.00 am    45
2/2/2015 9.00 am    canceled   2/2/2015 8.50 am    46
2/2/2015 10.30 am   on time    2/2/2015 10.35 am   48
2/2/2015 12.45 pm   on time    2/2/2015 1.00 pm    49

Recommended answer

First, ensure that the date columns are datetime64 columns. The sample times use a dot between hours and minutes, so replace it with a colon before parsing.

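import pandas as pd

# regex=False makes "." a literal dot rather than a regex wildcard
df1['time'] = pd.to_datetime(df1['time'].str.replace(".", ":", regex=False))
df2['w_time'] = pd.to_datetime(df2['w_time'].str.replace(".", ":", regex=False))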
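To reproduce the parsing step on its own, here is a minimal sketch that rebuilds the two sample frames from the question and converts the time columns. The explicit `format` string is an assumption added here for predictability (month/day order, 12-hour clock); the original answer relies on pandas inferring the format.

```python
import pandas as pd

# Sample data copied from the question, still as raw strings.
df1 = pd.DataFrame({
    "time": ["2/2/2015 8.00 am", "2/2/2015 9.00 am",
             "2/2/2015 10.30 am", "2/2/2015 12.45 pm"],
    "status": ["on time", "canceled", "on time", "on time"],
})
df2 = pd.DataFrame({
    "w_time": ["2/2/2015 8.00 am", "2/2/2015 8.50 am", "2/2/2015 9.40 am",
               "2/2/2015 10.15 am", "2/2/2015 10.35 am",
               "2/2/2015 12.00 pm", "2/2/2015 1.00 pm"],
    "temp": [45, 46, 47, 47, 48, 48, 49],
})

# Assumed format: month/day/year, 12-hour clock with an am/pm marker.
FMT = "%m/%d/%Y %I:%M %p"

# Replace the literal dot with a colon, then parse to datetime64.
df1["time"] = pd.to_datetime(
    df1["time"].str.replace(".", ":", regex=False), format=FMT
)
df2["w_time"] = pd.to_datetime(
    df2["w_time"].str.replace(".", ":", regex=False), format=FMT
)
```

Checking `df1.dtypes` and `df2.dtypes` afterwards is a quick way to confirm that both columns are datetime64 before moving on.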

If you set these columns as the DatetimeIndex of each frame, you can then use reindex with the 'nearest' method. Passing drop=False for df2 keeps w_time as a regular column so that it survives the reindex:

In [11]: df1 = df1.set_index("time")

In [12]: df2 = df2.set_index("w_time", drop=False)

In [13]: df1
Out[13]:
                       status
time
2015-02-02 08:00:00   on time
2015-02-02 09:00:00  canceled
2015-02-02 10:30:00   on time
2015-02-02 12:45:00   on time

In [14]: df2
Out[14]:
                     temp              w_time
w_time
2015-02-02 08:00:00    45 2015-02-02 08:00:00
2015-02-02 08:50:00    46 2015-02-02 08:50:00
2015-02-02 09:40:00    47 2015-02-02 09:40:00
2015-02-02 10:15:00    47 2015-02-02 10:15:00
2015-02-02 10:35:00    48 2015-02-02 10:35:00
2015-02-02 12:00:00    48 2015-02-02 12:00:00
2015-02-02 13:00:00    49 2015-02-02 13:00:00

Then reindex df2 on df1's index:

In [15]: df2.reindex(df1.index, method='nearest')
Out[15]:
                     temp              w_time
time
2015-02-02 08:00:00    45 2015-02-02 08:00:00
2015-02-02 09:00:00    46 2015-02-02 08:50:00
2015-02-02 10:30:00    48 2015-02-02 10:35:00
2015-02-02 12:45:00    49 2015-02-02 13:00:00

Then add these columns to df1, or join the result back onto df1.
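The whole approach, including the final join that the answer leaves to the reader, can be sketched as follows. The variable names (`left`, `right`, `nearest`, `merged`, `alt`) are illustrative. The last step uses `pd.merge_asof` with `direction="nearest"`, which is not part of the original answer; it is shown only as an alternative that appears to give the same pairing on this data. Both techniques assume the time keys are sorted.

```python
import pandas as pd

# Sample data from the question, already parsed to datetime64.
df1 = pd.DataFrame({
    "time": pd.to_datetime([
        "2015-02-02 08:00", "2015-02-02 09:00",
        "2015-02-02 10:30", "2015-02-02 12:45",
    ]),
    "status": ["on time", "canceled", "on time", "on time"],
})
df2 = pd.DataFrame({
    "w_time": pd.to_datetime([
        "2015-02-02 08:00", "2015-02-02 08:50", "2015-02-02 09:40",
        "2015-02-02 10:15", "2015-02-02 10:35", "2015-02-02 12:00",
        "2015-02-02 13:00",
    ]),
    "temp": [45, 46, 47, 47, 48, 48, 49],
})

# Approach from the answer: index both frames by time, keep w_time as a
# column, and pick the nearest df2 row for every df1 timestamp.
left = df1.set_index("time")
right = df2.set_index("w_time", drop=False).sort_index()
nearest = right.reindex(left.index, method="nearest")

# Join the matched df2 columns back onto df1 (the indexes now align).
merged = left.join(nearest).reset_index()

# Alternative (not from the answer): merge_asof matches on the nearest key
# without touching the index. Both inputs must be sorted by their keys.
alt = pd.merge_asof(
    df1.sort_values("time"),
    df2.sort_values("w_time"),
    left_on="time",
    right_on="w_time",
    direction="nearest",
)

print(merged)
```

If matches that are too far apart should be rejected, both `reindex` and `merge_asof` accept a `tolerance` argument (for example `pd.Timedelta("30min")`); unmatched rows then come back as NaN/NaT.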
