Calculate the time difference between two consecutive rows in pandas

Problem description

I have a pandas DataFrame as follows:

Dev_id     Time
88345      13:40:31
87556      13:20:33
88955      13:05:00
.....      ........
85678      12:15:28

The DataFrame above has 83,000 rows. I want to compute the time difference between each pair of consecutive rows and store it in a separate column. The desired result would be:

Dev_id    Time          Time_diff(in min)
88345      13:40:31      20
87556      13:20:33      15
88955      13:05:00      15

I tried df['Time_diff'] = df['Time'].diff(-1), but it raises the following error:

TypeError: unsupported operand type(s) for -: 'datetime.time' and 'datetime.time'

How can I solve this?

Recommended answer

The problem is that diff needs datetimes or timedeltas, whereas the Time column holds datetime.time objects, which do not support subtraction. Convert the column with to_timedelta first, then take total_seconds and divide by 60 to get minutes:

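# Convert the datetime.time values to strings, then to timedeltas;
# diff(-1) subtracts the next row from the current row
df['Time_diff'] = pd.to_timedelta(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)
# Alternative: parse as datetimes instead of timedeltas
#df['Time_diff'] = pd.to_datetime(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)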
print (df)
   Dev_id      Time  Time_diff
0   88345  13:40:31  19.966667
1   87556  13:20:33  15.550000
2   88955  13:05:00  49.533333
3   85678  12:15:28        NaN
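
For readers who want to try this end to end, here is a minimal self-contained sketch. It assumes the Time column holds datetime.time objects (as the TypeError in the question suggests) and rebuilds only the four rows shown in the question; the variable name as_timedelta is just for illustration.

```python
import datetime as dt

import pandas as pd

# Sample rows copied from the question; Time holds datetime.time objects.
df = pd.DataFrame({
    "Dev_id": [88345, 87556, 88955, 85678],
    "Time": [dt.time(13, 40, 31), dt.time(13, 20, 33),
             dt.time(13, 5, 0), dt.time(12, 15, 28)],
})

# datetime.time does not support subtraction, so go through strings
# ("13:40:31") to a timedelta64 Series first.
as_timedelta = pd.to_timedelta(df["Time"].astype(str))

# diff(-1) computes row[i] - row[i + 1]; the last row has no successor,
# so its difference is missing (NaN after total_seconds()).
df["Time_diff"] = as_timedelta.diff(-1).dt.total_seconds().div(60)
print(df)
```

Because the rows are sorted from latest to earliest, diff(-1) yields positive values; with ascending data, diff() without an argument would probably be the more natural choice.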

If you need the result floored or rounded to whole minutes:

df['Time_diff'] = (pd.to_timedelta(df['Time'].astype(str))
                     .diff(-1)
                     .dt.floor('min')  # 'min' instead of 'T', which is deprecated in recent pandas
                     .dt.total_seconds()
                     .div(60))
print (df)
   Dev_id      Time  Time_diff
0   88345  13:40:31       19.0
1   87556  13:20:33       15.0
2   88955  13:05:00       49.0
3   85678  12:15:28        NaN
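
The answer mentions rounding but only shows floor. As a rough sketch of the difference, assuming the same four times given as plain strings, the two can be compared side by side; the names delta, floored and rounded are illustrative only.

```python
import pandas as pd

# The four times from the question, already as strings.
times = pd.Series(["13:40:31", "13:20:33", "13:05:00", "12:15:28"])
delta = pd.to_timedelta(times).diff(-1)

# floor drops the leftover seconds; round goes to the nearest minute.
floored = delta.dt.floor("min").dt.total_seconds().div(60)
rounded = delta.dt.round("min").dt.total_seconds().div(60)

print(pd.DataFrame({"floored": floored, "rounded": rounded}))
```

For this data the first difference is 19 min 58 s, so floor gives 19 while round gives 20; which one is appropriate depends on how partial minutes should be counted.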
