Calculate the time difference between two consecutive rows in pandas
Question
I have a pandas DataFrame as follows:
Dev_id Time
88345 13:40:31
87556 13:20:33
88955 13:05:00
..... ........
85678 12:15:28
The DataFrame above has 83,000 rows. I want to compute the time difference between each pair of consecutive rows and store it in a separate column. The desired result would be:
Dev_id Time Time_diff(in min)
88345 13:40:31 20
87556 13:20:33 15
88955 13:05:00 15
I tried df['Time_diff'] = df['Time'].diff(-1), but I get the following error:
TypeError: unsupported operand type(s) for -: 'datetime.time' and 'datetime.time'
How can I solve this?
Recommended answer
The problem is that pandas needs datetimes or timedeltas for the diff function, and the Time column holds datetime.time objects, which do not support subtraction. So first convert the column with to_timedelta, then take total_seconds and divide by 60:
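# convert the times to timedeltas, subtract the next row, express the result in minutes
df['Time_diff'] = pd.to_timedelta(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)
# alternative: parse to datetimes instead of timedeltas
#df['Time_diff'] = pd.to_datetime(df['Time'].astype(str)).diff(-1).dt.total_seconds().div(60)
print(df)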
Dev_id Time Time_diff
0 88345 13:40:31 19.966667
1 87556 13:20:33 15.550000
2 88955 13:05:00 49.533333
3 85678 12:15:28 NaN
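For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the Time column holds datetime.time objects (as the error message suggests) and uses only the four rows visible in the question, treated as consecutive the way the output above does. The intermediate name as_timedelta is just for illustration.

```python
from datetime import time

import pandas as pd

# Sample rows copied from the question; Time holds datetime.time objects,
# which is the type named in the TypeError.
df = pd.DataFrame({
    "Dev_id": [88345, 87556, 88955, 85678],
    "Time": [time(13, 40, 31), time(13, 20, 33),
             time(13, 5, 0), time(12, 15, 28)],
})

# str(datetime.time) yields "HH:MM:SS", a format pd.to_timedelta can parse.
as_timedelta = pd.to_timedelta(df["Time"].astype(str))

# diff(-1) subtracts the following row from the current row,
# so the last row has nothing to compare against and becomes NaN.
df["Time_diff"] = as_timedelta.diff(-1).dt.total_seconds().div(60)
print(df)
```

Because the times are in descending order, diff(-1) gives positive values; with ascending times, a plain diff() would be the natural choice.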
If you want to floor or round to whole minutes:
# 'min' is the minute frequency alias; the older 'T' alias is deprecated in recent pandas
df['Time_diff'] = (pd.to_timedelta(df['Time'].astype(str))
                     .diff(-1)
                     .dt.floor('min')
                     .dt.total_seconds()
                     .div(60))
print(df)
Dev_id Time Time_diff
0 88345 13:40:31 19.0
1 87556 13:20:33 15.0
2 88955 13:05:00 49.0
3 85678 12:15:28 NaN
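The answer shows only the floor variant. A rounding version might look like the sketch below, which assumes the same four sample times given as strings; the names times, diffs and rounded_minutes are illustrative only.

```python
import pandas as pd

# The four sample times from the question, already as "HH:MM:SS" strings.
times = pd.Series(["13:40:31", "13:20:33", "13:05:00", "12:15:28"])

# Difference between each row and the following row, as timedeltas.
diffs = pd.to_timedelta(times).diff(-1)

# Round each difference to the nearest whole minute, then express it in minutes.
rounded_minutes = diffs.dt.round("min").dt.total_seconds().div(60)
print(rounded_minutes)
```

With this data the rounded values are 20.0, 16.0 and 50.0 minutes, which differ from the floored values above because each difference has more than 30 seconds left over.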