Calculate difference between 'times' rows in DataFrame Pandas
Problem description
My DataFrame has the form:
TimeWeek TimeSat TimeHoli
0 6:40:00 8:00:00 8:00:00
1 6:45:00 8:05:00 8:05:00
2 6:50:00 8:09:00 8:10:00
3 6:55:00 8:11:00 8:14:00
4 6:58:00 8:13:00 8:17:00
5 7:40:00 8:15:00 8:21:00
I need to find the time difference between consecutive rows in TimeWeek, TimeSat and TimeHoli. The output must be:
TimeWeekDiff TimeSatDiff TimeHoliDiff
00:05:00 00:05:00 00:05:00
00:05:00 00:04:00 00:05:00
00:05:00 00:02:00 00:04:00
00:03:00 00:02:00 00:03:00
00:02:00 00:02:00 00:04:00
I tried using (df['TimeWeek'] - df['TimeWeek'].shift()).fillna(0)
which throws the error:
TypeError: unsupported operand type(s) for -: 'str' and 'str'
Probably because of the presence of ':' in the column. How do I resolve this?
Recommended answer
The error is thrown because the columns hold strings rather than timestamps, and strings do not support subtraction. First convert them to timestamps:
df2 = df.apply(lambda x: [pd.Timestamp(ts) for ts in x])
They will contain today's date by default, but the date cancels out once you take the difference (as long as you don't have to difference times such as 23:55 and 00:05 across a date boundary).
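If the times can cross midnight, the plain difference comes out negative for that row. The following is a minimal sketch of one way to handle it, not part of the original answer: it parses the strings as durations with `pd.to_timedelta` (so no date is attached) and adds one day back to any negative difference. The sample data is hypothetical, and the sketch assumes consecutive rows are always less than 24 hours apart.

```python
import pandas as pd

# Hypothetical sample that crosses midnight (not the question's data).
times = pd.Series(["23:50:00", "23:55:00", "0:05:00", "0:10:00"])

# Parse as durations since midnight, so no calendar date is attached.
durations = pd.to_timedelta(times)

# Raw row-to-row difference; the step across midnight is negative.
raw = durations.diff()

# Assumption: consecutive rows are less than 24 hours apart, so a
# negative difference means the clock wrapped; add one day back.
wrapped = raw.mask(raw < pd.Timedelta(0), raw + pd.Timedelta(days=1))

print(wrapped)
```

The first entry stays NaT because it has no previous row to compare against.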
Once converted, simply difference the DataFrame:
>>> df2 - df2.shift()
TimeWeek TimeSat TimeHoli
0 NaT NaT NaT
1 00:05:00 00:05:00 00:05:00
2 00:05:00 00:04:00 00:05:00
3 00:05:00 00:02:00 00:04:00
4 00:03:00 00:02:00 00:03:00
5 00:42:00 00:02:00 00:04:00
Depending on your needs, you can just take rows 1+ (ignoring the NaTs):
(df2 - df2.shift()).iloc[1:, :]
or you can fill the NaTs with zeros:
(df2 - df2.shift()).fillna(pd.Timedelta(0))
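For reference, the steps above can be put together as a self-contained sketch using the data from the question. A few details here are my own choices rather than the answer's: `Series.map(pd.Timestamp)` stands in for the list comprehension, `pd.Timedelta(0)` is used as the fill value so the columns keep a timedelta dtype, and `add_suffix("Diff")` renames the columns to match the names the question asked for.

```python
import pandas as pd

# The question's data, stored as strings (the cause of the TypeError).
df = pd.DataFrame({
    "TimeWeek": ["6:40:00", "6:45:00", "6:50:00", "6:55:00", "6:58:00", "7:40:00"],
    "TimeSat":  ["8:00:00", "8:05:00", "8:09:00", "8:11:00", "8:13:00", "8:15:00"],
    "TimeHoli": ["8:00:00", "8:05:00", "8:10:00", "8:14:00", "8:17:00", "8:21:00"],
})

# Convert every cell to a Timestamp (today's date is attached by default).
df2 = df.apply(lambda col: col.map(pd.Timestamp))

# Row-to-row differences; the first row has no predecessor, so it is NaT.
diffs = df2 - df2.shift()

# Option 1: drop the first row and rename the columns.
dropped = diffs.iloc[1:, :].add_suffix("Diff")

# Option 2: keep all rows and replace NaT with a zero-length Timedelta.
filled = diffs.fillna(pd.Timedelta(0))

print(dropped)
```

Recent pandas versions print each value with a leading day count, so the display may differ slightly from the output shown earlier.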