How to calculate time between events in pandas


Problem description

I'm stuck on the following problem. I'm trying to figure out at which moments, and for how long, a vehicle is at the factory. I have an Excel sheet in which all events are stored; each event is either a delivery route or a maintenance event. The ultimate goal is a dataframe that gives, for each vehicle registration number, the corresponding arrival time at the factory and the time spent there (including maintenance actions). For those interested, this is because I ultimately want to be able to schedule non-critical maintenance actions on the vehicles.

An example of my dataframe:

  Registration RoutID       Date Dep Loc Arr Loc Dep Time Arr Time  Days
0         XC66    A58  20/May/17    Home   Loc A    10:54    21:56     0
1         XC66    A59  21/May/17   Loc A    Home    00:12    10:36     0
2         XC66   A345  21/May/17   Home    Loc B    12:41    19:16     0
3         XC66   A346  21/May/17   Loc B   Loc C    20:50    03:49     1
4         XC66   A347  22/May/17   Loc C    Home    06:10    07:40     0
5         XC66    #M1  22/May/17    Home    Home    10:51    13:00     0

I have created a script that processes the dates and times to create correct datetime columns for the arrivals and departures. For the maintenance periods ("Dep Loc" = Home and "Arr Loc" = Home), the following code singles out the relevant rows:

df_home = df[df["Dep Loc"].isin(["Home"])]
df_home = df_home[df_home["Arr Loc"].isin(["Home"])]

From here I can easily subtract the datetimes to create the duration column.

So far so good. However, I'm stuck on calculating the other times, that is, the time between arriving at Home and the next departure from Home. Because there might be intermediate stops, the .shift() function does not work: the number of rows to shift by is not constant.

I have searched on this matter, but I could only find shift-based solutions or answers that deal with the time within an event, not the time between events.

Any guidance in the right direction would be greatly appreciated!

Thanks

I have been stuck on this question for a while now, but shortly after posting it I tried this solution:

for idx, loc in enumerate(df["Arr Loc"]):
    if loc == "Home":
        a = ((idx2, obj) for idx2, obj in enumerate(df["Dep Loc"]) if (obj == "Home" and idx2 > idx))
        idx_next = next(a)
        idx_next = idx_next[0]

        Arrival_times = df["Arr Time"]
        Departure_times = df["Dep Time"]

        Duration = Arrival_times[idx] - Departure_times[idx_next]

Here I used the next function to find the next occurrence of Home as the starting location (i.e. the time the vehicle leaves the base). Subsequently I subtract the two datetimes to find the time difference.

It works for the small data set, but still not for the entire dataset.

Recommended answer

After filtering the relevant rows, convert "Arr Time" and "Dep Time" to timestamps using the "Date" and "Days" columns:

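import pandas as pd

# Keep only the rows that both depart from and arrive at Home (maintenance events).
# .copy() avoids a SettingWithCopyWarning on the assignments below.
df_home = df[(df["Dep Loc"] == "Home") & (df["Arr Loc"] == "Home")].copy()

# Combine the date with each time of day and parse it, e.g. "22/May/17 10:51".
fmt = "%d/%b/%y %H:%M"
df_home["Dep Time"] = pd.to_datetime(df_home["Date"] + " " + df_home["Dep Time"], format=fmt)
df_home["Arr Time"] = pd.to_datetime(df_home["Date"] + " " + df_home["Arr Time"], format=fmt)
df_home["Date"] = pd.to_datetime(df_home["Date"], format="%d/%b/%y")

# In the sample data, Days = 1 marks an arrival on the day after departure,
# so the offset is added to the arrival timestamp.
df_home["Arr Time"] = df_home["Arr Time"] + pd.to_timedelta(df_home["Days"], unit="D")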

Finally, the difference between "Arr Time" and "Dep Time" gives the duration in minutes:

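# Arrival minus departure keeps the duration positive; total_seconds() / 60 converts it to minutes.
df_home["diff_duration"] = (df_home["Arr Time"] - df_home["Dep Time"]).dt.total_seconds() / 60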
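
The steps above cover the maintenance rows only. For the time a vehicle spends at Home between arriving from a route and leaving on the next one, one possible approach is to back-fill the next "leaving Home" departure within each vehicle instead of shifting by a fixed number of rows. The following is a minimal sketch under assumptions, not a definitive implementation: it rebuilds the sample rows from the question, treats "Days" as a day offset on the arrival, and introduces the hypothetical column names dep_dt, arr_dt, arrival, next_departure and minutes_at_home for illustration.

```python
import pandas as pd

# Sample rows copied from the question's table (RoutID omitted for brevity).
df = pd.DataFrame({
    "Registration": ["XC66"] * 6,
    "Date": ["20/May/17", "21/May/17", "21/May/17",
             "21/May/17", "22/May/17", "22/May/17"],
    "Dep Loc": ["Home", "Loc A", "Home", "Loc B", "Loc C", "Home"],
    "Arr Loc": ["Loc A", "Home", "Loc B", "Loc C", "Home", "Home"],
    "Dep Time": ["10:54", "00:12", "12:41", "20:50", "06:10", "10:51"],
    "Arr Time": ["21:56", "10:36", "19:16", "03:49", "07:40", "13:00"],
    "Days": [0, 0, 0, 1, 0, 0],
})

# Build full timestamps (hypothetical helper columns dep_dt / arr_dt).
# Assumption: "Days" is the number of day rollovers before arrival.
fmt = "%d/%b/%y %H:%M"
df["dep_dt"] = pd.to_datetime(df["Date"] + " " + df["Dep Time"], format=fmt)
df["arr_dt"] = (pd.to_datetime(df["Date"] + " " + df["Arr Time"], format=fmt)
                + pd.to_timedelta(df["Days"], unit="D"))

# Events must be in chronological order within each vehicle.
df = df.sort_values(["Registration", "dep_dt"]).reset_index(drop=True)

# A stay starts when a vehicle arrives at Home from elsewhere and ends
# when it next departs from Home to somewhere else. Home-to-Home
# maintenance rows in between are therefore part of the stay.
arrives_home = (df["Arr Loc"] == "Home") & (df["Dep Loc"] != "Home")
leaves_home = (df["Dep Loc"] == "Home") & (df["Arr Loc"] != "Home")

# Keep the departure time only on "leaving Home" rows, then back-fill per
# vehicle so each row sees the next such departure, however many rows away.
next_dep = df["dep_dt"].where(leaves_home).groupby(df["Registration"]).bfill()

stays = (df.loc[arrives_home, ["Registration", "arr_dt"]]
           .rename(columns={"arr_dt": "arrival"}))
stays["next_departure"] = next_dep[arrives_home]
stays["minutes_at_home"] = (
    (stays["next_departure"] - stays["arrival"]).dt.total_seconds() / 60
)

print(stays)
```

An arrival with no later departure in the data (the last arrival in the sample) gets a missing next_departure rather than raising an error, which is one plausible reason a next()-based loop fails on a full dataset.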
