Resampling a time series of events + duration into concurrent events


Problem description

I have two columns: the time an event started and the duration of that event. Like so:

time, duration
1:22:51,41
1:56:29,36
2:02:06,12
2:32:37,38
2:34:51,24
3:24:07,31
3:28:47,59
3:31:19,32
3:42:52,37
3:57:04,58
4:21:55,23
4:40:28,17
4:52:39,51
4:54:48,26
5:17:06,46
6:08:12,1
6:21:34,12
6:22:48,24
7:04:22,1
7:06:28,46
7:19:12,51
7:19:19,4
7:22:27,27
7:32:25,53

I want to create a line chart that shows the number of concurrent events happening throughout the day. Renaming time to start_time and adding a new column that computes end_time is easy enough (assuming that's the next step). What I'm not sure about is how to resample the data afterwards so that I can chart the concurrent events.

I imagine I want to end up with something like this (but bucketed by the minute):

time, events
1:30:00,1
2:00:00,2
2:30:00,1
3:00:00,1
3:30:00,2

Recommended answer

First, make the time column an actual timestamp:

df['time'] = pd.to_datetime('2014-03-14 ' + df['time'])
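The line above assumes `df` already holds the two columns, with `time` stored as strings. A minimal sketch of getting to that point, assuming the data arrives as CSV text with the header shown in the question; the `raw` string (first three rows only) and the `skipinitialspace` option are illustrative choices, not part of the original answer:

```python
import io
import pandas as pd

# First few rows of the question's data; note the space after the comma in the header.
raw = """time, duration
1:22:51,41
1:56:29,36
2:02:06,12
"""

# skipinitialspace=True drops that space, so the column is named 'duration'.
df = pd.read_csv(io.StringIO(raw), skipinitialspace=True)

# Prefix an arbitrary date (the one used in the answer) so the strings parse as timestamps.
df['time'] = pd.to_datetime('2014-03-14 ' + df['time'])
print(df.dtypes)
```

The date itself carries no meaning here; it only anchors the times of day to a single calendar day so they can be used as a datetime index.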

Now you can get the end times:

df['end_time'] = df['time'] + pd.to_timedelta(df['duration'], unit='min')  # duration is taken to be in minutes

One way to get the open events is to combine the start and end times, resample, and take the cumulative sum:

In [11]: open_events = pd.concat([pd.Series(1, df.time),      # event opened: add 1
                                  pd.Series(-1, df.end_time)  # event closed: subtract 1
                                  ]).resample('30min').sum().cumsum()

In [12]: open_events
Out[12]:
2014-03-14 01:00:00    1
2014-03-14 01:30:00    2
2014-03-14 02:00:00    1
2014-03-14 02:30:00    1
2014-03-14 03:00:00    2
2014-03-14 03:30:00    4
2014-03-14 04:00:00    2
2014-03-14 04:30:00    2
2014-03-14 05:00:00    2
2014-03-14 05:30:00    1
2014-03-14 06:00:00    2
2014-03-14 06:30:00    0
2014-03-14 07:00:00    3
2014-03-14 07:30:00    2
2014-03-14 08:00:00    0
Freq: 30T, dtype: int64
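
The question asked for buckets by the minute rather than by the half hour. A self-contained sketch of the same technique at one-minute resolution, assuming durations are in minutes (as the answer does) and using only the first five rows of the sample data; the names `events` and `concurrent` are illustrative:

```python
import pandas as pd

# First five rows of the question's data.
df = pd.DataFrame({
    'time': ['1:22:51', '1:56:29', '2:02:06', '2:32:37', '2:34:51'],
    'duration': [41, 36, 12, 38, 24],
})
df['time'] = pd.to_datetime('2014-03-14 ' + df['time'])
df['end_time'] = df['time'] + pd.to_timedelta(df['duration'], unit='min')

# +1 at every start and -1 at every end, on one shared, sorted time index.
events = pd.concat([
    pd.Series(1, index=df['time']),
    pd.Series(-1, index=df['end_time']),
]).sort_index()

# Net change per minute, then a running total: the number of events
# still open at the end of each one-minute bucket.
concurrent = events.resample('1min').sum().cumsum()

print(concurrent.max())
```

Calling `concurrent.plot()` (which needs matplotlib installed) should then draw the line chart. Each value is the count at the end of its bucket, so an event that starts and ends inside the same bucket contributes nothing to that bucket's total.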
