Calculating the duration of an event in a time series (Python)

Views: 366. Published: 2020/10/16 23:19:17. Tags: python pandas dataframe time-series


Question

I have a dataframe as shown below:

index                value
2003-01-01 00:00:00  14.5
2003-01-01 01:00:00  15.8
2003-01-01 02:00:00     0
2003-01-01 03:00:00     0
2003-01-01 04:00:00  13.6
2003-01-01 05:00:00   4.3
2003-01-01 06:00:00  13.7
2003-01-01 07:00:00  14.4
2003-01-01 08:00:00     0
2003-01-01 09:00:00     0
2003-01-01 10:00:00     0
2003-01-01 11:00:00  17.2
2003-01-01 12:00:00     0
2003-01-01 13:00:00   5.3
2003-01-01 14:00:00     0
2003-01-01 15:00:00   2.0
2003-01-01 16:00:00   4.0
2003-01-01 17:00:00     0
2003-01-01 18:00:00     0
2003-01-01 19:00:00   3.9
2003-01-01 20:00:00   7.2
2003-01-01 21:00:00   1.0
2003-01-01 22:00:00   1.0
2003-01-01 23:00:00  10.0

The index is a datetime and the column records the rainfall (in mm) for each hour. I would like to calculate the "average wet spell duration", that is, the average number of consecutive hours with a non-zero value in a day. For the day above the calculation is:

(2 + 4 + 1 + 1 + 2 + 5) / 6 (events) = 2.5 (hr)
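The six spell lengths in this calculation can be reproduced by run-length encoding the wet/dry flags. Here is a minimal sketch using only the standard library, with the sample day's values typed in by hand. The names `values` and `spell_lengths` are illustrative and do not come from the original post.

```python
from itertools import groupby

# hourly rainfall (mm) for 2003-01-01, copied from the table above
values = [14.5, 15.8, 0, 0, 13.6, 4.3, 13.7, 14.4, 0, 0, 0, 17.2,
          0, 5.3, 0, 2.0, 4.0, 0, 0, 3.9, 7.2, 1.0, 1.0, 10.0]

# group consecutive hours by their wet/dry flag and keep only the wet runs
spell_lengths = [
    len(list(run))
    for is_wet, run in groupby(values, key=lambda v: v > 0)
    if is_wet
]

average_duration = sum(spell_lengths) / len(spell_lengths)
print(spell_lengths)     # [2, 4, 1, 1, 2, 5]
print(average_duration)  # 2.5
```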

and the "average wet spell amount", that is, the average of the summed values over each run of consecutive wet hours in a day:

{ (14.5 + 15.8) + (13.6 + 4.3 + 13.7 + 14.4) + (17.2) + (5.3) + (2 + 4) + (3.9 + 7.2 + 1 + 1 + 10) } / 6 (events) = 21.32 (mm)
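The per-spell sums in this calculation can be checked with pandas by giving each run of consecutive wet hours its own block id and summing within each block. This is only a sketch on the sample day. The variable names (`rain`, `wet`, `block`, `spell_amounts`) are illustrative, and "wet" is assumed to mean a value greater than zero.

```python
import pandas as pd

# hourly rainfall (mm) for 2003-01-01, copied from the table above
values = [14.5, 15.8, 0, 0, 13.6, 4.3, 13.7, 14.4, 0, 0, 0, 17.2,
          0, 5.3, 0, 2.0, 4.0, 0, 0, 3.9, 7.2, 1.0, 1.0, 10.0]
rain = pd.Series(values)

wet = rain > 0
# the block id increases by one every time the wet/dry flag changes
block = (wet != wet.shift()).cumsum()

# total rainfall of each wet spell, then the mean over the spells
spell_amounts = rain[wet].groupby(block[wet]).sum()
average_amount = spell_amounts.mean()

print(spell_amounts.round(1).tolist())  # [30.3, 46.0, 17.2, 5.3, 6.0, 23.1]
print(round(average_amount, 2))         # 21.32
```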

The dataframe above is just an example; my actual data covers a much longer time series (more than one year, for example). How can I write a function that calculates the two values above in a better way? Thanks in advance!

P.S. The values may be NaN, and I would like to simply ignore them.

Recommended answer

I believe this is what you are looking for. I have added explanations to the code for each step.

# assumes the timestamps are in a datetime column named 'index';
# if they are the DataFrame index, run df = df.reset_index() first

# wet hours: non-zero and not NaN (NaN would otherwise be cast to True)
wet = df['value'].fillna(0).astype(bool)

# create helper columns defining contiguous blocks and day
df['block'] = (wet.shift() != wet).cumsum()
df['day'] = df['index'].dt.normalize()

# group by day to get unique block count and value count
session_map = df[wet].groupby('day')['block'].nunique()
hour_map = df[wet].groupby('day')['value'].count()

# map to original dataframe
df['sessions'] = df['day'].map(session_map)
df['hours'] = df['day'].map(hour_map)

# calculate result
res = df.groupby(['day', 'hours', 'sessions'], as_index=False)['value'].sum()
res['duration'] = res['hours'] / res['sessions']
res['amount'] = res['value'] / res['sessions']

Result

         day  sessions  duration  value     amount
0 2003-01-01         6       2.5  127.9  21.316667
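Since the question asks for a function and describes the timestamps as the index, here is one possible way to wrap the same block-id idea. It is a sketch under stated assumptions, not the answerer's code. The name `wet_spell_stats` is hypothetical, the input is assumed to be a Series with an hourly DatetimeIndex, NaN hours are treated as dry (so they end a spell), and a spell that crosses midnight is split at the day boundary.

```python
import pandas as pd


def wet_spell_stats(rain: pd.Series) -> pd.DataFrame:
    """Per-day spell count, mean spell duration (h) and mean spell amount (mm)."""
    wet = rain.fillna(0) > 0  # assumption: NaN counts as a dry hour
    day = pd.Series(rain.index.normalize(), index=rain.index)

    # a new block starts whenever the wet flag or the calendar day changes
    block = ((wet != wet.shift()) | (day != day.shift())).cumsum()

    # one row per wet spell: its length in hours and its total rainfall
    spells = (
        pd.DataFrame({"day": day[wet], "block": block[wet], "value": rain[wet]})
        .groupby(["day", "block"])["value"]
        .agg(hours="count", amount="sum")
    )

    # average the spells within each day
    return spells.groupby(level="day").agg(
        sessions=("hours", "size"),
        duration=("hours", "mean"),
        amount=("amount", "mean"),
    )


# usage with the sample day from the question
values = [14.5, 15.8, 0, 0, 13.6, 4.3, 13.7, 14.4, 0, 0, 0, 17.2,
          0, 5.3, 0, 2.0, 4.0, 0, 0, 3.9, 7.2, 1.0, 1.0, 10.0]
idx = pd.date_range("2003-01-01 00:00", "2003-01-01 23:00", periods=24)
stats = wet_spell_stats(pd.Series(values, index=idx))
print(stats)  # one row for 2003-01-01: sessions 6, duration 2.5, amount about 21.32
```

Splitting spells at midnight keeps every day's statistics self-contained. If a spell that runs past midnight should count as a single event, the `day != day.shift()` term would have to be dropped and each spell assigned to one day by some other rule.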
