Pandas group by time with a specified start time at non-integer minutes

Question

I have a dataframe with one-hour-long signals. I want to group them into 10-minute buckets. The problem is that the start time is not an exact multiple of 10 minutes, so instead of 6 groups I get 7, with the first and the last incomplete.

The issue can be easily reproduced with:

import pandas as pd
import numpy as np
import datetime as dt

rng = pd.date_range('1/1/2011 00:05:30', periods=3600, freq='1S')
ts = pd.DataFrame({'a':np.random.randn(len(rng)),'b':np.random.randn(len(rng))}, index=rng)

interval = dt.timedelta(minutes=10)

ts.groupby(pd.Grouper(freq=interval)).apply(len)

2011-01-01 00:00:00    270
2011-01-01 00:10:00    600
2011-01-01 00:20:00    600
2011-01-01 00:30:00    600
2011-01-01 00:40:00    600
2011-01-01 00:50:00    600
2011-01-01 01:00:00    330
Freq: 10T, dtype: int64

I tried to solve it as described here, but base only takes an integer number of minutes. For the example above (starting 30 seconds after 00:05), the code below still doesn't work:

ts.groupby(pd.Grouper(freq=interval, base=ts.index[0].minute)).apply(len)

How can I set a generic start time for the Grouper? My expected output here would be:

2011-01-01 00:05:30    600
2011-01-01 00:15:30    600
2011-01-01 00:25:30    600
2011-01-01 00:35:30    600
2011-01-01 00:45:30    600
2011-01-01 00:55:30    600

Recommended answer

base accepts a float argument, so in addition to the minutes you must also account for the seconds, expressed as a fraction of a minute:

base = ts.index[0].minute + ts.index[0].second/60
ts.groupby(pd.Grouper(freq=interval, base=base)).size()

2011-01-01 00:05:30    600
2011-01-01 00:15:30    600
2011-01-01 00:25:30    600
2011-01-01 00:35:30    600
2011-01-01 00:45:30    600
2011-01-01 00:55:30    600
Freq: 10T, dtype: int64
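Note that the base argument was deprecated in pandas 1.1 and removed in pandas 2.0, so the snippet above only runs on older versions. On newer versions the same bucketing can be obtained with the origin or offset arguments of pd.Grouper. The following is a minimal sketch, assuming pandas 1.1 or later; the variable names counts and counts_offset are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the one-hour, one-sample-per-second frame from the question.
rng = pd.date_range("2011-01-01 00:05:30", periods=3600,
                    freq=pd.Timedelta(seconds=1))
ts = pd.DataFrame({"a": np.random.randn(len(rng)),
                   "b": np.random.randn(len(rng))}, index=rng)

# Option 1: anchor the bins at the first timestamp of the index.
counts = ts.groupby(pd.Grouper(freq="10min", origin="start")).size()

# Option 2: keep the default midnight origin and shift it by the
# minutes and seconds of the first timestamp.
first = ts.index[0]
shift = pd.Timedelta(minutes=first.minute, seconds=first.second)
counts_offset = ts.groupby(pd.Grouper(freq="10min", offset=shift)).size()

print(counts)  # six buckets of 600 rows, the first labelled 00:05:30
```

origin="start" is the shorter option when the buckets should simply begin at the first sample. offset is useful when the shift is known in advance or must be shared across several frames.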
