pandas groupby rolling uneven time
Problem description
I am having some trouble with pandas rolling. Here is a simplified version of my dataset:
import pandas as pd

# Six events in three groups, sorted by group label
df2 = pd.DataFrame({
    'A': pd.Categorical(["test", "train", "test", "train", "train", "hello"]),
    'B': (pd.Timestamp('2013-01-02 00:00:05'),
          pd.Timestamp('2013-01-02 00:00:10'),
          pd.Timestamp('2013-01-02 00:00:09'),
          pd.Timestamp('2013-01-02 00:01:05'),
          pd.Timestamp('2013-01-02 00:01:25'),
          pd.Timestamp('2013-01-02 00:02:05')),
    'C': 1.,
}).sort_values('A').reset_index(drop=True)
>>> df2
A B C
0 hello 2013-01-02 00:02:05 1.0
1 test 2013-01-02 00:00:05 1.0
2 test 2013-01-02 00:00:09 1.0
3 train 2013-01-02 00:00:10 1.0
4 train 2013-01-02 00:01:05 1.0
5 train 2013-01-02 00:01:25 1.0
I would like to use a rolling window of 10 s to get the following output:
A count
0 hello 1
1 test 2
3 train 1
I tried groupby and rolling:
df2.groupby('A').rolling('10s', on='B', closed='right').C.sum()
This gives me, for every row, a rolling window over the preceding 10 s of observations, which is not what I am looking for:
A B
hello 2013-01-02 00:02:05 1.0
test 2013-01-02 00:00:05 1.0
2013-01-02 00:00:09 2.0
train 2013-01-02 00:00:10 1.0
2013-01-02 00:01:05 1.0
2013-01-02 00:01:25 1.0
I also tried resampling, but I was not able to get the result:
grouped = df2.set_index('B').groupby('A').resample('S')['C'].count()
grouped.reset_index().groupby('A').rolling(window=10, on='B', min_periods=1).sum()
Recommended answer
I think you should try this: compute the rolling sum within each group, then take the maximum per group.
df2.groupby('A').rolling('11s', on='B').agg({'C': 'sum'}).groupby('A').max()
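The same idea can be sketched step by step, which may be easier to check across pandas versions. This is only an illustration under a few assumptions: column A is built from plain strings rather than a Categorical, and the helper name max_rows_in_window is made up for this example. A time-based window is closed on the right by default, so '10s' covers (t - 10s, t] and leaves out a row that is exactly 10 s older; that is presumably why the answer uses '11s'. Passing closed='both' with '10s' would be another way to include that edge.

```python
import pandas as pd

# Same events as in the question; plain strings for A (an assumption made
# here to keep the sketch simple).
df2 = pd.DataFrame({
    "A": ["test", "train", "test", "train", "train", "hello"],
    "B": pd.to_datetime([
        "2013-01-02 00:00:05",
        "2013-01-02 00:00:10",
        "2013-01-02 00:00:09",
        "2013-01-02 00:01:05",
        "2013-01-02 00:01:25",
        "2013-01-02 00:02:05",
    ]),
    "C": 1.0,
})


def max_rows_in_window(group, window="11s"):
    """Hypothetical helper: largest rolling sum of C over `window` in one group."""
    # Time-based rolling needs a monotonic DatetimeIndex, so sort by B first.
    s = group.sort_values("B").set_index("B")["C"]
    # Each window looks back from the current row: (t - window, t].
    return s.rolling(window).sum().max()


# Iterate over the groups explicitly and collect one value per label.
counts = pd.Series(
    {name: max_rows_in_window(g) for name, g in df2.groupby("A")},
    name="count",
).astype(int)

print(counts)
```

For this data the two "test" rows are 4 s apart and fall into one window, while every other row is alone in its window, so the counts are hello 1, test 2, train 1.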