Non-overlapping rolling windows in pandas dataframes
Question
I am familiar with the pandas rolling window functions, but they always have a step size of 1. I want to apply a moving aggregate function in pandas where the windows do not overlap.
In this dataframe:
df.rolling(2).min()
will yield:
N/A 519 566 727 1099 12385
But I want a fixed window with a step size of 2, so that it yields:
519 727 12385
That is, with a fixed window, each step should advance by the size of the window instead of by a single row.
Recommended answer
When this answer was written, the rolling function had no built-in argument for this (pandas 1.5.0 later added a step parameter, which is equivalent to slicing the result with [::step]). You can compute the usual rolling function and then keep only every n-th row (where n = 2 in your case):
df.rolling(n).min()[n-1::n]
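As a minimal sketch of the one-liner above: the question's dataframe is not shown, so the values and the column name a below are hypothetical, chosen only so that rolling(2).min() reproduces the output quoted in the question. The sketch uses .iloc to make the positional row slice explicit.

```python
import pandas as pd

# Hypothetical data: picked so that rolling(2).min() gives
# NaN, 519, 566, 727, 1099, 12385 as quoted in the question.
df = pd.DataFrame({"a": [519, 566, 727, 1099, 12385, 12390]})
n = 2

# Usual overlapping rolling minimum; the first n-1 rows are NaN.
rolled = df.rolling(n).min()

# Keep rows n-1, 2n-1, 3n-1, ...: each kept row is the result for one
# block of n consecutive rows, so the kept windows do not overlap.
non_overlapping = rolled.iloc[n - 1::n]

print(non_overlapping)
```

The kept rows retain their original index labels (here 1, 3 and 5), and the values are floats because rolling results are returned as floating point.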
As you mentioned in your comment, this can perform many redundant computations whose results are simply discarded (especially if n is large).
Instead, you could partition (group) the data into bins of size n:
df.groupby(df.index // n).min()
I have not checked whether this is actually more efficient, but I believe it should be.
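The expression df.index // n only forms the intended bins when the index is the default 0, 1, 2, ... integer range. The sketch below is one way to make the grouping positional instead, assuming hypothetical data and a hypothetical non-numeric index, and it checks that the result agrees with the rolling-then-slice approach.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a non-integer index, where df.index // n would not work.
df = pd.DataFrame(
    {"a": [519, 566, 727, 1099, 12385, 12390]},
    index=list("uvwxyz"),
)
n = 2

# Positional bin labels 0, 0, 1, 1, 2, 2: independent of the index values.
bins = np.arange(len(df)) // n

# One aggregate per bin of n consecutive rows.
grouped = df.groupby(bins).min()

# Same values as the rolling approach, computed without overlapping windows.
rolled = df.rolling(n).min().iloc[n - 1::n]
same = np.allclose(grouped["a"].to_numpy(), rolled["a"].to_numpy())

print(grouped)
print(same)
```

One difference to keep in mind: when the number of rows is not a multiple of n, the groupby version still returns a value for the final, shorter bin, while the rolling-then-slice version drops those trailing rows.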