Take the sum of every N rows in a pandas Series
Question
Say I have
s = pd.Series(range(50))
0 0
1 1
2 2
3 3
...
48 48
49 49
How can I get a new series that consists of the sum of every n rows?
With n = 5, the expected result looks like this:
0 10
1 35
2 60
3 85
...
8 210
9 235
This can of course be done with loc or iloc and a Python loop, but I believe there is a simpler, more idiomatic pandas way.
Also, this is a very simplified example; I am not looking for an explanation of the sequence itself :). The actual series I am working with has a time index, and its values are the number of events that occurred in each second.
Recommended answer
GroupBy.sum
N = 5
s.groupby(s.index // N).sum()
0 10
1 35
2 60
3 85
4 110
5 135
6 160
7 185
8 210
9 235
dtype: int64
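Integer division chunks the index into groups of N (here 5), and groupby then sums each group. Note that s.index // N relies on the index being the default 0, 1, 2, ... integer index.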
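Since the question mentions a time index, where `s.index // N` would not apply directly, here is a minimal sketch that groups by row position instead of by index label. The series `events` and its timestamps are hypothetical illustration data, not from the original question.

```python
import numpy as np
import pandas as pd

N = 5

# Hypothetical data: 12 per-second event counts with a time index.
idx = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(12), unit="s")
events = pd.Series(range(12), index=idx)

# Group by row position (0, 0, 0, 0, 0, 1, 1, ...) rather than by index label,
# so the approach works for any index type.
groups = np.arange(len(events)) // N
sums = events.groupby(groups).sum()
print(sums)
```

The last group simply holds fewer than N rows when the length is not a multiple of N. For a regularly spaced time index, `Series.resample` may be another option worth checking.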
If the length of the series is a multiple of N (here 5), you can instead reshape the underlying values into rows of N and sum along each row:
s.values.reshape(-1, N).sum(1)
# array([ 10, 35, 60, 85, 110, 135, 160, 185, 210, 235])
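The reshape fails when the length is not a multiple of N. One possible workaround, sketched here under the assumption that zero-padding is acceptable because zeros do not change a sum, is to pad the values up to the next multiple of N first. The 12-element series is illustrative only.

```python
import numpy as np
import pandas as pd

N = 5
s = pd.Series(range(12))  # length 12 is not a multiple of N

values = s.to_numpy()
pad = -len(values) % N             # number of zeros needed to reach a multiple of N
padded = np.pad(values, (0, pad))  # default mode pads with zeros at the end
chunk_sums = padded.reshape(-1, N).sum(axis=1)
print(chunk_sums)  # [10 35 21]
```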
numpy.add.at
import numpy as np
b = np.zeros(len(s) // N)  # one slot per group; assumes len(s) is a multiple of N
np.add.at(b, s.index // N, s.values)
b
# array([ 10., 35., 60., 85., 110., 135., 160., 185., 210., 235.])
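As written, the output buffer has `len(s) // N` slots, so a trailing partial group would fall outside it. A hedged variant, assuming you want the partial group summed as well, sizes the buffer with ceiling division and uses row positions instead of index labels. The names `positions`, `n_groups`, and `out` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

N = 5
s = pd.Series(range(12))  # length 12 is not a multiple of N

positions = np.arange(len(s)) // N       # group number for each row
n_groups = -(-len(s) // N)               # ceiling division: 3 groups here
out = np.zeros(n_groups, dtype=s.dtype)  # integer buffer keeps integer sums

# Unbuffered in-place add: repeated group numbers accumulate correctly.
np.add.at(out, positions, s.to_numpy())
print(out)  # [10 35 21]
```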