Calculate average of every x rows in a table and create new table
Question
I have a long table of data (~200 rows by 50 columns) and I need to write code that calculates the mean of every two rows, for each column in the table, with the final output being a new table of the mean values. This is obviously crazy to do in Excel! I use python3 and I am aware of some similar questions: here, here and here. But none of these help, as I need some elegant code that works with multiple columns and produces an organised data table. By the way, my original data table was imported using pandas and is defined as a dataframe, but I could not find an easy way to do this in pandas. Help is much appreciated.
An example of the table (short version) is:
a b c d
2 50 25 26
4 11 38 44
6 33 16 25
8 37 27 25
10 28 48 32
12 47 35 45
14 8 16 7
16 12 16 30
18 22 39 29
20 9 15 47
Expected table of means:
a b c d
3 30.5 31.5 35
7 35 21.5 25
11 37.5 41.5 38.5
15 10 16 18.5
19 15.5 27 38
Recommended answer
You can create an artificial group using df.index//2 (or, as @DSM pointed out, using np.arange(len(df))//2, so that it works for all indices) and then use groupby:
df.groupby(np.arange(len(df))//2).mean()
Out[13]:
a b c d
0 3.0 30.5 31.5 35.0
1 7.0 35.0 21.5 25.0
2 11.0 37.5 41.5 38.5
3 15.0 10.0 16.0 18.5
4 19.0 15.5 27.0 38.0
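For a version that can be run end to end, here is a minimal sketch that rebuilds the example table from the question and wraps the groupby call in a small helper. The helper name mean_every_n_rows and its parameter n are hypothetical, introduced only for illustration; they are not part of pandas.

```python
import numpy as np
import pandas as pd

# Example table from the question.
df = pd.DataFrame({
    "a": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
    "b": [50, 11, 33, 37, 28, 47, 8, 12, 22, 9],
    "c": [25, 38, 16, 27, 48, 35, 16, 16, 39, 15],
    "d": [26, 44, 25, 25, 32, 45, 7, 30, 29, 47],
})


def mean_every_n_rows(frame, n):
    """Average each consecutive block of n rows, column by column.

    Hypothetical helper: rows are grouped by position (0..len-1),
    so the result does not depend on the index labels of `frame`.
    """
    groups = np.arange(len(frame)) // n  # 0,0,...,1,1,... one label per row
    return frame.groupby(groups).mean()


means = mean_every_n_rows(df, 2)
print(means)
```

Because the groups come from row positions rather than from df.index, the same call should also work after filtering or re-indexing the dataframe. If the number of rows is not a multiple of n, the last group simply averages whatever rows are left over.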