Python: get the element-wise mean of multiple arrays in a dataframe

Question

I have a 16x10 pandas DataFrame with a 1x35000 array (or NaN) in each cell. I want to take the element-wise mean over the rows for each column.

      1       2       3       ...       10
1    1x35000 1x35000 1x35000           1x35000
2    1x35000 NaN     1x35000           1x35000
3    1x35000 NaN     1x35000           NaN
...
16   1x35000 1x35000 NaN               1x35000

To avoid misunderstandings: take the first element of each array in the first column and compute their mean. Then take the second element of each array in the first column and compute the mean again, and so on. In the end I want a 1x10 DataFrame with one 1x35000 array per column. Each array should be the element-wise mean of the original arrays in that column.

      1       2       3       ...       10
1    1x35000 1x35000 1x35000           1x35000

Do you have an idea for getting there elegantly, preferably without for-loops?

Recommended answer

Setup

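import numpy as np
import pandas as pd

# Build a small 3x3 frame whose cells hold length-5 integer arrays.
np.random.seed([3,14159])
df = pd.DataFrame(
    np.random.randint(10, size=(3, 3, 5)).tolist(),
    list('XYZ'), list('ABC')
).applymap(np.array)  # applymap is deprecated since pandas 2.1 in favor of DataFrame.map

# Blank out two cells to mimic the missing arrays in the question.
df.loc['X', 'B'] = np.nan
df.loc['Z', 'A'] = np.nan

df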


                 A                B                C
X  [4, 8, 1, 1, 9]              NaN  [8, 2, 8, 4, 9]
Y  [4, 3, 4, 1, 5]  [1, 2, 6, 2, 7]  [7, 1, 1, 7, 8]
Z              NaN  [9, 3, 8, 7, 7]  [2, 6, 3, 1, 9]


Solution

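# stack() drops the NaN cells here; level=1 groups the rest by column label.
g = df.stack().groupby(level=1)
# Sum the arrays in each group, then divide by the number of arrays.
g.apply(np.sum, axis=0) / g.size()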

A                        [4.0, 5.5, 2.5, 1.0, 7.0]
B                        [5.0, 2.5, 7.0, 4.5, 7.0]
C    [5.66666666667, 3.0, 4.0, 4.0, 8.66666666667]
dtype: object
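
For readers who want to see the same computation spelled out, here is a minimal sketch of a variant that is not part of the original answer. It rebuilds the example frame with explicit values, filters out the NaN cells by hand, and averages each column with np.mean. The helper name column_mean is hypothetical, and the comprehension is a loop over columns, so this trades the "no for-loops" preference for not depending on how stack() treats missing cells.

```python
import numpy as np
import pandas as pd

# Same values as the example frame, written out explicitly.
df = pd.DataFrame(
    {
        "A": [np.array([4, 8, 1, 1, 9]), np.array([4, 3, 4, 1, 5]), np.nan],
        "B": [np.nan, np.array([1, 2, 6, 2, 7]), np.array([9, 3, 8, 7, 7])],
        "C": [
            np.array([8, 2, 8, 4, 9]),
            np.array([7, 1, 1, 7, 8]),
            np.array([2, 6, 3, 1, 9]),
        ],
    },
    index=list("XYZ"),
)


def column_mean(col):
    """Element-wise mean of the arrays in one column (hypothetical helper).

    Cells that are not arrays (the NaN placeholders) are skipped.
    """
    arrays = [cell for cell in col if isinstance(cell, np.ndarray)]
    return np.mean(arrays, axis=0)


# One averaged array per column, indexed by column label.
means = pd.Series({name: column_mean(df[name]) for name in df.columns})

print(means["A"])  # [4.  5.5 2.5 1.  7. ]
```

A column that contains no arrays at all would make np.mean operate on an empty list, which returns NaN with a warning, so that case may need separate handling.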

If you insist on the shape you presented:

g = df.stack().groupby(level=1)
(g.apply(np.sum, axis=0) / g.size()).to_frame().T

                           A                          B                                              C
0  [4.0, 5.5, 2.5, 1.0, 7.0]  [5.0, 2.5, 7.0, 4.5, 7.0]  [5.66666666667, 3.0, 4.0, 4.0, 8.66666666667]
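
With 16x10 cells of 35000 values each, it may also be worth holding the data in one numeric block instead of arrays nested inside cells. The sketch below is an alternative technique, not the answer's method: it fills every missing cell with an all-NaN row, stacks everything into a 3-D float array, and lets np.nanmean ignore the filler. The names length, as_row, and cube are assumptions made for illustration, and all arrays are assumed to share one length.

```python
import numpy as np
import pandas as pd

length = 5  # assumed common array length (35000 in the question)

# Same values as the example frame, written out explicitly.
df = pd.DataFrame(
    {
        "A": [np.array([4, 8, 1, 1, 9]), np.array([4, 3, 4, 1, 5]), np.nan],
        "B": [np.nan, np.array([1, 2, 6, 2, 7]), np.array([9, 3, 8, 7, 7])],
        "C": [
            np.array([8, 2, 8, 4, 9]),
            np.array([7, 1, 1, 7, 8]),
            np.array([2, 6, 3, 1, 9]),
        ],
    },
    index=list("XYZ"),
)


def as_row(cell):
    """Return the cell's array as floats, or an all-NaN row for a missing cell."""
    if isinstance(cell, np.ndarray):
        return cell.astype(float)
    return np.full(length, np.nan)


# Shape (n_cols, n_rows, length): one float block for the whole frame.
cube = np.array([[as_row(cell) for cell in df[name]] for name in df.columns])

# Average over the row axis; the NaN filler rows are ignored.
col_means = np.nanmean(cube, axis=1)  # shape (n_cols, length)

# Flat layout: one DataFrame column per original column, one row per element.
result = pd.DataFrame(col_means.T, columns=df.columns)
```

Here result has one row per array element rather than one array per cell, which is usually easier to work with afterwards. If the arrays themselves can contain NaN values, np.nanmean skips those too, which differs from the sum-and-divide approach.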
