Python: get the element-wise mean of multiple arrays in a dataframe
Problem description
I have a 16x10 pandas DataFrame with a 1x35000 array (or NaN) in each cell. I want to take the element-wise mean over the rows for each column.
     1        2        3        ...  10
1    1x35000  1x35000  1x35000  ...  1x35000
2    1x35000  NaN      1x35000  ...  1x35000
3    1x35000  NaN      1x35000  ...  NaN
...
16   1x35000  1x35000  NaN      ...  1x35000
To avoid misunderstandings: take the first element of each array in the first column and compute their mean. Then take the second element of each array in the first column and compute the mean again, and so on. In the end I want a 1x10 DataFrame with one 1x35000 array per column. Each array should be the element-wise mean of the original arrays in that column.
     1        2        3        ...  10
1    1x35000  1x35000  1x35000  ...  1x35000
Do you have an idea for getting there elegantly, preferably without for-loops?
Recommended answer
Setup
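import numpy as np
import pandas as pd

np.random.seed([3,14159])
# 3x3 frame whose cells are length-5 integer arrays
# (on pandas >= 2.1, DataFrame.map replaces the deprecated applymap)
df = pd.DataFrame(
    np.random.randint(10, size=(3, 3, 5)).tolist(),
    index=list('XYZ'), columns=list('ABC')
).applymap(np.array)
# knock out two cells to mimic the missing entries in the question
df.loc['X', 'B'] = np.nan
df.loc['Z', 'A'] = np.nan
df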
A B C
X [4, 8, 1, 1, 9] NaN [8, 2, 8, 4, 9]
Y [4, 3, 4, 1, 5] [1, 2, 6, 2, 7] [7, 1, 1, 7, 8]
Z NaN [9, 3, 8, 7, 7] [2, 6, 3, 1, 9]
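Solution
# stack() yields a Series indexed by (row, column); group it by the column level
g = df.stack().groupby(level=1)
# sum each column's arrays element-wise, then divide by the number of arrays
g.apply(np.sum, axis=0) / g.size()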
A [4.0, 5.5, 2.5, 1.0, 7.0]
B [5.0, 2.5, 7.0, 4.5, 7.0]
C [5.66666666667, 3.0, 4.0, 4.0, 8.66666666667]
dtype: object
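To make explicit what the sum-divided-by-size expression computes, here is a minimal plain-NumPy sketch of the same per-column mean. The sample data and the helper name column_mean are made up for illustration and are not part of the answer. It does loop over cells in Python, so it is meant as an explanation rather than a loop-free replacement.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each cell holds an array or NaN.
df = pd.DataFrame(
    {
        "A": [np.array([4, 8, 1]), np.array([4, 3, 4]), np.nan],
        "B": [np.nan, np.array([1, 2, 6]), np.array([9, 3, 8])],
    },
    index=list("XYZ"),
)

def column_mean(col):
    """Element-wise mean of the array cells in one column, skipping NaN cells."""
    arrays = [cell for cell in col if isinstance(cell, np.ndarray)]
    # Stack the arrays into shape (n_arrays, n_elements) and average over axis 0.
    return np.stack(arrays).mean(axis=0)

# One mean array per column label.
means = {name: column_mean(df[name]) for name in df.columns}
```

Each entry of means has the same length as the input arrays, and NaN cells contribute neither to the sum nor to the count, which matches dividing by g.size().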
If you insist on the shape you presented:
g = df.stack().groupby(level=1)
(g.apply(np.sum, axis=0) / g.size()).to_frame().T
A B C
0 [4.0, 5.5, 2.5, 1.0, 7.0] [5.0, 2.5, 7.0, 4.5, 7.0] [5.66666666667, 3.0, 4.0, 4.0, 8.66666666667]
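As a possible alternative that is not part of the answer above: if every array is known to have the same length, the cells can be packed into one 3-D float array and averaged with np.nanmean. The sketch below assumes a common length n and uses made-up sample data. Building the array still takes a comprehension, but the averaging itself is a single vectorized call.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: every array has the same length n, some cells are NaN.
n = 3
df = pd.DataFrame(
    {
        "A": [np.array([4, 8, 1]), np.array([4, 3, 4]), np.nan],
        "B": [np.nan, np.array([1, 2, 6]), np.array([9, 3, 8])],
    },
    index=list("XYZ"),
)

# Swap each NaN cell for an all-NaN array so that every cell has shape (n,).
filled = [
    [cell if isinstance(cell, np.ndarray) else np.full(n, np.nan) for cell in row]
    for row in df.to_numpy()
]
cube = np.array(filled, dtype=float)  # shape: (rows, columns, n)

# Average over the row axis, ignoring the NaN placeholders.
col_means = np.nanmean(cube, axis=0)  # shape: (columns, n)

# One column per original column, one row per element position.
result = pd.DataFrame(col_means.T, columns=df.columns)
```

Note that np.nanmean also skips NaN values that occur inside an array, and a column consisting only of NaN cells comes out as all NaN with a runtime warning. The result here is an ordinary numeric frame rather than a 1-row frame of arrays, which is often easier to work with afterwards.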