Get mean of multiple selected columns in a pandas dataframe

Question

I want to calculate the mean of all the values in selected columns of a dataframe. For example, I have a dataframe with columns A, B, C, D and E, and I want the mean of all the values in columns A, C and E.

import pandas as pd

df1 = pd.DataFrame( ( {'A': [1,2,3,4,5],
                      'B': [10,20,30,40,50],
                      'C': [11,21,31,41,51],
                      'D': [12,22,32,42,52],
                      'E': [13,23,33,43,53]} ) )

print( df1 )

print( "Mean of df1:", df1.mean() )

df2 = pd.concat( [df1['A'], df1['C'], df1['E'] ], ignore_index=True )
print( df2 )
print( "Mean of df2:", df2.mean() )

df3 = pd.DataFrame()
df3 = pd.concat( [ df3, df1['A'] ], ignore_index=True )
df3 = pd.concat( [ df3, df1['C'] ], ignore_index=True )
df3 = pd.concat( [ df3, df1['E'] ], ignore_index=True )
print( df3 )
print( "Mean of df3:", df3.mean() )

df2 gets me the right answer, but I need to create a new dataframe to get it.

I thought something like df1[['A', 'C', 'E']].mean() would work, but it returns the mean of each column, not the combined average. Is there a way to do this without creating a new dataframe? I also need other statistics such as .std(), .min() and .max(), so this isn't just a one-off calculation.

Recommended answer

As far as I can see, you have two options:

For mean(), min() and max(), you can take the mean of the means, the min of the mins and the max of the maxes; this yields the mean, min and max of all the elements of A, C and E.

So, for mean(), you can use:

import numpy as np

df1[['A','C','E']].apply(np.mean).mean()   # mean of the per-column means
df1[['A','C','E']].values.mean()           # mean over the underlying NumPy array

Either of the above should give you the mean of all the elements of columns A, C and E. (The mean-of-means form matches only because every column here holds the same number of values.)
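One caveat is worth sketching: the mean of the column means equals the mean of all the elements only when every column contributes the same number of values. The small frame below is hypothetical (it is not from the question) and has one missing value in A; np.nanmean is used here as one possible way to pool the values while skipping NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration: column A has one missing value.
df = pd.DataFrame({'A': [1.0, 2.0, np.nan],
                   'C': [10.0, 20.0, 30.0]})

cols = ['A', 'C']

# pandas skips NaN per column, so A averages over 2 values and C over 3.
mean_of_means = df[cols].mean().mean()

# Pool every element into one array and skip NaN across all of them.
pooled_mean = np.nanmean(df[cols].to_numpy())

print(mean_of_means)  # 10.75
print(pooled_mean)    # 12.6
```

With complete, equal-length columns such as those in df1, the two approaches agree.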

For min():

df1[['A','C','E']].apply(np.min).min()
df1[['A','C','E']].values.min()  

For max():

df1[['A','C','E']].apply(np.max).max()
df1[['A','C','E']].values.max() 

For std():

df1[['A','C','E']].apply(np.std).std()   # runs without error, but the result is not the value you want
df1[['A','C','E']].values.std()          # std of all the elements of columns A, C, E

The std of the per-column stds is not the std of all the elements.
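To see the difference, and to get several statistics in one go, here is a minimal sketch using the df1 from the question. One detail about defaults matters: NumPy's std uses ddof=0 (population), while pandas' std uses ddof=1 (sample), so the two give different numbers unless ddof is passed explicitly. Flattening with ravel() into a temporary Series is just one possible approach, not the only one.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                    'B': [10, 20, 30, 40, 50],
                    'C': [11, 21, 31, 41, 51],
                    'D': [12, 22, 32, 42, 52],
                    'E': [13, 23, 33, 43, 53]})

selected = df1[['A', 'C', 'E']]
values = selected.to_numpy()

# std over all 15 pooled elements.
pooled_std_population = values.std()        # NumPy default, ddof=0
pooled_std_sample = values.std(ddof=1)      # same convention as pandas' default

# std of the three per-column stds: a different quantity altogether.
std_of_stds = selected.std().std()

# Several pooled statistics at once via a temporary flattened Series.
summary = pd.Series(values.ravel()).agg(['mean', 'std', 'min', 'max'])

print(pooled_std_population, pooled_std_sample, std_of_stds)
print(summary)
```

Here summary['std'] follows the pandas convention (ddof=1), so it matches pooled_std_sample rather than the plain values.std().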
