Get mean of multiple selected columns in a pandas dataframe
Question
I want to calculate the mean of all the values in selected columns of a dataframe. For example, I have a dataframe with columns A, B, C, D and E, and I want the mean of all the values in columns A, C and E.
import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                    'B': [10, 20, 30, 40, 50],
                    'C': [11, 21, 31, 41, 51],
                    'D': [12, 22, 32, 42, 52],
                    'E': [13, 23, 33, 43, 53]})
print( df1 )
print( "Mean of df1:", df1.mean() )
df2 = pd.concat( [df1['A'], df1['C'], df1['E'] ], ignore_index=True )
print( df2 )
print( "Mean of df2:", df2.mean() )
df3 = pd.DataFrame()
df3 = pd.concat( [ df3, df1['A'] ], ignore_index=True )
df3 = pd.concat( [ df3, df1['C'] ], ignore_index=True )
df3 = pd.concat( [ df3, df1['E'] ], ignore_index=True )
print( df3 )
print( "Mean of df3:", df3.mean() )
df2 gets me the right answer, but I need to create a new dataframe to get it.
I thought something like df1[['A', 'C', 'E']].mean() would work, but it returns the mean of each column, not the combined average. Is there a way to do this without creating a new dataframe? I also need other statistics such as .std(), .min() and .max(), so this isn't just a one-off calculation.
Recommended answer
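As far as I can tell, you have two options:
For mean(), min() and max() you can take the mean of the column means, the min of the column mins and the max of the column maxes. This yields the mean, min and max of all the elements of A, C and E. For the mean, this works because every column holds the same number of values. Alternatively, you can compute the statistic directly on the underlying array of the selected columns.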
So you can use:
For mean():
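import numpy as np  # needed for the np.* calls in this answer
df1[['A','C','E']].apply(np.mean).mean()  # mean of the per-column means
df1[['A','C','E']].values.mean()  # mean over all elements of the underlying array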
Either of the above should give you the mean of all the elements of columns A, C and E.
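As a quick check of the claim above, the following minimal sketch compares the two approaches on the question's data. The `df_nan` variant with a missing value is a hypothetical addition, not part of the original question; it is included only to illustrate that the mean of the column means stops matching the overall mean once the columns have different numbers of valid values.

```python
import numpy as np
import pandas as pd

# Same data as in the question, restricted to the columns of interest.
df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                    'C': [11, 21, 31, 41, 51],
                    'E': [13, 23, 33, 43, 53]})
cols = ['A', 'C', 'E']

# Option 1: mean of the per-column means.
mean_of_means = df1[cols].mean().mean()
# Option 2: mean over every element of the underlying array.
flat_mean = df1[cols].to_numpy().mean()
print(mean_of_means, flat_mean)  # the two agree for equal-length columns

# Hypothetical variant: one missing value in column A.
df_nan = df1.astype(float)
df_nan.loc[0, 'A'] = np.nan
mean_of_means_nan = df_nan[cols].mean().mean()          # pandas skips NaN per column
flat_mean_nan = np.nanmean(df_nan[cols].to_numpy())     # NaN-aware mean of all elements
print(mean_of_means_nan, flat_mean_nan)  # these no longer agree
```

If the columns may contain missing values, the array-based form is probably the safer of the two.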
For min():
df1[['A','C','E']].apply(np.min).min()
df1[['A','C','E']].values.min()
For max():
df1[['A','C','E']].apply(np.max).max()
df1[['A','C','E']].values.max()
For std():
df1[['A','C','E']].apply(np.std).std()  # runs without error, but the value is not what you want
df1[['A','C','E']].values.std()  # std of all the elements of columns A, C, E (NumPy default ddof=0; pass ddof=1 to match the pandas default)
The std of the per-column stds is not the std of all the elements.
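Since the question asks for several statistics at once, here is a minimal sketch of one way to get them from the flattened values in a single call. It does build a temporary array and Series, so it is not strictly "without creating a new object"; the names `values`, `stats` and `std_of_stds` are illustrative only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                    'B': [10, 20, 30, 40, 50],
                    'C': [11, 21, 31, 41, 51],
                    'D': [12, 22, 32, 42, 52],
                    'E': [13, 23, 33, 43, 53]})

# Flatten the selected columns into one 1-D array of 15 values.
values = df1[['A', 'C', 'E']].to_numpy().ravel()

# Series.agg computes all requested statistics over the flattened values.
# pandas' std uses ddof=1 (sample standard deviation) by default.
stats = pd.Series(values).agg(['mean', 'std', 'min', 'max'])
print(stats)

# For comparison: the std of the per-column stds is a different quantity.
std_of_stds = df1[['A', 'C', 'E']].std().std()
print(std_of_stds)
```

Note that `values.std()` on the NumPy array uses `ddof=0`, so it will differ slightly from `stats['std']` unless `ddof=1` is passed.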