Row-wise average for a subset of columns with missing values
Question
I've got a DataFrame that has occasional missing values and looks something like this:
       Monday  Tuesday  Wednesday
=================================
Mike       42      NaN         12
Jenna     NaN      NaN         15
Jon        21        4          1
I'd like to add a new column to my data frame that holds, for every row, the average across all columns.
That is, for Mike I'd need (df['Monday'] + df['Wednesday'])/2, but for Jenna I'd simply use df['Wednesday']/1.
Does anyone know the best way to calculate the average while accounting for this variation caused by missing values?
Recommended answer
You can simply do:
df['avg'] = df.mean(axis=1)
       Monday  Tuesday  Wednesday        avg
Mike       42      NaN         12  27.000000
Jenna     NaN      NaN         15  15.000000
Jon        21        4          1   8.666667
This works because .mean() ignores missing values by default (skipna=True); see the pandas documentation for DataFrame.mean.
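For anyone who wants to check this locally, here is a minimal sketch that rebuilds the example frame from the values in the question and compares the default behavior with skipna=False. The avg_strict column name is only an illustrative choice and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

# Compute both results before assigning, so neither new column
# is accidentally included in the other average.
avg = df.mean(axis=1)                        # default skipna=True: NaN cells are skipped
avg_strict = df.mean(axis=1, skipna=False)   # any NaN in a row makes that row's result NaN

df["avg"] = avg
df["avg_strict"] = avg_strict  # hypothetical column name, for comparison only
print(df)
```

With the default, each row is divided by the number of values actually present, which is the per-row behavior the question asks for.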
To average over only a subset of the columns, select them first:
df['avg'] = df[['Monday', 'Tuesday']].mean(axis=1)
       Monday  Tuesday  Wednesday   avg
Mike       42      NaN         12  42.0
Jenna     NaN      NaN         15   NaN
Jon        21        4          1  12.5
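Note that Jenna's average is NaN here because every value in the selected columns is missing for her row. The sketch below reproduces the subset case and also counts how many values went into each average; the cols variable and the n_used column are names assumed for this illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

# Columns to average over (illustrative choice matching the answer).
cols = ["Monday", "Tuesday"]

# Row-wise mean over the subset; rows with no valid values yield NaN.
df["avg"] = df[cols].mean(axis=1)

# Number of non-missing values that contributed to each row's average.
df["n_used"] = df[cols].count(axis=1)
print(df)
```

Keeping the column list in a variable also prevents the new avg column from being averaged in by mistake if the calculation is run again.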