Row-wise average for a subset of columns with missing values

Question

I've got a DataFrame which has occasional missing values, and looks something like this:

          Monday         Tuesday         Wednesday 
      ================================================
Mike        42             NaN               12
Jenna       NaN            NaN               15
Jon         21              4                 1

I'd like to add a new column to my data frame that holds, for every row, the average across all columns.

That is, for Mike I'd need (df['Monday'] + df['Wednesday'])/2, but for Jenna I'd simply use df['Wednesday']/1.

Does anyone know the best way to calculate the average while accounting for this variation caused by missing values?

Recommended answer

You can simply:

df['avg'] = df.mean(axis=1)

       Monday  Tuesday  Wednesday        avg
Mike       42      NaN         12  27.000000
Jenna     NaN      NaN         15  15.000000
Jon        21        4          1   8.666667

This works because .mean() ignores missing values by default (skipna=True): see the pandas documentation for DataFrame.mean.
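To make the default concrete, here is a minimal sketch that rebuilds the example frame from the question and compares the default behaviour with skipna=False. The variable names avg_skip and avg_strict are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

# Default (skipna=True): each row is averaged over its non-missing cells only.
avg_skip = df.mean(axis=1)

# For contrast, skipna=False: any row that contains a NaN averages to NaN.
avg_strict = df.mean(axis=1, skipna=False)

print(pd.DataFrame({"skipna=True": avg_skip, "skipna=False": avg_strict}))
```

With the default, Mike is averaged over two values and Jenna over one, which is the behaviour the question asks for. Only Jon, who has no missing values, gets a number under skipna=False.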

To average over only a subset of the columns, select them first:

df['avg'] = df[['Monday', 'Tuesday']].mean(axis=1)

       Monday  Tuesday  Wednesday   avg
Mike       42      NaN         12  42.0
Jenna     NaN      NaN         15   NaN
Jon        21        4          1  12.5
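Note that Jenna's average is NaN here because both of her selected columns are missing, so there is nothing to average. The sketch below reproduces the subset case and adds a count of how many values went into each mean. The names cols and n_used are hypothetical and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "Monday": [42, np.nan, 21],
        "Tuesday": [np.nan, np.nan, 4],
        "Wednesday": [12, 15, 1],
    },
    index=["Mike", "Jenna", "Jon"],
)

# Columns to average over (illustrative choice, matching the answer above).
cols = ["Monday", "Tuesday"]
subset = df[cols]

# Row-wise mean over the subset; rows where every selected cell is NaN stay NaN.
df["avg"] = subset.mean(axis=1)

# Number of non-missing cells that contributed to each row's mean.
df["n_used"] = subset.count(axis=1)

print(df)
```

Keeping the count next to the mean makes it easy to see which averages rest on a single value and which rows had no data at all in the chosen columns.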
