Pandas fillna() based on specific column attribute


Problem description

Suppose I have this table:

Type   Killed   Survived
Dog    5        2
Dog    3        4
Cat    1        7
Dog    NaN      3
cow    NaN      2

One of the rows with [Type] = Dog is missing its value in [Killed].

I want to impute the mean in [Killed] for [Type] = Dog.

My code is as follows:

  1. Find the mean

df[df['Type'] == 'Dog'].mean().round()

This gives me the mean (around 2.25).

  2. Impute the mean (this is where the problem starts)

df.loc[(df['Type'] == 'Dog') & (df['Killed'])].fillna(2.25, inplace = True)

The code runs, but the value is not imputed; the NaN value is still there.

My question is: how do I impute the mean in [Killed] based on [Type] = Dog?

Recommended answer

This works for me:

# .ix was removed in pandas 1.0; .loc does the same label/boolean selection
df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(2.25)
print (df)
  Type  Killed  Survived
0  Dog    5.00         2
1  Dog    3.00         4
2  Cat    1.00         7
3  Dog    2.25         3
4  cow     NaN         2
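
The attempt in the question leaves the NaN in place because boolean selection with .loc returns a new object, so an in-place fillna on that result never reaches df. The sketch below rebuilds the question's table and contrasts the two approaches. The names is_dog, subset and the two counters are illustrative, and the explicit .copy() is an assumption added here to make the copy visible.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

is_dog = df["Type"] == "Dog"

# Boolean selection yields a separate object; filling it leaves df untouched.
subset = df.loc[is_dog].copy()
subset.fillna(2.25, inplace=True)
nan_count_after_copy_fill = int(df.loc[is_dog, "Killed"].isna().sum())

# Assigning through .loc writes the filled values back into df itself.
df.loc[is_dog, "Killed"] = df.loc[is_dog, "Killed"].fillna(2.25)
nan_count_after_assignment = int(df.loc[is_dog, "Killed"].isna().sum())

print(nan_count_after_copy_fill, nan_count_after_assignment)
```

The first counter still reports the missing Dog value, while the second shows it filled. The cow row is not selected by the mask and keeps its NaN.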

If you need to fillna with a Series, because there are two columns (Killed and Survived):

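# numeric_only=True skips the string column Type, which recent pandas versions refuse to average
m = df[df['Type'] == 'Dog'].mean(numeric_only=True).round()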
print (m)
Killed      4.0
Survived    3.0
dtype: float64

df.loc[df['Type'] == 'Dog'] = df.loc[df['Type'] == 'Dog'].fillna(m)
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2

If you need fillna only in column Killed:

# if you don't need rounding, omit it
m = round(df.loc[df['Type'] == 'Dog', 'Killed'].mean())
print (m)
4

df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(m)
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2

You can reuse code like:

filtered = df.loc[df['Type'] == 'Dog', 'Killed']
print (filtered)
0    5.0
1    3.0
3    NaN
Name: Killed, dtype: float64

df.loc[df['Type'] == 'Dog', 'Killed'] = filtered.fillna(filtered.mean())
print (df)
  Type  Killed  Survived
0  Dog     5.0         2
1  Dog     3.0         4
2  Cat     1.0         7
3  Dog     4.0         3
4  cow     NaN         2
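
If the same imputation is wanted for every Type rather than only Dog, one possible alternative (not part of the answer above) is groupby with transform, which computes each group's mean and aligns it back to the original rows. This is a minimal sketch on the question's sample data; group_means is an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

# Mean of Killed within each Type, broadcast back to every row of that Type.
group_means = df.groupby("Type")["Killed"].transform("mean")

# Fill only the missing entries; existing values are kept as they are.
df["Killed"] = df["Killed"].fillna(group_means)

print(df)
```

A group whose values are all missing, such as cow here, has no mean to draw on, so its NaN remains.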
