How to replace NA values with the mode of a DataFrame column in Python?

Question

I'm completely new to Python (and this website) and am currently trying to replace NA values in specific dataframe columns with their mode. I've tried various methods, none of which work. Please help me spot what I'm doing incorrectly:

Note: All the columns I'm working with are of type float64. All of my code runs, but when I check the number of nulls in the columns with df[cols_mode].isnull().sum(), it remains the same.

Method 1:

cols_mode = ['race', 'goal', 'date', 'go_out', 'career_c']

df[cols_mode].apply(lambda x: x.fillna(x.mode, inplace=True))

I tried the Imputer method too but got the same result.

Method 2:

for column in df[['race', 'goal', 'date', 'go_out', 'career_c']]:
    mode = df[column].mode()
    df[column] = df[column].fillna(mode)

Method 3:

df['race'].fillna(df.race.mode(), inplace=True)
df['goal'].fillna(df.goal.mode(), inplace=True)
df['date'].fillna(df.date.mode(), inplace=True)
df['go_out'].fillna(df.go_out.mode(), inplace=True)
df['career_c'].fillna(df.career_c.mode(), inplace=True)

Method 4: My methods became more and more of a manual process, and finally this one works:

df['race'].fillna(2.0, inplace=True)
df['goal'].fillna(1.0, inplace=True)
df['date'].fillna(6.0, inplace=True)
df['go_out'].fillna(2.0, inplace=True)
df['career_c'].fillna(2.0, inplace=True) 


Recommended answer

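mode returns a Series, because a column can have more than one mode, so you still need to select the value you want (for example the first one, with [0]) before replacing NaN values in your DataFrame. Passing the Series itself to fillna aligns it on index labels instead of filling every NaN, which is why the null counts in Methods 2 and 3 did not change.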

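for column in ['race', 'goal', 'date', 'go_out', 'career_c']:
    # Assign the result back; fillna(..., inplace=True) on df[column] is
    # chained assignment, which recent pandas versions warn about.
    df[column] = df[column].fillna(df[column].mode()[0])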
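
To make the difference concrete, here is a minimal sketch on a made-up one-column DataFrame (the sample values are an assumption, not the asker's data). It compares passing the whole Series returned by mode() with passing its first element:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"race": [2.0, np.nan, 2.0, 4.0, np.nan]})

# mode() returns a Series of the most frequent value(s), indexed from 0.
modes = df["race"].mode()

# Passing the Series: fillna aligns on index labels. The modes Series only
# has label 0, and row 0 is not NaN, so no missing value is filled.
unchanged = df["race"].fillna(modes)

# Passing a scalar: every NaN in the column is replaced.
filled = df["race"].fillna(modes.iloc[0])

print(unchanged.isna().sum())
print(filled.isna().sum())
```

Note that mode() returns an empty Series for a column that contains only NaN, so selecting the first element would raise an error in that case.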

If you want to apply it to all the columns of the DataFrame, then:

for column in df.columns:
    # Assign the result back instead of using inplace=True on df[column].
    df[column] = df[column].fillna(df[column].mode()[0])
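
As an alternative to the loop, the same idea can probably be written without iterating, since DataFrame.mode() returns one row per mode and its first row can be passed to fillna as a per-column mapping. This is a sketch on invented sample data, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names are borrowed from the question.
df = pd.DataFrame({
    "race": [2.0, np.nan, 2.0, 4.0],
    "goal": [1.0, 1.0, np.nan, 3.0],
    "date": [6.0, np.nan, 6.0, 5.0],
})
cols_mode = ["race", "goal"]

# First row of DataFrame.mode(): a Series mapping column name -> first mode.
first_modes = df[cols_mode].mode().iloc[0]

# fillna with a Series keyed by column name fills each column with its own value.
df[cols_mode] = df[cols_mode].fillna(first_modes)

print(df.isna().sum())
```

Columns left out of cols_mode (here "date") keep their missing values. To cover every column, call df.fillna(df.mode().iloc[0]) on the whole DataFrame.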
