Fill in missing row values in pandas dataframe

Question

I have the following dataframe and would like to fill in its missing values.

mukey   hzdept_r    hzdepb_r    sandtotal_r silttotal_r
425897      0         61        
425897      61        152          5.3         44.7
425911      0         30           30.1        54.9
425911      30        74           17.7        49.8
425911      74        84        

I want each missing value to be the average of the values in the same column for that mukey. For example, the missing values in the first row should be the averages of sandtotal_r and silttotal_r over the rows where mukey == 425897. pandas fillna on its own doesn't seem to do the trick. Any help?

Recommended answer

This uses something I just learned from a question a couple of posts below.

FYI, this solution will still leave NaNs for any mukey that has no sandtotal_r or silttotal_r values at all.

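import pandas as pd

# Read the table from the question (it must be on the clipboard)
df = pd.read_clipboard()

# Index by mukey so fillna can align the per-mukey means on the index
df1 = df.set_index('mukey')

# Fill each NaN with the mean of its column for that row's mukey
df1 = df1.fillna(df.groupby('mukey').mean())

# Turn mukey back into an ordinary column
df1.reset_index()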

    mukey  hzdept_r  hzdepb_r  sandtotal_r  silttotal_r
0  425897         0        61          5.3        44.70
1  425897        61       152          5.3        44.70
2  425911         0        30         30.1        54.90
3  425911        30        74         17.7        49.80
4  425911        74        84         23.9        52.35
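
As an alternative that does not depend on the clipboard or on index alignment, the same per-mukey fill can be sketched with groupby(...).transform('mean'). This is not the original answer's method, just a variant on it. The names cols, group_means and filled are made up for illustration, the table is retyped by hand from the question, and the final fallback to the overall column mean is an assumption about what to do with a mukey that has no values at all.

```python
import numpy as np
import pandas as pd

# Rebuild the question's table by hand; NaN marks the blank cells.
df = pd.DataFrame({
    "mukey": [425897, 425897, 425911, 425911, 425911],
    "hzdept_r": [0, 61, 0, 30, 74],
    "hzdepb_r": [61, 152, 30, 74, 84],
    "sandtotal_r": [np.nan, 5.3, 30.1, 17.7, np.nan],
    "silttotal_r": [np.nan, 44.7, 54.9, 49.8, np.nan],
})

# Only these columns have gaps to fill.
cols = ["sandtotal_r", "silttotal_r"]

# transform("mean") returns one row per input row, holding that row's
# group mean, so it shares df's index and can be passed to fillna directly.
group_means = df.groupby("mukey")[cols].transform("mean")

filled = df.copy()
filled[cols] = df[cols].fillna(group_means)

# Optional fallback (an assumption, not part of the original answer):
# a mukey with no values at all is still NaN at this point, so use the
# overall column mean for whatever is left.
filled[cols] = filled[cols].fillna(df[cols].mean())

print(filled)
```

Because transform keeps the original row order and index, there is no need for set_index and reset_index here. Drop the last fillna line if groups with no data should stay NaN, as they do in the answer above.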
