How do you fill NaN with mean of a subset of a group?

Question

I have a data frame with some values by year and type. I want to replace all NaN values in each year with the mean of the values in that year that have a specific type. I would like to do this in the most elegant way possible. I'm dealing with a lot of data, so less computation would be good as well.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 2, 2, 2],
                   'type': [1, 1, 2, 1, 1, 2],
                   'val': [np.nan, 5, 10, 100, 200, np.nan]})

I want ALL NaNs, regardless of their type, to be replaced with the mean of the type 1 values in their respective year.

In this example, the NaN in the first row should be replaced with 5 and the NaN in the last row should be replaced with 150.

This only fills in values that are missing for type 1, not type 2:

df['val'] = df['val'].fillna(df.query('type==1').groupby('year')['val'].transform('mean'))
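
A likely reason the attempt above misses the type 2 row is index alignment: transform on the filtered frame returns a Series that carries only the index labels of the type 1 rows, and fillna matches fill values by label. The following is a minimal sketch of that behaviour on the example data; the names type1_means and attempt are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 2, 2, 2],
                   'type': [1, 1, 2, 1, 1, 2],
                   'val': [np.nan, 5, 10, 100, 200, np.nan]})

# Per-year mean computed on the type 1 rows only.
# The result keeps the index of the filtered frame, so rows 2 and 5 are absent.
type1_means = df.query('type == 1').groupby('year')['val'].transform('mean')
print(type1_means.index.tolist())

# fillna aligns on index labels: row 0 finds a fill value,
# but row 5 (type 2) has no matching label and stays NaN.
attempt = df['val'].fillna(type1_means)
print(attempt)
```

So the fill values need to exist for every row of the original frame, not only for the type 1 rows.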

Recommended answer

Use mask and transform:

df.fillna({'val': df.val.mask(df.type.ne(1)).groupby(df.year).transform('mean')})

   year  type    val
0     1     1    5.0
1     1     1    5.0
2     1     2   10.0
3     2     1  100.0
4     2     1  200.0
5     2     2  150.0
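
To see why this works, the one-liner can be unpacked into three steps. This is a sketch on the same example data, not a different method; the intermediate names type1_only, year_means and filled are hypothetical, and the last step uses Series.fillna with assign, which should have the same effect here as passing a dict to DataFrame.fillna.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'year': [1, 1, 1, 2, 2, 2],
                   'type': [1, 1, 2, 1, 1, 2],
                   'val': [np.nan, 5, 10, 100, 200, np.nan]})

# Step 1: hide every value whose type is not 1 by turning it into NaN.
# The Series keeps the full index of df, unlike a filtered frame.
type1_only = df['val'].mask(df['type'].ne(1))

# Step 2: group by year and broadcast each year's mean back to every row.
# mean skips NaN, so only the type 1 values contribute.
year_means = type1_only.groupby(df['year']).transform('mean')

# Step 3: fill only the missing entries of the original column.
# Existing values such as the 10 in row 2 are left untouched.
filled = df.assign(val=df['val'].fillna(year_means))
print(filled)
```

Because the masked Series is never filtered, every row, including the type 2 rows, gets a candidate fill value for its year. Note that neither this sketch nor the one-liner modifies df in place; assign the result back if that is what you need.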
