How to fill NAs in a column with the group mean in R?

Problem description

I have a dataset with several NAs. For each column I want to compute the mean within each group (Category) and use it to fill the NAs in that group. My dataset looks like this:

PID Category    column1 column2 column3
123    1             54    2.4  NA
324    1             52    NA   21.1
356    1             NA    3.6  25.6
378    2             56    3.2  NA
395    2             NA    3.5  29.9
362    2             45    NA   24.3
789    3             65   12.6  23.8
759    3             66    NA   26.8
762    3             NA    NA   27.2
741    3             69   8.5   23.3

The desired output is:

PID Category    column1 column2 column3
123    1             54   2.4   23.3
324    1             52   3.0   21.1
356    1             53   3.6   25.6
378    2             56   3.2   27.1
395    2             50.5 3.5   29.9
362    2             45   3.3   24.3
789    3             65   12.6  23.8
759    3             66   10.5  26.8
762    3             66.6 10.5  27.2
741    3             69   8.5   23.3

Thanks

Recommended answer

We can use na.aggregate from zoo. By default, it replaces each NA with the mean of the column concerned, so applying it within each Category group fills the NAs with the group mean:

library(dplyr)
library(zoo)
df1 %>%
   group_by(Category) %>%
   # apply na.aggregate to every column whose name starts with "column"
   mutate(across(starts_with('column'), na.aggregate)) %>%
   ungroup()
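To make the default behaviour concrete, here is a minimal base R sketch of what "replace each NA with the mean" amounts to for a single vector. The helper name fill_mean is hypothetical and is not part of zoo; it only mirrors the default described above.

```r
# Hypothetical helper (not part of zoo): replace NAs with the mean
# of the non-missing values in the same vector
fill_mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# column1 values for Category 2, taken from the question's data
x <- c(56, NA, 45)
fill_mean(x)
# [1] 56.0 50.5 45.0
```

Note that if a vector has no non-missing values at all, mean(x, na.rm = TRUE) returns NaN, so this sketch would fill such a group with NaN.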


Or use group_modify with na.aggregate, as @G. Grothendieck suggested in the comments

df1 %>% 
  group_by(Category) %>% 
  group_modify(na.aggregate) %>%
  ungroup


Or use data.table

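library(data.table)
# names of the columns to fill: column1, column2, ...
nm1 <- grep("^column\\d+$", names(df1), value = TRUE)
# setDT converts df1 to a data.table by reference; := then updates the columns in place per Category
setDT(df1)[, (nm1) := na.aggregate(.SD), by = Category, .SDcols = nm1]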


Or use base R

unsplit(lapply(split(df1, df1$Category), na.aggregate), df1$Category)
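If zoo is not available, the same grouped fill can be sketched with ave() from base R alone. This is an illustration under stated assumptions rather than part of the original answer: the data frame is rebuilt from the table in the question, and fill_mean, cols and filled are hypothetical names introduced here.

```r
# Recreate the question's data (values copied from the table in the question)
df1 <- data.frame(
  PID      = c(123, 324, 356, 378, 395, 362, 789, 759, 762, 741),
  Category = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  column1  = c(54, 52, NA, 56, NA, 45, 65, 66, NA, 69),
  column2  = c(2.4, NA, 3.6, 3.2, 3.5, NA, 12.6, NA, NA, 8.5),
  column3  = c(NA, 21.1, 25.6, NA, 29.9, 24.3, 23.8, 26.8, 27.2, 23.3)
)

# Hypothetical helper: replace NAs with the mean of the non-missing values
fill_mean <- function(x) replace(x, is.na(x), mean(x, na.rm = TRUE))

# Columns to fill: every column whose name starts with "column"
cols <- grep("^column", names(df1), value = TRUE)

# ave() applies fill_mean within each Category and keeps the original row order
filled <- df1
filled[cols] <- lapply(df1[cols], function(col) ave(col, df1$Category, FUN = fill_mean))
filled
```

Values that were not missing are left unchanged; only the NAs are replaced by the mean of their own Category.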
