How to fill NAs in each column with the group mean in R?
Question
I have a dataset with several NAs. For each column, I want to compute the mean within each group (Category) and use it to fill the NAs in that group. My dataset looks like this:
PID Category column1 column2 column3
123 1 54 2.4 NA
324 1 52 NA 21.1
356 1 NA 3.6 25.6
378 2 56 3.2 NA
395 2 NA 3.5 29.9
362 2 45 NA 24.3
789 3 65 12.6 23.8
759 3 66 NA 26.8
762 3 NA NA 27.2
741 3 69 8.5 23.3
The desired output is:
PID Category column1 column2 column3
123 1 54 2.4 23.3
324 1 52 3.0 21.1
356 1 53 3.6 25.6
378 2 56 3.2 27.1
395 2 50.5 3.5 29.9
362 2 45 3.3 24.3
789 3 65 12.6 23.8
759 3 66 10.5 26.8
762 3 66.6 10.5 27.2
741 3 69 8.5 23.3
Thanks
Recommended answer
We can use na.aggregate from zoo. By default, it replaces each NA with the mean of the column concerned, so applying it after grouping by Category fills each NA with the mean of its group.
library(dplyr)
library(zoo)
# Within each Category, replace NAs in every column whose name starts with "column"
df1 %>%
  group_by(Category) %>%
  mutate(across(starts_with('column'), na.aggregate)) %>%
  ungroup()
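To try the pipeline above end to end, the data first has to exist as df1. The following is a minimal sketch that rebuilds df1 from the table in the question (the data.frame() call and the name filled are assumptions for illustration, not part of the original answer) and then runs the same grouped na.aggregate step.

```r
library(dplyr)
library(zoo)

# Example data copied from the table in the question
df1 <- data.frame(
  PID      = c(123, 324, 356, 378, 395, 362, 789, 759, 762, 741),
  Category = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  column1  = c(54, 52, NA, 56, NA, 45, 65, 66, NA, 69),
  column2  = c(2.4, NA, 3.6, 3.2, 3.5, NA, 12.6, NA, NA, 8.5),
  column3  = c(NA, 21.1, 25.6, NA, 29.9, 24.3, 23.8, 26.8, 27.2, 23.3)
)

# Fill NAs with the mean of the non-missing values in the same Category
filled <- df1 %>%
  group_by(Category) %>%
  mutate(across(starts_with("column"), na.aggregate)) %>%
  ungroup()

# PID 356 had NA in column1; its group mean is (54 + 52) / 2
filter(filled, PID == 356)
```

Values that were not NA are left unchanged, and the result is not rounded, so the filled means may show more decimals than the desired output in the question.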
Or use group_modify with na.aggregate, as @G. Grothendieck suggested in the comments:
df1 %>%
  group_by(Category) %>%
  group_modify(na.aggregate) %>%
  ungroup()
Or use data.table:
library(data.table)
nm1 <- grep("^column\\d+$", names(df1), value = TRUE)
setDT(df1)[, (nm1) := na.aggregate(.SD), by = Category, .SDcols = nm1]
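One thing to keep in mind with the data.table version is that setDT() and := work by reference, so df1 itself is converted and overwritten. If the original data frame should stay untouched, a hedged variant is to work on a copy. In the sketch below, the name dt, the use of as.data.table() and the lapply(.SD, ...) form are choices made for illustration rather than part of the original answer, and the columns are built as doubles so the group means can be assigned without a type mismatch.

```r
library(data.table)
library(zoo)

# Example data copied from the table in the question (numeric columns stored as doubles)
df1 <- data.frame(
  PID      = c(123, 324, 356, 378, 395, 362, 789, 759, 762, 741),
  Category = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  column1  = c(54, 52, NA, 56, NA, 45, 65, 66, NA, 69),
  column2  = c(2.4, NA, 3.6, 3.2, 3.5, NA, 12.6, NA, NA, 8.5),
  column3  = c(NA, 21.1, 25.6, NA, 29.9, 24.3, 23.8, 26.8, 27.2, 23.3)
)

nm1 <- grep("^column\\d+$", names(df1), value = TRUE)

# as.data.table() returns a copy, so df1 keeps its NAs
dt <- as.data.table(df1)

# Fill each selected column with its per-Category mean, updating dt in place
dt[, (nm1) := lapply(.SD, na.aggregate), by = Category, .SDcols = nm1]
```

After this runs, dt holds the filled values while df1 is still the original data frame.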
Or use split and unsplit from base R (na.aggregate still comes from zoo):
unsplit(lapply(split(df1, df1$Category), na.aggregate), df1$Category)
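If zoo is not available at all, the same per-group mean fill can be approximated with ave() from base R. This is a swapped-in technique rather than the answer's method, and the helper name fill_group_mean is hypothetical. It is a minimal sketch that assumes every group has at least one non-missing value per column; a group that is entirely NA would be filled with NaN.

```r
# Example data copied from the table in the question
df1 <- data.frame(
  PID      = c(123, 324, 356, 378, 395, 362, 789, 759, 762, 741),
  Category = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  column1  = c(54, 52, NA, 56, NA, 45, 65, 66, NA, 69),
  column2  = c(2.4, NA, 3.6, 3.2, 3.5, NA, 12.6, NA, NA, 8.5),
  column3  = c(NA, 21.1, 25.6, NA, 29.9, 24.3, 23.8, 26.8, 27.2, 23.3)
)

# Hypothetical helper: replace NAs in x with the mean of x within each level of g
fill_group_mean <- function(x, g) {
  ave(x, g, FUN = function(v) replace(v, is.na(v), mean(v, na.rm = TRUE)))
}

cols <- grep("^column", names(df1), value = TRUE)

# Work on a copy so df1 keeps its NAs
filled <- df1
filled[cols] <- lapply(df1[cols], fill_group_mean, g = df1$Category)
```

Because ave() returns values in the original row order, no unsplit step is needed here.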