How do I replace all NA with mean in R?


Question

I have over 1500 columns in my dataset, and 100+ of them contain at least one NA. I know I can replace the NAs in a single column with

d$var[is.na(d$var)] <- mean(d$var, na.rm=TRUE)

but how do I do this for ALL the NAs in my dataset?

Thanks!

Recommended answer

We can use na.aggregate from the zoo package. Loop over the columns of the dataset with lapply (assuming all the columns are numeric), apply na.aggregate to each column to replace its NAs with that column's mean (the default), and assign the result back to the dataset. Assigning to df[] rather than df keeps the object a data frame instead of turning it into a plain list.

library(zoo)
df[] <- lapply(df, na.aggregate)
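For readers who want to see the effect on a small input without installing zoo, here is a minimal base-R sketch of the same idea. The data frame values and the helper name impute_mean are made up for illustration; for all-numeric columns it should give the same result as na.aggregate with its default arguments.

```r
# Toy data frame (hypothetical values) with NAs in two numeric columns
df <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 8), c = c(5, 6, 7))

# Hypothetical helper: replace the NAs in a numeric vector with the
# mean of its non-missing values
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

# df[] <- ... replaces every column but keeps the data.frame structure
df[] <- lapply(df, impute_mean)
print(df)
```

Column a gets 2 (the mean of 1 and 3) in place of its NA, column b gets 6 (the mean of 4 and 8), and column c is left unchanged because it has no missing values.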

By default, the FUN argument of na.aggregate is mean, as the signature of the default S3 method shows:

na.aggregate(object, by = 1, ..., FUN = mean, na.rm = FALSE, maxgap = Inf)

To do this non-destructively, leaving the original df untouched:

df2 <- df
df2[] <- lapply(df2, na.aggregate)

Or in one line:

df2 <- replace(df, TRUE, lapply(df, na.aggregate))

If there are non-numeric columns, do this only for the numeric columns by creating a logical index first:

ok <- sapply(df, is.numeric)
df[ok] <- lapply(df[ok], na.aggregate)
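To illustrate the logical index on mixed column types, here is a small base-R sketch. The data frame and its column names are hypothetical, and the mean replacement is written out with replace() rather than na.aggregate so that it runs without zoo; vapply is used instead of sapply only so that the index is guaranteed to be a logical vector.

```r
# Hypothetical data frame with one character and one numeric column
df <- data.frame(
  id = c("x", "y", "z"),
  score = c(10, NA, 30),
  stringsAsFactors = FALSE
)

# TRUE for numeric columns, FALSE otherwise
ok <- vapply(df, is.numeric, logical(1))

# Impute only the numeric columns; the other columns are not touched
df[ok] <- lapply(df[ok], function(x) replace(x, is.na(x), mean(x, na.rm = TRUE)))
print(df)
```

Here ok is FALSE for id and TRUE for score, so only score is imputed (its NA becomes 20) and the character column passes through unchanged.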
