Replacing outliers with NA in all columns of a dataset in R

Question

I'm a beginner with R and can't manage to replace the outliers with NA in ALL columns of a dataset. I succeeded in doing it one column at a time with

dataset$column[dataset$column %in% boxplot.stats(dataset$column)$out] <- NA

But I have 21 columns in which I need to replace the outliers with NA.

How would you do it?

How would you do it for a range of columns? For specific columns?

Recommended answer

You can use apply over the columns. Example:

set.seed(1)
x <- matrix(rnorm(20), ncol = 2)
# Plant one obvious outlier in each column
x[2, 1] <- 100
x[4, 2] <- 200

# MARGIN = 2 applies the function to each column
apply(x, 2, function(col) { col[col %in% boxplot(col, plot = FALSE)$out] <- NA; col })

            [,1]        [,2]
 [1,] -0.6264538  1.51178117
 [2,]         NA  0.38984324
 [3,] -0.8356286 -0.62124058
 [4,]  1.5952808          NA
 [5,]  0.3295078  1.12493092
 [6,] -0.8204684 -0.04493361
 [7,]  0.4874291 -0.01619026
 [8,]  0.7383247  0.94383621
 [9,]  0.5757814  0.82122120
[10,] -0.3053884  0.59390132
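
The dataset in the question is probably a data frame rather than a matrix, and apply would coerce a data frame with mixed column types to a character matrix. Below is a minimal sketch of the same idea using lapply, covering all numeric columns, a column range, and specific columns. The helper name na_outliers, the example data, and the column names a, b, c, and id are assumptions made up for illustration; they are not from the original question.

```r
# Hypothetical helper: set boxplot outliers of a numeric vector to NA.
na_outliers <- function(v) {
  v[v %in% boxplot.stats(v)$out] <- NA
  v
}

# Made-up example data with one planted outlier per numeric column.
set.seed(1)
dataset <- data.frame(a = rnorm(10), b = rnorm(10), c = rnorm(10),
                      id = letters[1:10])
dataset$a[2] <- 100
dataset$b[4] <- 200
dataset$c[6] <- -300

# 1) All numeric columns (the non-numeric id column is left untouched).
all_cols <- dataset
num <- vapply(all_cols, is.numeric, logical(1))
all_cols[num] <- lapply(all_cols[num], na_outliers)

# 2) A column range, selected by position (here columns 1 to 2).
range_cols <- dataset
range_cols[1:2] <- lapply(range_cols[1:2], na_outliers)

# 3) Specific columns, selected by name.
named_cols <- dataset
named_cols[c("a", "c")] <- lapply(named_cols[c("a", "c")], na_outliers)
```

Assigning back into dataset[cols] keeps the object a data frame and leaves the other columns unchanged, so for the 21 columns in the question only the index vector would need to change.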
