How to remove outliers from a dataset

Views: 91 Published: 2021/6/30 19:50:17 Tags: r statistics outliers

Question

I've got some multivariate data of beauty vs. age. The ages range from 20 to 40 at intervals of 2 (20, 22, 24, ..., 40), and each record has an age and a beauty rating from 1 to 5. When I draw boxplots of this data (ages across the X-axis, beauty ratings across the Y-axis), some outliers are plotted outside the whiskers of each box.

I want to remove these outliers from the data frame itself, but I'm not sure how R calculates outliers for its box plots. Below is an example of what my data might look like.

Recommended answer

OK, you can apply something like this to your dataset. Do not overwrite and save the original, or you'll destroy your data! And, by the way, you should (almost) never remove outliers from your data:

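remove_outliers <- function(x, na.rm = TRUE, ...) {
  # First and third quartiles; extra arguments are passed on to quantile()
  qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)
  # Fence distance: 1.5 times the interquartile range
  H <- 1.5 * IQR(x, na.rm = na.rm)
  y <- x
  # Replace values beyond the fences with NA instead of dropping them
  y[x < (qnt[1] - H)] <- NA
  y[x > (qnt[2] + H)] <- NA
  y
}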
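
The question asks about a data frame with one box per age, so the function would need to run within each age group. One possible way to do that is base R's ave(), which applies a function per group and returns a vector aligned with the original rows. The sketch below is only an illustration: the data frame df, its columns age and beauty, and the deliberately constructed ratings are all made up here, not taken from the asker's data. The result goes into a new object, so the original stays intact.

```r
remove_outliers <- function(x, na.rm = TRUE, ...) {
  qnt <- quantile(x, probs = c(.25, .75), na.rm = na.rm, ...)
  H <- 1.5 * IQR(x, na.rm = na.rm)
  y <- x
  y[x < (qnt[1] - H)] <- NA
  y[x > (qnt[2] + H)] <- NA
  y
}

# Hypothetical data: 11 ages (20, 22, ..., 40), 9 ratings per age.
# Within each age the quartiles are 3 and 4, so the single rating of 1
# falls below the lower fence (3 - 1.5 * 1 = 1.5).
df <- data.frame(
  age    = rep(seq(20, 40, by = 2), each = 9),
  beauty = rep(c(3, 3, 3, 3, 4, 4, 4, 4, 1), times = 11)
)

# Apply remove_outliers() separately within each age group
flagged <- ave(df$beauty, df$age, FUN = remove_outliers)

# Keep only rows that were not flagged; df itself is left unchanged
clean <- df[!is.na(flagged), ]

nrow(df)     # 99
nrow(clean)  # 88
```

Note that rows whose rating was already NA in the input would be dropped by the last step as well, so that case may need separate handling.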

To see it in action:

set.seed(1)
x <- rnorm(100)
x <- c(-10, x, 10)
y <- remove_outliers(x)
## png()
par(mfrow = c(1, 2))
boxplot(x)
boxplot(y)
## dev.off()
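
As for how R itself decides which points to draw as outliers: boxplot() relies on boxplot.stats(), which by default flags points lying more than coef = 1.5 times the box length beyond the hinges. The hinges come from fivenum() and can differ slightly from the quartiles returned by quantile(), so the two approaches may occasionally disagree on borderline points. A minimal sketch for checking what boxplot() would flag, using the same kind of vector as the demo (the names out and y2 are just illustrative):

```r
set.seed(1)
x <- c(-10, rnorm(100), 10)

# Values that boxplot() would draw as individual points
out <- boxplot.stats(x)$out

# Copy of x with those values set to NA; x itself is untouched
y2 <- x
y2[x %in% out] <- NA
```

Comparing y2 with the result of remove_outliers(x) shows whether the hinge-based and quantile-based rules agree for a given dataset.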

And once again, you should never do this on your own; outliers are just meant to be! =)

I added na.rm = TRUE as the default.

Removed quantile function, added subscripting, hence made the function faster! =)
