R: How to detect and fix abnormal values on a plot?


Question

I tried to use AnomalyDetectionTs() from library(AnomalyDetection) (https://github.com/twitter/AnomalyDetection and https://www.r-bloggers.com/anomaly-detection-in-r/) on my data. In my example data, some values swing sharply instead of dropping gradually like the rest of the pattern, so they deviate from it more than they should. This function doesn't work for me: all the points it flags as anomalies are correct, normal values.

Here is the result of the function:

My example data: https://raw.githubusercontent.com/ieatbaozi/R-Practicing/master/example.csv

df <- read.csv(url("https://raw.githubusercontent.com/ieatbaozi/R-Practicing/master/example.csv"),header = TRUE,stringsAsFactors = FALSE)
df$DateTime <- as.POSIXct(df$DateTime)

library(AnomalyDetection)
ADtest <- AnomalyDetectionTs(df, max_anoms=0.1, direction='both', plot=TRUE)
ADtest$plot

Here is my expected result. How can I detect those abnormal data points?

How can I fix those values by filling in the most appropriate ones? I want to smooth them so that they plot close to the pattern around them, while the total of all the data stays the same after the fix.

An extra question: do you have any idea how to find the pattern? I can give you more information. Thank you so much for your help.

Recommended answer

Here is a possible solution.

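  1. Compute the average value over a small window around each point (a rolling mean).
  2. Compute the difference between the actual value and the local average.
  3. Compute the standard deviation of all the differences from step 2.
  4. Flag as outliers those points that are more than X standard deviations from the local mean.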

Using this method, I got the points that you are looking for, together with a few others: points that sit in the transition from the very low values to the very high values. You may be able to filter those out.
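One way to filter out the transition points could look like the sketch below. It rests on an assumption that is not part of the original answer: a genuine outlier is an isolated spike that differs from both of its neighbours in the same direction, whereas a transition point differs strongly from only one neighbour. The helper name `keep_spikes`, the tolerance `tol`, and the toy data are all hypothetical.

```r
## Hypothetical helper: keep only flagged indices that look like isolated
## spikes, i.e. the point differs from BOTH neighbours by more than `tol`
## and in the same direction.
keep_spikes <- function(x, flagged, tol) {
  ## Both neighbours must exist, so drop the first and last positions
  flagged <- flagged[flagged > 1 & flagged < length(x)]
  d_prev <- x[flagged] - x[flagged - 1]
  d_next <- x[flagged] - x[flagged + 1]
  same_side <- sign(d_prev) == sign(d_next)
  flagged[same_side & pmin(abs(d_prev), abs(d_next)) > tol]
}

## Toy series: a spike at position 3 and a step from ~10 to ~100 at position 6
x <- c(10, 11, 60, 10, 11, 100, 101, 99, 100, 101)
flagged <- c(3, 5, 6)   # pretend the detector flagged these indices
keep_spikes(x, flagged, tol = 20)   # returns 3
```

Positions 5 and 6 are dropped because each is close to one of its neighbours. Outliers that span several consecutive points would also be dropped by this rule, so it only suits data where the anomalies are single points.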

Code

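library(zoo)        ## For the rolling mean function

WindowSize <- 5
HalfWidth <- (WindowSize - 1) / 2

## rollmean() is centred and returns WindowSize - 1 fewer values, so drop
## the first and last HalfWidth points of Val to align the two vectors
Trimmed <- df$Val[-c(1:HalfWidth, nrow(df) + 1 - (1:HalfWidth))]
Diff <- rollmean(df$Val, WindowSize) - Trimmed

## Root mean square of the differences, used as the standard deviation
SD <- sqrt(mean(Diff^2))
## Add HalfWidth to map positions in Diff back to row numbers in df
Out <- which(abs(Diff) > 2.95 * SD) + HalfWidth

plot(df, type = "l")
points(df[Out, ], pch = 16, col = "red")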
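
The question also asks how to replace the flagged values while keeping the total unchanged, which the code above does not cover. A minimal sketch of one possible approach follows; it is an assumption, not part of the original answer. It replaces each flagged point by linear interpolation between its unflagged neighbours using base R's approx(), then rescales the whole series so the sum matches the original. The helper name `repair_outliers` and the toy data are hypothetical, and the rescaling assumes the repaired series has a non-zero sum.

```r
## Hypothetical helper: interpolate over flagged points, then rescale the
## whole series so that its total equals the total of the input.
repair_outliers <- function(x, out) {
  idx <- seq_along(x)
  keep <- setdiff(idx, out)
  ## Linear interpolation from the unflagged points only;
  ## rule = 2 reuses the nearest unflagged value at either end
  fixed <- approx(keep, x[keep], xout = idx, rule = 2)$y
  stopifnot(sum(fixed) != 0)   # assumption: rescaling needs a non-zero total
  fixed * sum(x) / sum(fixed)
}

## Toy series with a high outlier at position 3 and a low one at position 6
x <- c(10, 12, 50, 14, 16, 1, 20)
repaired <- repair_outliers(x, out = c(3, 6))
sum(repaired)   # equals sum(x) up to floating-point rounding
```

With the data from the question, this could be applied as `df$Val <- repair_outliers(df$Val, Out)`. Multiplicative rescaling shifts every point slightly; spreading the difference over only the neighbours of each outlier is another option if the unflagged values must stay untouched.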
