Finding and removing outliers in PHP


Question

Suppose I sample a selection of database records that return the following numbers:

20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77

Is there an algorithm that can be efficiently implemented in PHP to find the outliers (if there are any) from an array of floats based on how far they deviate from the mean?

Recommended answer

OK, let's assume you have your data points in an array like so:

<?php $dataset = array(20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77); ?>

Then you can use the following function (see the comments for what is happening) to remove all numbers that fall outside the mean +/- the standard deviation multiplied by a magnitude you set (defaults to 1):

<?php

function remove_outliers($dataset, $magnitude = 1) {

  $count = count($dataset);
  if ($count === 0) {
    return array(); // Nothing to filter; this also avoids a division by zero
  }

  $mean = array_sum($dataset) / $count; // Calculate the mean

  // Calculate the (population) standard deviation and multiply it by the magnitude
  $squares = array_map("sd_square", $dataset, array_fill(0, $count, $mean));
  $deviation = sqrt(array_sum($squares) / $count) * $magnitude;

  // Return the values that lie within $mean +/- $deviation (array_filter preserves the original keys)
  return array_filter($dataset, function($x) use ($mean, $deviation) {
    return ($x <= $mean + $deviation && $x >= $mean - $deviation);
  });
}

// Squared difference between a value and the mean
function sd_square($x, $mean) {
  return pow($x - $mean, 2);
}

?>

For your example this function returns the following with a magnitude of 1:

Array
(
    [1] => 80.3
    [2] => 70.95
    [5] => 85.56
    [6] => 69.77
)
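
The result keeps the original array keys (1, 2, 5, 6) because array_filter preserves them. As a minimal sketch of how the function might be used in practice, the snippet below reindexes the result with array_values() and tries a larger magnitude. It repeats the same logic as remove_outliers() so that it runs on its own; the variable names $kept, $reindexed and $kept_wide are only illustrative.

```php
<?php
// Same logic as remove_outliers() in the answer, repeated so this sketch runs on its own.
function remove_outliers($dataset, $magnitude = 1) {
  $count = count($dataset);
  $mean = array_sum($dataset) / $count;
  $squares = array_map(function($x) use ($mean) { return pow($x - $mean, 2); }, $dataset);
  $deviation = sqrt(array_sum($squares) / $count) * $magnitude;
  return array_filter($dataset, function($x) use ($mean, $deviation) {
    return ($x <= $mean + $deviation && $x >= $mean - $deviation);
  });
}

$dataset = array(20.50, 80.30, 70.95, 15.25, 99.97, 85.56, 69.77);

// Magnitude 1: the four values listed above survive, under their original keys.
$kept = remove_outliers($dataset);

// array_values() reindexes the result from 0 if the gaps in the keys are unwanted.
$reindexed = array_values($kept);

// Magnitude 2: the accepted band is twice as wide, so fewer values count as outliers.
$kept_wide = remove_outliers($dataset, 2);

print_r($reindexed);
echo count($kept_wide), " values kept with a magnitude of 2\n";
```

For this dataset the mean is about 63.19 and the standard deviation about 30.17, so a magnitude of 1 accepts roughly 33.02 to 93.35, while a magnitude of 2 accepts roughly 2.85 to 123.52 and therefore keeps all seven values.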
