Return the furthermost outlier in kmeans clustering?


Question

Is there an easy way to return the furthermost outlier after sklearn kmeans clustering?

Essentially, I want to build a list of the biggest outliers across a number of clusters. Unfortunately, I have to use sklearn.cluster.KMeans because the assignment requires it.

Recommended answer

K-means is not well suited for "outlier" detection.

K-means tends to put an outlier into a one-element cluster. That outlier then has the smallest possible distance to its cluster center (zero, since it is the center), so ranking points by distance will not detect it.
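If you are stuck with sklearn.cluster.KMeans, the obvious approach is to rank points by distance to their assigned centroid. The following is only a rough sketch on made-up data (the blob, the planted point at (100, 100), and the variable names are all assumptions for illustration), but it shows both the approach and why it can miss exactly the point you care about:

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data (assumption): one tight blob plus a single extreme point.
rng = np.random.default_rng(0)
blob = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
outlier = np.array([[100.0, 100.0]])
X = np.vstack([blob, outlier])
outlier_idx = len(X) - 1

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Euclidean distance of every point to the centroid of its own cluster.
dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)

# Index of the furthest member of each cluster.
furthest = {}
for c in range(km.n_clusters):
    members = np.flatnonzero(km.labels_ == c)
    furthest[c] = int(members[np.argmax(dist[members])])

print("furthest point per cluster:", furthest)
print("size of the planted outlier's cluster:",
      int(np.sum(km.labels_ == km.labels_[outlier_idx])))
print("planted outlier's distance to its centroid:", dist[outlier_idx])
```

On data like this the planted point should end up alone in its cluster, so its distance to its centroid is essentially zero and a global ranking by `dist` would put it last rather than first. Checking cluster sizes for tiny clusters is one way to catch that case.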

K-means is not robust when there are outliers in your data. You may actually want to remove outliers before running k-means.

Use something like kNN, LOF or LoOP instead.
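For LOF, scikit-learn ships sklearn.neighbors.LocalOutlierFactor. A minimal sketch, assuming synthetic data with one planted point at (8, 8) and an arbitrary choice of n_neighbors=20 (neither comes from the question):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data (assumption): 100 points around the origin plus one far point.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(100, 2)),
    [[8.0, 8.0]],
])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)  # -1 marks outliers, 1 marks inliers

# negative_outlier_factor_ is the negated LOF score; flip the sign so that
# larger values mean "more outlying".
scores = -lof.negative_outlier_factor_

# Indices of the five most outlying points, most extreme first.
top = np.argsort(scores)[::-1][:5]
print("most outlying indices:", top)
print("their LOF scores:", scores[top])
```

The sorted `scores` give the ranked list of biggest outliers the question asks for, without depending on how k-means happened to partition the data.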
