How do I detect outliers using only Elliptic Envelope in Python?


Problem description


I looked at the scikit-learn documentation, but its examples use multiple algorithms for anomaly detection. I am looking for Python code that uses Elliptic Envelope only.

Solution

The scikit-learn documentation has a few examples of how to use it. You can follow the outlier detection example there; I went ahead and adapted that example so it uses only Elliptic Envelope.

You should be more than capable of taking the example and applying it elsewhere.

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib.font_manager


from sklearn.covariance import EllipticEnvelope

# Seed NumPy's global generator so the generated data is reproducible
np.random.seed(42)

# Example settings
n_samples = 200
outliers_fraction = 0.25
clusters_separation = [0, 1, 2]

# Settings for evaluation
xx, yy = np.meshgrid(np.linspace(-7, 7, 100), np.linspace(-7, 7, 100))
n_inliers = int((1. - outliers_fraction) * n_samples)
n_outliers = int(outliers_fraction * n_samples)
ground_truth = np.ones(n_samples, dtype=int)
ground_truth[-n_outliers:] = -1


for i, offset in enumerate(clusters_separation):
    # Data generation
    X1 = 0.3 * np.random.randn(n_inliers // 2, 2) - offset
    X2 = 0.3 * np.random.randn(n_inliers // 2, 2) + offset
    X = np.r_[X1, X2]

    # Add outliers
    X = np.r_[X, np.random.uniform(low=-6, high=6, size=(n_outliers, 2))]

    # Model
    clf = EllipticEnvelope(contamination=outliers_fraction)

    # One figure per cluster separation
    plt.figure(figsize=(9, 7))

    # Fit the model, then score and label the training points
    clf.fit(X)
    scores_pred = clf.decision_function(X)
    y_pred = clf.predict(X)
    threshold = stats.scoreatpercentile(scores_pred, 100 * outliers_fraction)
    n_errors = (y_pred != ground_truth).sum()

    # Plot the decision-function contour levels and the data points
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)

    plt.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7),
                 cmap=plt.cm.Blues_r)
    a = plt.contour(xx, yy, Z, levels=[threshold],
                    linewidths=2, colors='red')
    plt.contourf(xx, yy, Z, levels=[threshold, Z.max()],
                 colors='orange')
    b = plt.scatter(X[:-n_outliers, 0], X[:-n_outliers, 1], c='white',
                    s=20, edgecolor='k')
    c = plt.scatter(X[-n_outliers:, 0], X[-n_outliers:, 1], c='black',
                    s=20, edgecolor='k')
    plt.axis('tight')
    # legend_elements() returns (artists, labels); it is used here instead of
    # ContourSet.collections, which newer Matplotlib releases no longer provide
    plt.legend(
        [a.legend_elements()[0][0], b, c],
        ['learned decision function', 'true inliers', 'true outliers'],
        prop=matplotlib.font_manager.FontProperties(size=10),
        loc='lower right')
    plt.xlabel("%d. %s (errors: %d)" % (i + 1, 'Elliptic Envelope', n_errors))
    plt.xlim((-7, 7))
    plt.ylim((-7, 7))
    plt.suptitle("Outlier detection via Elliptic Envelope")

plt.show()
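
If you only need the outlier labels and not the plots, the same estimator can be used in a few lines. The following is a minimal sketch, not part of the original example: the helper name flag_outliers and the synthetic X_demo data are made up for illustration, and it assumes a reasonably recent scikit-learn in which decision_function is already shifted so that negative values correspond to outliers.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope


def flag_outliers(X, contamination=0.1, random_state=0):
    """Fit an EllipticEnvelope on X and return (labels, scores).

    labels: +1 for inliers, -1 for outliers (scikit-learn convention).
    scores: decision_function values; negative values are outliers.
    flag_outliers is a hypothetical helper, not a scikit-learn function.
    """
    model = EllipticEnvelope(contamination=contamination,
                             random_state=random_state)
    labels = model.fit_predict(X)
    scores = model.decision_function(X)
    return labels, scores


# Hypothetical data: one Gaussian blob plus five far-away points
rng = np.random.RandomState(0)
inliers = 0.5 * rng.randn(95, 2)
far_points = np.array([[6.0, 6.0], [-6.0, 5.0], [5.0, -6.0],
                       [-6.0, -6.0], [7.0, 0.0]])
X_demo = np.vstack([inliers, far_points])

# contamination is the expected fraction of outliers (5 of 100 rows here)
labels, scores = flag_outliers(X_demo, contamination=0.05)
outlier_rows = X_demo[labels == -1]
print(outlier_rows)
```

On your own data, replace X_demo with a 2-D array of shape (n_samples, n_features) and set contamination to your best estimate of the outlier fraction. Elliptic Envelope assumes the inliers are roughly Gaussian, so it tends to fit poorly when the inliers form several separated clusters, as in the larger offsets of the example above.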
