How do I detect outliers using only Elliptic Envelope in Python?
Problem description
I looked at the scikit-learn documentation, but its outlier detection example compares several anomaly detection algorithms at once. I am looking for Python code that uses Elliptic Envelope only.
Solution
The scikit-learn documentation has a few examples showing how to use it. I took the outlier detection example that compares several algorithms and adapted it to use only Elliptic Envelope.
You should be able to take this example and apply it to your own data.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import matplotlib.font_manager
from matplotlib.lines import Line2D
from sklearn.covariance import EllipticEnvelope

# Fixed seed so the generated data is reproducible
np.random.seed(42)

# Example settings
n_samples = 200
outliers_fraction = 0.25
clusters_separation = [0, 1, 2]

# Settings for evaluation
xx, yy = np.meshgrid(np.linspace(-7, 7, 100), np.linspace(-7, 7, 100))
n_inliers = int((1. - outliers_fraction) * n_samples)
n_outliers = int(outliers_fraction * n_samples)
# The last n_outliers rows of X are the true outliers (label -1)
ground_truth = np.ones(n_samples, dtype=int)
ground_truth[-n_outliers:] = -1

for i, offset in enumerate(clusters_separation):
    # Data generation: two Gaussian clusters placed symmetrically around the origin
    X1 = 0.3 * np.random.randn(n_inliers // 2, 2) - offset
    X2 = 0.3 * np.random.randn(n_inliers // 2, 2) + offset
    X = np.r_[X1, X2]
    # Add outliers
    X = np.r_[X, np.random.uniform(low=-6, high=6, size=(n_outliers, 2))]

    # Model
    clf = EllipticEnvelope(contamination=outliers_fraction)
    # Fit the model
    clf.fit(X)
    scores_pred = clf.decision_function(X)
    y_pred = clf.predict(X)
    threshold = stats.scoreatpercentile(scores_pred, 100 * outliers_fraction)
    n_errors = (y_pred != ground_truth).sum()

    # Plot the level lines and the points (one figure per cluster separation)
    plt.figure(figsize=(9, 7))
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7),
                 cmap=plt.cm.Blues_r)
    plt.contour(xx, yy, Z, levels=[threshold],
                linewidths=2, colors='red')
    plt.contourf(xx, yy, Z, levels=[threshold, Z.max()],
                 colors='orange')
    b = plt.scatter(X[:-n_outliers, 0], X[:-n_outliers, 1], c='white',
                    s=20, edgecolor='k')
    c = plt.scatter(X[-n_outliers:, 0], X[-n_outliers:, 1], c='black',
                    s=20, edgecolor='k')
    # Proxy artist for the red boundary line in the legend; the original
    # a.collections[0] no longer works in recent Matplotlib releases
    a = Line2D([], [], color='red', linewidth=2)
    plt.axis('tight')
    plt.legend(
        [a, b, c],
        ['learned decision function', 'true inliers', 'true outliers'],
        prop=matplotlib.font_manager.FontProperties(size=10),
        loc='lower right')
    plt.xlabel("%d. %s (errors: %d)" % (i + 1, 'Elliptic Envelope', n_errors))
    plt.xlim((-7, 7))
    plt.ylim((-7, 7))
    plt.suptitle("Outlier detection via Elliptic Envelope")

plt.show()
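If you only need the outlier labels and not the plots, the same estimator can be used in a few lines. The following is a minimal sketch, not part of the original answer: the data, the contamination value of 0.05, and the variable names (X_clean, X_flagged) are assumptions made for illustration, so substitute your own array and expected outlier fraction.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

rng = np.random.RandomState(0)

# Hypothetical data: 190 Gaussian inliers plus 10 uniformly scattered points
inliers = 0.5 * rng.randn(190, 2)
scattered = rng.uniform(low=-6, high=6, size=(10, 2))
X = np.r_[inliers, scattered]

# contamination is the fraction of points you expect to be outliers
# (0.05 is an assumed value chosen for this illustration)
clf = EllipticEnvelope(contamination=0.05, random_state=0)

# fit_predict returns +1 for inliers and -1 for outliers
labels = clf.fit_predict(X)

# Split the data using the predicted labels
outlier_mask = labels == -1
X_clean = X[~outlier_mask]
X_flagged = X[outlier_mask]

print("flagged", X_flagged.shape[0], "of", X.shape[0], "points as outliers")
```

Elliptic Envelope assumes the inliers are roughly Gaussian, so on strongly non-Gaussian or multi-modal data the flagged points may not match what you would consider outliers.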