Getting probability of each new observation being an outlier when using scikit-learn OneClassSVM

Question

I'm new to scikit-learn, and to SVM methods in general. I've got my data set working well with scikit-learn's OneClassSVM for detecting outliers: I train the OneClassSVM on observations that are all 'inliers', and then use predict() to generate binary inlier/outlier predictions on my test set.
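For reference, the workflow described above might look roughly like the following minimal sketch. The synthetic data, the variable names, and the `nu` and `gamma` settings are illustrative assumptions, not values from the original setup.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Assumption: synthetic stand-in data. Training observations are all
# inliers clustered around the origin.
X_train = 0.3 * rng.randn(200, 2)

# Test set: 20 inlier-like points followed by 20 points spread widely.
X_test = np.vstack([
    0.3 * rng.randn(20, 2),
    rng.uniform(low=-4, high=4, size=(20, 2)),
])

# Hyperparameters here are illustrative, not tuned.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
clf.fit(X_train)

# predict() returns +1 for inliers and -1 for outliers.
labels = clf.predict(X_test)
```

The output of `predict()` is strictly binary, which is what motivates the rest of the question.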

However, to take my analysis further I'd like to get a probability for each new observation in my test set, i.e. the probability that the observation is an outlier. I've noticed that other classification methods in scikit-learn let you pass the parameter probability=True to compute this, but OneClassSVM does not. Is there an easy way to get these results?

Recommended answer

I was searching for an answer to this same question when I came across this page. After being stuck for some time, I went back to check the original LIBSVM package, since scikit-learn's OneClassSVM is based on the LIBSVM implementation, as stated here.

On the LIBSVM main page, the option '-b', which activates probability output scores for some SVM variants, is documented as follows:

-b probability_estimates: whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)

In other words, the one-class SVM, which is neither an SVC nor an SVR, has no implementation of probability estimation.
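The same asymmetry can be seen from the scikit-learn side. The sketch below simply lists each estimator's constructor parameters with `get_params()` and checks whether `probability` is among them; the variable names are illustrative.

```python
from sklearn.svm import SVC, OneClassSVM

# get_params() returns the constructor parameters of an estimator.
svc_params = SVC().get_params()
ocsvm_params = OneClassSVM().get_params()

# SVC exposes a `probability` switch; OneClassSVM does not.
svc_has_probability = "probability" in svc_params
ocsvm_has_probability = "probability" in ocsvm_params

print("SVC has probability parameter:", svc_has_probability)
print("OneClassSVM has probability parameter:", ocsvm_has_probability)
```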

If I try to force this option (i.e. -b) through the LIBSVM command-line interface, for example:

./svm-train -s 2 -t 2 -b 1 heart_scale

I get the following error message:

ERROR: one-class SVM probability output not supported yet

In summary, this much-desired output is not yet supported by LIBSVM, and so scikit-learn does not offer it for the moment. I hope that they enable this functionality in the near future and that this thread gets updated when they do.
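Although no probability is available, OneClassSVM does expose a continuous score through `decision_function()`, which can at least be used to rank observations from most to least anomalous. The sketch below is an illustration only: the synthetic data and hyperparameters are assumptions, and the scores are signed distances to the learned boundary, not calibrated probabilities.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(0)

# Assumption: synthetic data standing in for a real data set.
X_train = 0.3 * rng.randn(200, 2)
X_test = np.vstack([
    0.3 * rng.randn(20, 2),                     # inlier-like points
    rng.uniform(low=-4, high=4, size=(20, 2)),  # widely spread points
])

clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# Signed distance to the separating boundary: positive values fall on the
# inlier side, negative values on the outlier side. NOT a probability.
scores = clf.decision_function(X_test)

# Indices of the test observations, most anomalous first.
ranking = np.argsort(scores)
```

Turning these scores into something probability-like would need a separate calibration step and labelled outliers, which is beyond what OneClassSVM itself provides.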
