Does one-class svm provide a probability estimate?


Problem description


I am using Libsvm for outlier detection (from Java), but I need a probability estimate not just a label. I traced the code and found that this is not possible. In particular, in the function svm_predict_values(..) I see the following code:

if (model.param.svm_type == svm_parameter.ONE_CLASS)
    return (sum > 0) ? 1 : -1;
else
    return sum;


I understand that one-class SVM tries to estimate the support of some probability distribution given samples or data points from the "normal" class. Given a new data point, and given that the model has learned the support of the normal class's distribution, can I get an estimate of the probability that the new data point is "normal" rather than an outlier? It seems that this is not possible, and that is why Libsvm thresholds the sum above and returns only a membership label, but I do not understand why. If it is possible to get a probability estimate from a one-class SVM, I do not see how to do that in Libsvm, even after spending a lot of time reading the code.


The reason I went this route is that I do not believe kernel density estimation would work well in a high-dimensional setting, but maybe the SVM is prone to the same issue.

Answer


"I understand that one-class SVM tries to estimate the support of some probability distribution given samples or data points from the "normal" class."


The problem is that this sentence is false for SVMs. In general, yes, this would be a nice probabilistic approach to building a classifier, and it is the one taken by models like logistic regression, neural nets, and many others. However, the SVM is not one of them: there is no proper probabilistic interpretation of the SVM; it does not really construct a probability distribution but rather directly looks for a good decision rule. There are more probabilistic alternatives, like Relevance Vector Machines (RVM), which are, however, non-convex. The only reason binary SVMs can provide you with probability estimates is that many implementations include a small "cheat", originated by Platt, where you simply fit another, probabilistic model on top of the SVM - typically logistic regression on top of the SVM projection.
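Platt's cheat can be sketched in a few lines: a sigmoid maps the raw decision value f to a pseudo-probability p = 1 / (1 + exp(A*f + B)). A minimal Python illustration, with placeholder values for A and B (in practice they are fitted on held-out data, not hard-coded):

```python
import math

def platt_probability(decision_value, A=-1.0, B=0.0):
    """Map a raw SVM decision value to a pseudo-probability via
    Platt's sigmoid: p = 1 / (1 + exp(A * f + B)).
    A and B here are illustrative placeholders; they are normally
    fitted on labelled hold-out data."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))

# With A < 0, larger decision values map to higher "probabilities",
# and a point exactly on the boundary (f = 0) maps to 0.5.
print(platt_probability(-2.0))  # outlier side of the boundary
print(platt_probability(0.0))   # on the boundary
print(platt_probability(2.0))   # normal side of the boundary
```

Note that this is a calibration layer bolted on after the fact, not something the SVM itself produces.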


So, what can you do? You can either go for another, more probabilistic model, or use a similar cheat: first project your data through the SVM (this is what "sum" is in the code above) and then fit logistic regression on top of it, which becomes your probability estimate.
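That second option can be sketched end-to-end in plain Python. Below, the decision values stand in for the "sum" that libsvm computes, the binary labels (normal vs. outlier) are assumed to come from a small labelled hold-out set, and the logistic-regression step is a bare gradient-descent fit of Platt's sigmoid; libsvm's own binary-SVM calibration uses a more careful Newton-type optimizer:

```python
import math

def sigmoid(z):
    # numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(decision_values, labels, lr=0.01, epochs=2000):
    """Fit Platt's sigmoid p = 1 / (1 + exp(A*f + B)) by plain
    gradient descent on the cross-entropy loss.
    labels: 1 = "normal", 0 = outlier."""
    A, B = 0.0, 0.0
    for _ in range(epochs):
        gA = gB = 0.0
        for f, y in zip(decision_values, labels):
            p = sigmoid(-(A * f + B))   # current probability estimate
            gA += (y - p) * f           # gradient w.r.t. A
            gB += (y - p)               # gradient w.r.t. B
        A -= lr * gA
        B -= lr * gB
    return A, B

def platt_probability(f, A, B):
    return sigmoid(-(A * f + B))

# Hypothetical decision values (libsvm's "sum") for hold-out points
# we additionally labelled as normal (1) or outlier (0).
scores = [1.5, 0.8, 0.3, -0.2, -0.9, -1.7]
labels = [1, 1, 1, 0, 0, 0]
A, B = fit_platt(scores, labels)
print(platt_probability(1.5, A, B), platt_probability(-1.7, A, B))
```

To use this with libsvm you would obtain the raw decision value (rather than the thresholded label) for each point and feed those values in as `scores`; the fitted sigmoid then turns any new decision value into a calibrated score in [0, 1].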

