Vowpal Wabbit multiclass classification: predicting probabilities
Question
I am trying to solve a multiclass classification problem with Vowpal Wabbit.
I have a training file that looks like this:
1 | feature_space
2 | feature_space
3 | feature_space
As output, I want the probability of each test item belonging to each class, like this:
1:0.13 2:0.57 3:0.30
Think of the predict_proba method of sklearn classifiers, for example.
I tried the following:
1)
vw --oaa 3 train.file -f model.file --loss_function logistic --link logistic
vw -p predict.file -t test.file -i model.file --raw_predictions=pred.txt
but the pred.txt file is empty (it is created, but contains no records). predict.file contains only the final class, with no probabilities.
2)
vw --csoaa 3 train.file -f model.file --link logistic
I modified the input files to fit the cost-sensitive (cs) format. csoaa does not accept --loss_function logistic and fails with the error message: "You are using a label not -1 or 1 with a loss function expecting that!"
With the default squared loss function and a similar output command, I get a pred.txt with raw predictions for each class per item, for example:
2.33 1.67 0.55
I believe these are the resulting squared distances.
Is there a way to get VW to output class probabilities, or to somehow convert these distances into probabilities?
Recommended answer
VW version 7.9.0 had a bug, fixed in 7.10.0, that caused the raw predictions file to be empty.
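Once the raw predictions file is populated, one possible way to turn per-class raw scores into probabilities is to pass each score through a sigmoid and then normalize so the values sum to 1. The sketch below is an illustration, not VW's own code. It assumes the model was trained with --oaa and logistic loss, and that each line of the raw predictions file has the form label:score (for example "1:-1.5 2:0.8 3:-0.2"). The function name raw_scores_to_probs is hypothetical. It does not apply to the csoaa output shown in the question, where the raw values are predicted costs rather than logistic scores.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def raw_scores_to_probs(line):
    """Convert one line of 'label:score' tokens into normalized probabilities.

    Assumption: scores are one-vs-all logistic scores, one per class.
    """
    scores = {}
    for token in line.split():
        label, sep, value = token.partition(":")
        if not sep:
            continue  # skip tokens that are not label:score pairs
        scores[int(label)] = float(value)

    # Squash each one-vs-all score into (0, 1), then renormalize to sum to 1.
    squashed = {label: _sigmoid(score) for label, score in scores.items()}
    total = sum(squashed.values())
    return {label: value / total for label, value in squashed.items()}


if __name__ == "__main__":
    print(raw_scores_to_probs("1:-1.5 2:0.8 3:-0.2"))
```

The normalization step matters because the one-vs-all classifiers are trained independently, so their sigmoid outputs are not guaranteed to sum to 1 on their own.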
Since November 2015, the easiest way to obtain probabilities is to use --oaa=N --loss_function=logistic --probabilities -p probs.txt. (Or, if you need label-dependent features: --csoaa_ldf=mc --loss_function=logistic --probabilities -p probs.txt.)
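To consume probs.txt downstream, much like the output of predict_proba, a small parser is usually enough. This is a minimal sketch that assumes each line of the file looks like the format requested in the question (1:0.13 2:0.57 3:0.30). The exact output layout can differ between VW versions, so check a few lines of your own file first. The helper names parse_probabilities and read_probabilities are hypothetical.

```python
def parse_probabilities(line):
    """Parse one 'label:prob label:prob ...' line into a {label: prob} dict."""
    probs = {}
    for token in line.split():
        label, sep, value = token.partition(":")
        if not sep:
            continue  # ignore tokens without a colon, e.g. an example tag
        probs[int(label)] = float(value)
    return probs


def read_probabilities(path):
    """Read a whole predictions file, one dict per non-empty line."""
    with open(path) as handle:
        return [parse_probabilities(line) for line in handle if line.strip()]


if __name__ == "__main__":
    row = parse_probabilities("1:0.13 2:0.57 3:0.30")
    best = max(row, key=row.get)  # most probable class for this item
    print(row, best)
```

Each returned dict maps a class label to its probability, so the predicted class for an item is simply the key with the largest value.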