Effect of --oaa 2 and --loss_function=logistic in Vowpal Wabbit


Question


What parameters should I use in VW for a binary classification task? For example, let's use rcv1_small.dat. I thought it would be better to use the logistic loss function (or hinge), and that it makes no sense to use --oaa 2. However, the empirical results (with progressive validation 0/1 loss reported in all 4 experiments) show that the best combination is --oaa 2 without logistic loss (i.e. with the default squared loss):

cd vowpal_wabbit/test/train-sets

cat rcv1_small.dat | vw --binary
# average loss = 0.0861

cat rcv1_small.dat | vw --binary --loss_function=logistic
# average loss = 0.0909

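# sed maps the labels {-1,1} -> {2,1}, since --oaa expects class labels numbered from 1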
cat rcv1_small.dat | sed 's/^-1/2/' | vw --oaa 2
# average loss = 0.0857

cat rcv1_small.dat | sed 's/^-1/2/' | vw --oaa 2 --loss_function=logistic
# average loss = 0.0934


My primary question is: Why does --oaa 2 not give exactly the same results as --binary (in the above setting)?


My secondary questions are: Why does optimizing the logistic loss not improve the 0/1 loss (compared to optimizing the default squared loss)? Is this specific to this particular dataset?

Answer


I have experienced something similar while using --csoaa. The details can be found here. My guess is that in the case of a multiclass problem with N classes (no matter that you specified 2 as the number of classes), vw effectively works with N copies of the features. The same example gets a different ft_offset value when it is predicted/learned for each possible class, and this offset is used in the hashing algorithm. So all classes get an "independent" set of features from the same dataset row. Of course the feature values are the same, but vw doesn't keep values - only feature weights - and the weights are different for each possible class. Since the amount of RAM used for storing these weights is fixed by -b (-b 18 by default), the more classes you have, the greater the chance of a hash collision. You can try increasing the -b value and check whether the difference between the --oaa 2 and --binary results shrinks. But I might be wrong, as I didn't go too deep into the vw code.
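If hash collisions are indeed the cause, widening the weight table should shrink the gap between the two runs. A quick check along those lines (the choice of -b 24 here is just an illustration; I haven't reported numbers for these runs):

cat rcv1_small.dat | vw --binary -b 24
cat rcv1_small.dat | sed 's/^-1/2/' | vw --oaa 2 -b 24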


As for the loss function: you can't compare the average loss values of the squared (default) and logistic loss functions directly. You should get the raw prediction values from the result obtained with squared loss and compute the loss of these predictions in terms of logistic loss. That function is log(1 + exp(-label * prediction)), where label is the a priori known answer. Such functions (float getLoss(float prediction, float label)) for all loss functions implemented in vw can be found in loss_functions.cc. Alternatively, you can first scale the raw prediction value to [0..1] with 1.f / (1.f + exp(-prediction)) and then calculate the log loss as described on kaggle.com:

double val = 1.0 / (1.0 + exp(-prediction)); // y = f(x) -> [0, 1]
if (val < 1e-15) val = 1e-15;                // clamp to avoid log(0)
if (val > (1.0 - 1e-15)) val = 1.0 - 1e-15;
double xx = (label < 0) ? 0.0 : 1.0;         // label {-1,1} -> {0,1}
double loss = xx * log(val) + (1.0 - xx) * log(1.0 - val);
loss *= -1;
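The two routes agree: substituting the sigmoid into the negative log-likelihood reduces algebraically to log(1 + exp(-label * prediction)). Here is a minimal self-contained sketch that checks this numerically; the function names are mine, not vw's:

#include <cmath>
#include <cstdio>

// Logistic loss on a raw (unscaled) prediction: log(1 + exp(-label * prediction))
double logistic_loss(double prediction, double label) {
    return std::log(1.0 + std::exp(-label * prediction));
}

// The same quantity via the kaggle-style route: squash to [0, 1] first,
// then take the negative log-likelihood (clamped to avoid log(0)).
double log_loss_via_sigmoid(double prediction, double label) {
    double val = 1.0 / (1.0 + std::exp(-prediction));
    if (val < 1e-15) val = 1e-15;
    if (val > 1.0 - 1e-15) val = 1.0 - 1e-15;
    double xx = (label < 0) ? 0.0 : 1.0; // label {-1,1} -> {0,1}
    return -(xx * std::log(val) + (1.0 - xx) * std::log(1.0 - val));
}

int main() {
    for (double p : {-2.0, -0.3, 0.0, 1.5})
        for (double label : {-1.0, 1.0})
            std::printf("p=%5.2f label=%+.0f direct=%.6f via_sigmoid=%.6f\n",
                        p, label, logistic_loss(p, label),
                        log_loss_via_sigmoid(p, label));
    return 0;
}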


You can also scale raw predictions to [0..1] with the '/vowpal_wabbit/utl/logistic' script or the --link=logistic parameter. Both use 1/(1+exp(-i)).
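For example, to have vw emit predictions already mapped to [0..1] (a sketch; -p writes the predictions to a file, and the output filename is just a placeholder):

cat rcv1_small.dat | vw --loss_function=logistic --link=logistic -p probabilities.txt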
