How best to deal with "None of the above" in Image Classification?


Question


This seems to be a fundamental question which some of you out there must have an opinion on. I have an image classifier implemented in CNTK with 48 classes. If the image does not match any of the 48 classes very well, then I'd like to be able to conclude that it was not among these 48 image types. My original idea was simply that if the highest output of the final Softmax layer was low, I would be able to conclude that the test image matched none well. While I occasionally see this occur, in most testing, Softmax still produces a very high (and mistaken) result when handed an 'unknown image type'. But maybe my network is 'over fit' and if it wasn't, my original idea would work fine. What do you think? Any way to define a 49-th class called 'none-of-the-above'?

Answer


You really do have these two options--thresholding the posterior probabilities (softmax values), and adding a garbage class.


In my area (speech), both approaches have their place:


If "none of the above" inputs are of the same nature as the "above" (e.g. non-grammatical inputs), thresholding works fine. Note that the posterior probability for a class is equal to one minus an estimate of the error rate for choosing this class. Rejecting anything with posterior < 50% would be rejecting all cases where you are more likely wrong than right. As long as your none-of-the-above classes are of similar nature, the estimate may be accurate enough to make this correct for them as well.


If "none of the above" inputs are of similar nature but your number of classes is very small (e.g. 10 digits), or if the inputs are of a totally different nature (e.g. a sound of a door slam or someone coughing), thresholding typically fails. Then, one would train a "garbage model." In our experience, it is OK to include the training data for the correct classes. Now the none-of-the-above class may match a correct class as well. But that's OK as long as the none-of-the-above class is not overtrained--its distribution will be much flatter, and thus even if it matches a known class, it will match it with a lower score and thus not win against the actual known class' softmax output.


In the end, I would use both. Definitely use a threshold (to catch the cases that the system can rule out) and use a garbage model, which I would just train on whatever data you have. I would expect that including the correct examples in training will not harm, even if it is the only data you have (please check the paper Anton posted for whether that applies to images as well). It may also make sense to try to synthesize data, e.g. by randomly combining patches from different images.
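A sketch of the patch-synthesis idea, assuming `images` is a list of equally sized HxWxC NumPy arrays from the training set (the function name and patch size are my own placeholders, not anything from the original answer):

import numpy as np

def synthesize_garbage(images, rng, patch=32):
    """Build one synthetic 'none of the above' image from random patches of random images."""
    h, w, c = images[0].shape
    out = np.empty_like(images[0])
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            ph, pw = min(patch, h - y), min(patch, w - x)
            src = images[rng.integers(len(images))]
            sy = rng.integers(0, h - ph + 1)
            sx = rng.integers(0, w - pw + 1)
            out[y:y + ph, x:x + pw] = src[sy:sy + ph, sx:sx + pw]
    return out

rng = np.random.default_rng(0)
# garbage_img = synthesize_garbage(train_images, rng)  # label it with the garbage class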

