My CNN classifier gives wrong predictions on random images


Question

I trained my CNN classifier (using tensorflow) with 3 data categories (ID card, passport, bills).

When I test it with images that belong to one of the 3 categories, it gives the right prediction. However, when I test it with an unrelated image (a car image, for example), it still gives me a prediction (i.e. it predicts that the car belongs to the ID card category).

Is there a way to make it display an error message instead of giving a wrong prediction?

Recommended answer

This should be tackled differently. It is known as the open set recognition problem. You can google it to find out more, but basically it's this: you cannot train your classifier on every class imaginable. It will always run into some class that it is not familiar with and has never seen before.

There are a few solutions, of which I will single out 3:


  1. Separate binary classifier - You can build a separate binary classifier that sorts images into two categories depending on whether a bill, passport or ID is in the image or not. If one is, the image is passed to the classifier you have already built, which assigns it to one of the 3 categories. If the first classifier says that some other object is in the image, you can discard the image immediately because it is not an image of a bill, passport or ID.

  2. Thresholding - When an ID is in the image, the probability for ID is high and the probabilities for bill and passport are fairly low. When the image shows something else (e.g. a car), the probabilities are most probably about the same for all 3 classes. In other words, none of the classes really stands out. As it stands, you pick the highest of the generated probabilities and set the output class to the class of that probability, even if its value is only 0.4 or so. To resolve this, you can set a threshold at, let's say, 0.7, and say that if none of the probabilities is over that threshold, there is something else in the picture (not an ID, passport or bill).

  3. Create a fourth class: unknown - If you pick this option, you should add some images of other things to the dataset and label them unknown. Then train the classifier and see what the result is.
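The thresholding option above can be sketched in a few lines. This is only a minimal illustration in plain Python, not the answerer's code: the names `CLASS_NAMES`, `THRESHOLD` and `predict_with_rejection` are made up for the example, the label order is an assumption, and the probabilities are assumed to come from the softmax output of the trained network.

```python
import math

# Assumed label order of the trained 3-class model (hypothetical).
CLASS_NAMES = ["id_card", "passport", "bill"]
# Example value taken from the answer; it should be tuned on validation data.
THRESHOLD = 0.7


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(probs, threshold=THRESHOLD):
    """Return the top class name, or "unknown" if no probability is over the threshold."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] > threshold:
        return CLASS_NAMES[best]
    return "unknown"


# One class clearly stands out, so its label is returned.
print(predict_with_rejection([0.92, 0.05, 0.03]))
# No class stands out, so the image is rejected instead of being forced into a class.
print(predict_with_rejection([0.40, 0.35, 0.25]))
```

A network can still produce a confident score for an unfamiliar image, so it is worth checking the chosen threshold against a handful of out-of-class samples such as the car image.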

I would recommend 1 or 2. Hope it helps :)
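For option 1, the two-stage flow could look roughly like the sketch below. Everything here is hypothetical: `classify_document`, `fake_gate` and `fake_three_class` are placeholder names, and the two stand-in functions only imitate what a trained binary classifier and the existing 3-class classifier would do.

```python
from typing import Callable, Optional


def classify_document(
    image: object,
    is_document: Callable[[object], bool],
    classify_type: Callable[[object], str],
) -> Optional[str]:
    """Run the 3-class model only if the binary gate accepts the image."""
    if not is_document(image):
        return None  # not a bill/passport/ID, so discard the image
    return classify_type(image)


# Stand-in models for demonstration only; real ones would be trained networks.
def fake_gate(image):
    return image != "car.jpg"


def fake_three_class(image):
    return "id_card"


print(classify_document("id_scan.jpg", fake_gate, fake_three_class))
print(classify_document("car.jpg", fake_gate, fake_three_class))
```

Returning `None` for rejected images lets the caller decide how to report the error, for example by showing a message instead of a prediction.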
