Binary Image Classification with CNN - best practices for choosing "negative" dataset?

Question

Say I want to train a CNN to detect whether or not an image shows a car.

What are some best practices or methods for choosing the "Not-Car" dataset?

Because this dataset could potentially be infinite (basically anything that is not a car), is there a guideline on how big it needs to be? Should it contain objects that are very similar to cars but are not cars (planes, boats, etc.)?

Recommended answer

As in all of supervised machine learning, the training set should reflect the real distribution that the model is going to work with. A neural network is basically a function approximator. Your actual goal is to approximate the real-world distribution, but in practice you can only draw a sample from it, and this sample is the only thing the neural network will see. For any input far outside the training manifold, the output will be just a guess (see also this discussion on AI.SE).

So when choosing a negative dataset, the first question you should answer is: what will be the likely use case of this model? For example, if you're building an app for a smartphone, then the negative samples should probably include street views, pictures of buildings and stores, people, indoor environments, etc. It's unlikely that an image from a smartphone camera will be a wild animal or an abstract painting, i.e., it's an improbable input in your real distribution.

Including images that look like the positive class (trucks, airplanes, boats, etc.) is a good idea: the features from the low convolutional layers (edges, corners) will be very similar, so it's important that the network learns the distinguishing high-level features correctly.
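As a rough illustration of mixing use-case negatives with look-alike negatives, here is a minimal sketch. The function name `build_negative_set`, the file names, and the `neg_per_pos` and `lookalike_share` defaults are all hypothetical choices for this example, not values prescribed by the answer.

```python
import random


def build_negative_set(use_case_negatives, lookalike_negatives, n_positives,
                       neg_per_pos=5, lookalike_share=0.3, seed=0):
    """Sample a negative set that mixes ordinary and look-alike negatives.

    neg_per_pos and lookalike_share are assumed values for illustration;
    tune them for your own data and use case.
    """
    rng = random.Random(seed)
    n_total = n_positives * neg_per_pos
    # Never ask for more items than a pool actually contains.
    n_look = min(round(n_total * lookalike_share), len(lookalike_negatives))
    n_use = min(n_total - n_look, len(use_case_negatives))
    sample = (rng.sample(lookalike_negatives, n_look)
              + rng.sample(use_case_negatives, n_use))
    rng.shuffle(sample)
    return sample


# Hypothetical file lists standing in for real image paths.
street_scenes = [f"street_{i}.jpg" for i in range(100)]
trucks_and_boats = [f"lookalike_{i}.jpg" for i in range(50)]

negatives = build_negative_set(street_scenes, trucks_and_boats, n_positives=10)
n_lookalikes = sum(name.startswith("lookalike_") for name in negatives)
print(len(negatives), n_lookalikes)
```

If either pool is smaller than requested, the function returns fewer negatives rather than repeating images.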

In general, I'd use 5-10x more negative images than positive ones. CIFAR-10 is a good starting point: out of 50000 training images, 5000 are cars, 5000 are planes, etc. In fact, building a 10-class classifier is not a bad idea. In this case, you'd turn the CNN into a binary classifier by thresholding its certainty that the inferred class is a car. Anything that the CNN isn't certain about will be interpreted as not a car.
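The thresholding step might look like the sketch below, assuming the network outputs one logit per CIFAR-10 class. The helper names and the `THRESHOLD` value of 0.9 are assumptions made for illustration; the threshold in particular would need tuning on a validation set.

```python
import math

CAR_CLASS = 1    # position of "automobile" in CIFAR-10's label order
THRESHOLD = 0.9  # hypothetical cut-off, not a recommended value


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_car(logits, car_class=CAR_CLASS, threshold=THRESHOLD):
    """Return True only when the car probability reaches the threshold."""
    return softmax(logits)[car_class] >= threshold


# Made-up logits for two images: one clear car, one ambiguous.
confident = [0.0, 8.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ambiguous = [0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.8]

print(is_car(confident), is_car(ambiguous))
```

With any threshold above 0.5, a positive result also implies that "car" is the top-scoring class, so no separate argmax check is needed.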
