What is an appropriate input image size for a Faster R-CNN Caffe model?

Question

I am trying to train Faster R-CNN with Caffe on a custom dataset. As I understand it, the Faster R-CNN Caffe model is built assuming an input image size of 600*1000. Many images in my custom dataset are 300*400. Do I need to zero-pad them up to 600*1000, or upscale them? If neither, what is the appropriate way to modify the images before feeding them to the network? Please advise.

Thanks.

Recommended answer

Faster R-CNN was trained on Pascal VOC images, whose sizes are quite close to yours (about 500×375). You don't need to zero-pad or upscale your images yourself: if you use the original Python code, rescaling is already part of the pipeline. I think you can use your images as they are.
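To make the rescaling step concrete, here is a minimal sketch of the rule as I understand it from the original Python code: the shorter side is scaled to a target of 600 pixels, unless that would push the longer side past 1000 pixels, in which case the longer side is capped instead. The function name `rescale_shape` and its parameter names are hypothetical, and the defaults of 600 and 1000 are assumptions based on the sizes mentioned in the question, so check them against your own configuration.

```python
def rescale_shape(height, width, target_size=600, max_size=1000):
    """Return (scale, new_height, new_width) for an image.

    Hypothetical helper: scales the shorter side to target_size while
    keeping the longer side at or below max_size. Aspect ratio is kept.
    """
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    # Cap the scale if the longer side would exceed max_size.
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return scale, int(round(height * scale)), int(round(width * scale))


# Sizes from the question, Pascal VOC, and the large-image example.
for h, w in [(300, 400), (375, 500), (3000, 4000)]:
    scale, new_h, new_w = rescale_shape(h, w)
    print("%dx%d -> %dx%d (scale %.2f)" % (h, w, new_h, new_w, scale))
```

Under these assumptions a 300*400 image is simply upscaled by a factor of 2, with no padding involved.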

In my opinion, you should only resize your input images yourself if the images are large and the objects are small.

For example, I had 3000x4000 images with 100x100 objects to detect. After resizing to fit 600x1000, my objects were only about 20 to 25 pixels on a side. But the receptive field is fixed by the network architecture (171 and 228 pixels for ZF and VGG, respectively), so the objects end up very small relative to it. This means that the features describing a positive example would actually contain more background information than foreground.
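The arithmetic behind this can be sketched in a few lines. The helper `foreground_fraction` is hypothetical, and it makes a simplifying assumption: a square object centered in a square receptive field, so the foreground share is just a ratio of areas. Real receptive fields do not weight every pixel equally, so treat the numbers as a rough indication only.

```python
def foreground_fraction(object_side, receptive_field_side):
    """Share of a square receptive field covered by a centered square object.

    Hypothetical helper for illustration; ignores that pixels near the
    center of a receptive field typically matter more than the edges.
    """
    covered = min(object_side, receptive_field_side)
    return (covered / float(receptive_field_side)) ** 2


# 3000x4000 image, shorter side scaled to 600 pixels (assumed rule).
scale = 600.0 / 3000
object_side = 100 * scale

# Receptive field sizes quoted in the answer for ZF and VGG.
for name, rf in [("ZF", 171), ("VGG", 228)]:
    share = foreground_fraction(object_side, rf)
    print("%s: object covers %.1f%% of the receptive field" % (name, 100 * share))
```

Either way, the object occupies only a few percent of the area the network looks at, which is the point made above.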

In that case, I think the best approach is to cut the images into smaller crops for the training phase (you can use different scaling for training and testing).
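One way to cut large images is a grid of overlapping tiles. The sketch below only computes the crop windows. The function `tile_windows`, the 1000-pixel tile size, and the 200-pixel overlap are all assumptions chosen for illustration, not values from the answer. The overlap is there so that an object split by one tile border is likely to appear whole in a neighboring tile.

```python
def tile_windows(height, width, tile=1000, overlap=200):
    """Return crop windows as (top, left, bottom, right) tuples.

    Hypothetical helper: tiles overlap by `overlap` pixels, and the last
    tile in each direction is shifted back to sit flush with the border.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    stride = tile - overlap

    def starts(length):
        # A single window suffices if the image is smaller than a tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)
        return positions

    return [(top, left, min(top + tile, height), min(left + tile, width))
            for top in starts(height)
            for left in starts(width)]


windows = tile_windows(3000, 4000)
print(len(windows), "tiles; first:", windows[0], "last:", windows[-1])
```

If a 1000x1000 tile is then rescaled so that its shorter side is 600 pixels, a 100x100 object becomes roughly 60x60 instead of 20x20. Note that the ground-truth boxes have to be shifted into each tile's coordinates and clipped or dropped at tile borders, which this sketch does not do.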
