Multi-scale training in YOLOv2

Question

I am wondering how multi-scale training in YOLOv2 works.

The paper states:


"The original YOLO uses an input resolution of 448 × 448. With the addition of anchor boxes we changed the resolution to 416 × 416. However, since our model only uses convolutional and pooling layers it can be resized on the fly. We want YOLOv2 to be robust to running on images of different sizes so we train this into the model. Instead of fixing the input image size we change the network every few iterations. Every 10 batches our network randomly chooses a new image dimension size. Since our model downsamples by a factor of 32, we pull from the following multiples of 32: {320, 352, ..., 608}. Thus the smallest option is 320 × 320 and the largest is 608 × 608. We resize the network to that dimension and continue training."
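The schedule described in the quoted passage can be sketched in a few lines of plain Python. This is only an illustration of the sampling rule (a new size every 10 batches, drawn from the multiples of 32 between 320 and 608); the names `SIZES` and `multiscale_schedule` are made up for this example and are not taken from the paper or from Darknet.

```python
import random

STRIDE = 32                                  # downsampling factor quoted in the paper
SIZES = list(range(320, 608 + 1, STRIDE))    # 320, 352, ..., 608

def multiscale_schedule(num_batches, every=10, seed=0):
    """Yield (batch_index, input_size), re-drawing the size every `every` batches.

    Hypothetical helper: it only mimics the sampling rule, not the training itself.
    """
    rng = random.Random(seed)
    size = None
    for batch in range(num_batches):
        if batch % every == 0:
            size = rng.choice(SIZES)         # square input: size x size
        yield batch, size

if __name__ == "__main__":
    for batch, size in multiscale_schedule(30):
        if batch % 10 == 0:
            print(f"batch {batch}: resize network input to {size}x{size}")
```

In a real training loop, the chosen size would be used to resize each image batch (and its box labels) before the forward pass; the weights themselves stay untouched.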

I don't understand how a network with only convolutional and pooling layers can accept inputs of different resolutions. From my experience building neural networks, if you change the input resolution to a different scale, the number of parameters in the network changes; that is, the structure of the network changes.

So, how does YOLOv2 change this on the fly?

I read the configuration file for YOLOv2, but all I found was a random=1 statement...

Recommended answer

In YOLO, if you use only convolutional layers, changing the input size changes the size of the output grid, not the weights.

For example, with an input size of:

  1. 320x320, the output size is 10x10
  2. 608x608, the output size is 19x19
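The two figures above follow directly from the downsampling factor of 32 mentioned in the question. A minimal sketch of that arithmetic, with `output_grid` being a hypothetical helper name:

```python
def output_grid(input_size, stride=32):
    """Side length of the output grid for a square input, assuming a total stride of 32."""
    if input_size % stride != 0:
        raise ValueError(f"{input_size} is not a multiple of {stride}")
    return input_size // stride

if __name__ == "__main__":
    for size in (320, 416, 608):
        g = output_grid(size)
        print(f"{size}x{size} input -> {g}x{g} grid")
```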

The loss is then computed on that grid, so you can backpropagate it without adding any more parameters.
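To see why no parameters are added, it may help to count them. The sketch below is plain arithmetic, not YOLOv2's actual layer list: the channel counts are illustrative assumptions, and `conv_params` and `dense_params` are hypothetical helpers. A convolution's weight count depends only on the kernel size and channel counts, whereas a fully connected layer on a flattened feature map depends on the spatial size, which is the situation the question has in mind.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a conv layer; note that input H and W do not appear."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

if __name__ == "__main__":
    # Same conv layer, whatever the input resolution (illustrative 3 -> 32 channels, 3x3 kernel).
    print("conv params:", conv_params(3, 32, 3))

    # A dense layer on a flattened feature map changes size with the grid
    # (1024 channels is an assumed value for illustration).
    for grid in (10, 19):
        n = dense_params(grid * grid * 1024, 10)
        print(f"dense params on {grid}x{grid} grid:", n)
```

This is why a network built from convolution and pooling layers alone can be fed different resolutions: only the activation shapes change, not the learned weights.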

Refer to the YOLOv1 paper for the loss function:

[Image: the loss function from the paper]

In theory, then, you only need to adjust this function, which depends on the grid size and not on any model parameters, and you should be good to go.
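For reference, the YOLOv1 loss can be written roughly as follows. This is a transcription from memory of the paper's sum-squared-error loss and should be checked against the paper itself; the loss actually used in YOLOv2 differs in its details because of anchor boxes. Here $S$ is the grid side length, $B$ the number of boxes per cell, and $\mathbb{1}_{ij}^{\text{obj}}$ indicates that box $j$ in cell $i$ is responsible for an object.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
 &+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
 &+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2
  + \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
 &+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}} \sum_{c \in \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The grid size enters only through the summation limit $S^2$, so moving from a 10x10 to a 19x19 grid changes how many cells are summed over, not any learned weight.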

Paper link: https://arxiv.org/pdf/1506.02640.pdf

The author mentions the same thing in the video explanation.

Time: 14:53

Video link
