Understanding weird YOLO convolutional layer output size

Question

I am trying to understand how Darknet works, and I was looking at the yolov3-tiny configuration file, specifically layer number 13 (line 107).

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

The kernel size is 1x1, the stride is 1, and the padding is 1 too. When I load the network using darknet, it reports that the output width and height are the same as the input:

13 conv    256       1 x 1/ 1     13 x  13 x1024 ->   13 x  13 x 256

However, shouldn't the width and height increase by 2, since the kernel is 1x1 and there is padding? If I understand it correctly, the kernel runs over all the "pixels" of the input plus the padding, so it makes sense to me that the width and height should increase by 2*padding.

I used the formula

output_size = ((input_size - kernel_size + 2*padding) / stride) + 1

and it checks out: (13 - 1 + 2 * 1)/1 + 1 = 15

Does anybody know what I'm missing?

Thanks in advance.

Recommended answer

I figured it out.

I misunderstood the pad parameter in the layer. If you want the padding to be 1, you should write:

padding=1

pad is actually a boolean. When it is set to 1, the padding of the layer will be equal to size / 2.

In this case, the size of the kernel was 1, so the padding ends up being 1/2 = 0 (integer division). Since there is no padding, the output width and height remain the same as in the input.
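To make the arithmetic concrete, here is a minimal Python sketch of the behaviour described above. It is not Darknet's actual code: the function names darknet_padding and conv_output_size are made up for illustration, and the sketch assumes that the pad flag, when set, overrides any explicit padding value.

```python
def darknet_padding(size: int, pad: int = 0, padding: int = 0) -> int:
    """Effective padding for a [convolutional] section (illustrative only).

    Assumption: when the boolean `pad` is set, padding becomes size / 2
    using integer division; otherwise the explicit `padding` value is used.
    """
    return size // 2 if pad else padding


def conv_output_size(input_size: int, size: int, stride: int, padding: int) -> int:
    """The output-size formula from the question, with integer division."""
    return (input_size - size + 2 * padding) // stride + 1


# Layer 13 of yolov3-tiny: size=1, stride=1, pad=1
p = darknet_padding(size=1, pad=1)
print(p, conv_output_size(13, 1, 1, p))  # 0 13

# The same layer written with padding=1 instead of pad=1
p = darknet_padding(size=1, padding=1)
print(p, conv_output_size(13, 1, 1, p))  # 1 15

# A 3x3 kernel with pad=1 gets padding 1, which also preserves the size
p = darknet_padding(size=3, pad=1)
print(p, conv_output_size(13, 3, 1, p))  # 1 13
```

With pad=1 the 1x1 layer keeps its 13 x 13 output, which matches what darknet printed; only the explicit padding=1 form would give the 15 computed in the question.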

I should have RTFM.
