YOLO object detection: how does the algorithm predict bounding boxes larger than a grid cell?

Problem description

I am trying to better understand how the YOLOv2 and YOLOv3 algorithms work. The network applies a series of convolutions until the feature map is reduced to a 13x13 grid. It is then able to classify the objects within each grid cell and predict the bounding boxes for those objects.

If you look at this picture, you see that the bounding box in red is larger than any individual grid cell. The bounding box is also centered on the center of the object.

My question has to do with how the predicted bounding boxes can exceed the size of a grid cell when the network activations are based on the individual grid cell. I mean, everything outside of the grid cell should be unknown to the neurons predicting the bounding boxes for an object detected in that cell, right?

More precisely, here are my questions:

1. How does the algorithm predict bounding boxes that are larger than the grid cell?

2. How does the algorithm know in which cell the center of the object is located?

Solution

"Everything outside of the grid cell should be unknown to the neurons predicting the bounding boxes for an object detected in that cell, right?"

That is not quite right. The cells correspond to a partition of the image, and the neurons attached to a cell have learned to respond when the center of an object is located within it.
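As a rough illustration of how an object ends up tied to one cell, the sketch below maps the center of a ground-truth box to a cell index. The 416-pixel input size and the function names are assumptions made for this example (only the 13x13 grid comes from the question), so treat it as a minimal sketch rather than the actual YOLO training code.

```python
def center_cell(cx, cy, img_size=416, grid_size=13):
    """Return (col, row) of the grid cell that contains the pixel (cx, cy)."""
    cell = img_size / grid_size  # 32 pixels per cell under the assumed sizes
    # Clamp so that a point on the far edge still falls in the last cell.
    col = min(int(cx // cell), grid_size - 1)
    row = min(int(cy // cell), grid_size - 1)
    return col, row


def responsible_cell(box, img_size=416, grid_size=13):
    """box = (x_min, y_min, x_max, y_max) in pixels; only its center matters."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2
    return center_cell(cx, cy, img_size, grid_size)


# A hypothetical 300 x 320 pixel box, far larger than one 32 x 32 cell.
big_box = (50, 80, 350, 400)
print(responsible_cell(big_box))  # (6, 7)
```

The size of the box plays no role in the assignment: only the center decides which cell is asked to predict it.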

However, the receptive field of those output neurons is much larger than the cell and actually covers the entire image. Each neuron is therefore able to recognize, and draw a bounding box around, an object much larger than its assigned "center cell".
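To make this concrete, here is a minimal sketch of a YOLOv2-style box decoding, following the parameterization from the YOLOv2 paper: the center offset is squashed by a sigmoid so it stays inside the cell, while the width and height scale an anchor box exponentially and are not bounded by the cell. The anchor values, the 416-pixel input size, and the function names are assumptions chosen for illustration, not values taken from a real configuration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(tx, ty, tw, th, col, row, anchor_w, anchor_h,
               grid_size=13, img_size=416):
    """Turn raw outputs of one cell into (cx, cy, w, h) in pixels.

    anchor_w and anchor_h are expressed in grid-cell units (an assumption).
    """
    stride = img_size / grid_size  # pixels per cell
    # The center is confined to the cell: the sigmoid output lies in (0, 1).
    cx = (col + sigmoid(tx)) * stride
    cy = (row + sigmoid(ty)) * stride
    # The size is unconstrained: exp() can scale the anchor far beyond one cell.
    w = anchor_w * math.exp(tw) * stride
    h = anchor_h * math.exp(th) * stride
    return cx, cy, w, h


# Hypothetical raw outputs for cell (6, 7) with a hypothetical 5 x 6 cell anchor.
cx, cy, w, h = decode_box(0.0, 0.0, 0.5, 0.5, col=6, row=7,
                          anchor_w=5.0, anchor_h=6.0)
print(cx, cy, w, h)
```

In this sketch the predicted center can never leave the cell, while the predicted width and height span several cells.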

So a cell is centered on the center of the output neuron's receptive field but covers a much smaller part of it. The partition is also somewhat arbitrary: one could imagine, for example, having overlapping cells, in which case you would expect neighboring neurons to fire simultaneously when an object is centered in the overlapping zone of their cells.
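The gap between cell size and receptive field can be estimated with the standard recurrence for stacked convolution and pooling layers. The layer stack below is a hypothetical toy network, not the real Darknet architecture (which is deeper, so its receptive field is larger); it is only meant to show the order of magnitude.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride) pairs, input side first.

    Returns (rf, jump): the receptive field of one output neuron and the
    distance between neighboring output neurons, both in input pixels.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride             # striding spreads output neurons further apart
    return rf, jump


# Hypothetical stack: five stages of [3x3 conv, 2x2 max-pool with stride 2],
# followed by three more 3x3 convolutions on the final grid.
layers = [(3, 1), (2, 2)] * 5 + [(3, 1)] * 3
rf, cell = receptive_field(layers)
print(rf, cell)  # 286 32
```

Even in this shallow sketch, each output neuron sees a 286-pixel-wide region while its cell is only 32 pixels wide.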
