How to design deep convolutional neural networks?

Problem description

As I understand it, all CNNs are quite similar. They all have convolutional layers followed by pooling and ReLU layers. Some, such as FlowNet and Segnet, have specialised layers. My question is: how should we decide how many layers to use, and how do we set the kernel size for each layer in the network? I have searched for an answer to this but couldn't find a concrete one. Is the network designed by trial and error, or are there specific rules that I am not aware of? If you could clarify this, I would be very grateful.

Recommended answer

Short answer: if there are design rules, we haven't discovered them yet.

Note that there are comparable questions in computing. For instance, note that there is only a handful of basic electronic logic units, the gates that drive your manufacturing technology. All computing devices use the same Boolean logic; some have specialised additions, such as photoelectric input or mechanical output.

How do you decide how to design your computing device?

The design depends on the purpose of the CNN. Input characteristics, accuracy, training speed, scoring speed, adaptation, computing resources, ... all of these affect the design. There is no generalized solution, even for a given problem (yet).

For instance, consider the ImageNet classification problem. Note the structural differences between the winners and contenders so far: AlexNet, GoogleNet, ResNet, VGG, etc. If you change inputs (say, to MNIST), then these are overkill. If you change the paradigm, they may be useless. GoogleNet may be a prince of image processing, but it's horrid for translating spoken French to written English. If you want to track a hockey puck in real time on your video screen, forget these implementations entirely.
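
To make those structural differences concrete: even the opening layers of these networks disagree about kernel size. The following is a rough sketch in PyTorch (neither the question nor the answer names a framework, and the channel counts, strides and padding here are only approximations of the published architectures) contrasting an AlexNet-style stem, which bets on one large kernel, with a VGG-style stem, which bets on stacks of small ones:

```python
import torch.nn as nn

# AlexNet-style stem: a single large 11x11 kernel with a big stride,
# shrinking spatial resolution aggressively in one layer.
alexnet_style_stem = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=2),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=3, stride=2),
)

# VGG-style stem: stacked 3x3 kernels, relying on depth rather than
# kernel size to grow the receptive field.
vgg_style_stem = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(kernel_size=2, stride=2),
)
```

Neither choice is "the rule"; each was simply what worked best for its authors' data, hardware and goals.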

So far, we're doing this the empirical way: a lot of people try a lot of different things to see what works. We get feelings for what will improve accuracy, or training time, or whatever factor we want to tune. We find what works well with total CPU time, or what we can do in parallel. We change algorithms to take advantage of vector math in lengths that are powers of 2. We change problems slightly and see how the learning adapts elsewhere. We change domains (say, image processing to written text), and start all over -- but with a vague feeling of what might tune a particular bottleneck, once we get down to considering certain types of layers.
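
In practice that empirical loop often looks like the sketch below: expose the number of conv layers and the kernel size as hyperparameters, build a plain stack for each candidate, and keep whichever scores best on held-out data. This is my own illustration rather than anything from the answer; make_cnn is a hypothetical helper, the candidate values are arbitrary, and the training call is left as a stand-in for your own loop.

```python
import torch.nn as nn

def make_cnn(num_conv_layers, kernel_size, channels=32, num_classes=10):
    """Plain conv/ReLU/pool stack; depth and kernel size are the knobs under test."""
    layers, in_ch = [], 3
    for _ in range(num_conv_layers):
        layers += [
            nn.Conv2d(in_ch, channels, kernel_size, padding=kernel_size // 2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        ]
        in_ch = channels
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, num_classes)]
    return nn.Sequential(*layers)

# No formula tells us the "right" depth or kernel size, so we search:
for depth in (2, 3, 4):
    for k in (3, 5, 7):
        model = make_cnn(depth, k)
        # score = train_and_evaluate(model, train_loader, val_loader)  # hypothetical helper
        print(depth, k, sum(p.numel() for p in model.parameters()))
```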

Remember, CNNs really haven't been popular for that long, barely 6 years. For the most part, we're still trying to learn what the important questions might be. Welcome to the research team.
