Avoiding vanishing gradient in deep neural networks

Question

I'm taking a look at Keras to try to dive into deep learning.

From what I know, stacking just a few dense layers effectively stops backpropagation from working due to the vanishing gradient problem.

I found out that there is a pre-trained VGG-16 neural network that you can download and build on top of.
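
For concreteness, here is a minimal sketch of loading that pre-trained VGG-16 in Keras and putting a new classifier on top of it, assuming TensorFlow's bundled Keras; the 10-class head and the frozen convolutional base are illustrative choices only, not part of the original question.

```python
# Minimal sketch: load the pre-trained VGG-16 convolutional base and add a new
# classifier head. The 10-class output and the frozen base are assumptions.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # keep the pre-trained weights fixed while training the new head

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),  # hypothetical 10-class problem
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```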

This network has 16 layers, so I guess this is the territory where you hit the vanishing gradient problem.

Suppose I wanted to train the network myself in Keras. How should I do it? Should I divide the layers into clusters and train them independently as autoencoders, then stack a classifier on top and train it? Is there a built-in mechanism for this in Keras? A rough sketch of that scheme follows below.
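
For reference, this is roughly what the greedy layer-wise scheme described above could look like: each dense layer is trained as a small autoencoder, and the trained encoders are then stacked under a classifier and fine-tuned. The data, layer sizes, and training settings are placeholders, and as far as I know Keras has no single built-in call for this, so it has to be wired by hand.

```python
# Rough sketch of greedy layer-wise autoencoder pre-training as described above.
# x_train / y_train, the layer sizes, and the training settings are placeholders.
import numpy as np
from tensorflow.keras import layers, models

x_train = np.random.rand(1000, 784).astype("float32")  # placeholder data
y_train = np.random.randint(0, 10, size=(1000,))        # placeholder labels

layer_sizes = [256, 128, 64]
encoders = []
current_input = x_train

for size in layer_sizes:
    # Train a one-hidden-layer autoencoder on the current representation.
    inp = layers.Input(shape=(current_input.shape[1],))
    encoded = layers.Dense(size, activation="relu")(inp)
    decoded = layers.Dense(current_input.shape[1], activation="linear")(encoded)
    autoencoder = models.Model(inp, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(current_input, current_input, epochs=5, batch_size=64, verbose=0)

    # Keep only the trained encoder and feed its output to the next stage.
    encoder = models.Model(inp, encoded)
    encoders.append(encoder)
    current_input = encoder.predict(current_input, verbose=0)

# Stack the pre-trained encoder layers under a fresh classifier and fine-tune end to end.
classifier = models.Sequential(
    [layers.Input(shape=(784,))]
    + [enc.layers[1] for enc in encoders]  # reuse the trained Dense layers
    + [layers.Dense(10, activation="softmax")]
)
classifier.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
classifier.fit(x_train, y_train, epochs=5, batch_size=64, verbose=0)
```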

Answer

No, the vanishing gradient problem is not as prevalent as before, as pretty much all networks (except recurrent ones) use ReLU activations, which are considerably less prone to this problem.

You should just train a network from scratch and see how it works. Do not try to deal with a problem that you don't have yet.
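
As a starting point, here is a minimal sketch of training a plain ReLU network from scratch in Keras; MNIST, the layer sizes, and the hyperparameters are illustrative choices only.

```python
# Minimal sketch: train a plain dense network with ReLU activations from scratch.
# MNIST, the layer sizes, and the hyperparameters are illustrative choices.
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

model = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128,
          validation_data=(x_test, y_test))
```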
