Neural Network with tanh wrong saturation with normalized data


Question

I'm using a neural network made of 4 input neurons, 1 hidden layer made of 20 neurons, and a 7-neuron output layer.

I'm trying to train it on a BCD-to-7-segment algorithm. My data is normalized: 0 is -1 and 1 is 1.

When the output error is evaluated, the neuron saturates the wrong way. If the desired output is 1 and the real output is -1, the error is 1 - (-1) = 2.

When I multiply it by the derivative of the activation function, error * (1 - output) * (1 + output), the error becomes almost 0, because 2 * (1 - (-1)) * (1 + (-1)) = 2 * 2 * 0 = 0.
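A minimal sketch of the arithmetic above (plain Python; the variable names are placeholders chosen for this illustration):

output = -1.0                     # actual output, stuck at the wrong asymptote
desired = 1.0                     # desired output
error = desired - output          # 1 - (-1) = 2
# tanh derivative expressed through the output: (1 - output) * (1 + output)
delta = error * (1 - output) * (1 + output)
print(delta)                      # 0.0 -- the gradient vanishes, no weight update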

How can I avoid this saturation error?

Answer

Saturation at the asymptotes of the activation function is a common problem with neural networks. If you look at a graph of the function, this is no surprise: the curve is almost flat there, meaning that the first derivative is (almost) 0. The network cannot learn any more.

A simple solution is to scale the activation function to avoid this problem. For example, with the tanh() activation function (my favorite), it is recommended to use the following activation function when the desired output is in {-1, 1}:

f(x) = 1.7159 * tanh( 2/3 * x)  

The derivative is therefore

f'(x) = 1.14393 * (1 - tanh^2( 2/3 * x))

This will force the gradients into the most non-linear value range and speed up the learning. For all the details I recommend reading Yann LeCun's great paper Efficient BackProp. In the case of the tanh() activation function, the error would be calculated as

error = 2/3 * (1.7159 - output^2 / 1.7159) * (teacher - output)

which is just f'(x) * (teacher - output), with the derivative rewritten in terms of the output (since tanh( 2/3 * x) = output / 1.7159).
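A minimal sketch of the scaled activation, its derivative, and the resulting error term (plain Python; the names lecun_tanh, lecun_tanh_prime, and delta are hypothetical, chosen for this illustration):

import math

A = 1.7159
B = 2.0 / 3.0

def lecun_tanh(x):
    # f(x) = 1.7159 * tanh(2/3 * x)
    return A * math.tanh(B * x)

def lecun_tanh_prime(x):
    # f'(x) = 1.14393 * (1 - tanh^2(2/3 * x))
    return A * B * (1.0 - math.tanh(B * x) ** 2)

def delta(teacher, output):
    # the same derivative rewritten in terms of the output, times the raw error:
    # 2/3 * (1.7159 - output^2 / 1.7159) * (teacher - output)
    return B * (A - output ** 2 / A) * (teacher - output)

# A target of +/-1 now lies well inside the curve (asymptotes at +/-1.7159),
# so even a maximally wrong unit keeps a usable gradient:
print(delta(1.0, -1.0))   # ~1.51 instead of 0 -- learning continues

The constants are chosen so that f(1) = 1.7159 * tanh(2/3) is approximately 1, i.e. the target values fall in the region where the curve is still clearly non-linear rather than at the asymptotes.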

