BackPropagation Neuron Network Approach - Design


Question


I am trying to build a digit recognition program. I feed in a black-and-white image of a digit, and the output layer should fire the corresponding digit (exactly one of the 0 -> 9 neurons in the output layer should fire). I have finished implementing a two-dimensional backpropagation neural network. My topology sizes are [5][3] -> [3][3] -> 1[10], i.e. one 2-D input layer, one 2-D hidden layer, and one 1-D output layer. However, I am getting strange and wrong results (both the average error and the output values).
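
For reference, the input/target encoding boils down to something like the following Python snippet. The helper and the bitmap pixels below are made up for illustration, not my actual code or image data:

    # Illustrative only: a 5x3 black/white bitmap is flattened row by row into
    # 15 input values (each 0.0 or 1.0), and the target is a one-hot vector in
    # which exactly one of the 10 output neurons should end up near 1.0.
    def one_hot(digit, n_classes=10):
        targets = [0.0] * n_classes
        targets[digit] = 1.0
        return targets

    bitmap = [[1, 1, 1],
              [0, 0, 1],
              [0, 1, 0],
              [0, 1, 0],
              [0, 1, 0]]                                  # made-up pixels, roughly a "7"
    inputs = [float(px) for row in bitmap for px in row]  # 15 inputs, one per pixel
    targets = one_hot(7)                                  # only this digit's neuron targets 1.0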

Debugging at this stage is quite time-consuming, so I would love to hear whether this is the correct design before I continue debugging. Here are the steps of my implementation:

  • Build the network: one bias neuron on each layer except the output layer (no bias there). A bias's output value is always 1.0, but its connection weights are updated on each pass like those of every other neuron in the network. All weights range from 0.000 to 1.000 (no negatives).

  • Get the input data (each value 0 or 1) and set the nth value as the output value of the nth neuron in the input layer.

  • Feed forward: for each neuron 'n' in every layer except the input layer:

    • Take the SUM of (output value * connection weight) over the neurons of the previous layer connected to this nth neuron.
    • Apply the hyperbolic tangent (tanh) transfer function to this SUM.
    • Set the result as the output value of this nth neuron.
  • Get results: read the output values of the neurons in the output layer.

  • BackPropagation:

    • Calculate the network error: over the output layer, take the SUM of (target value - output value)^2, divide this SUM by the size of the output layer, and take the square root as the result (an RMS error). Then compute AverageError = (OldAverageError * SmoothingFactor * Result) / (SmoothingFactor + 1.00).
    • Calculate the output layer gradients: for each output neuron 'n', nth gradient = (nth target value - nth output value) * tanh derivative of the nth output value.
    • Calculate the hidden layer gradients: for each hidden neuron 'n', take the SUM of (weight going out of this nth neuron * gradient of the destination neuron) as the result, then assign (result * tanh derivative of this nth output value) as the gradient.
    • Update all weights: starting from the hidden layer and working back to the input layer, for the nth neuron compute NewDeltaWeight = (NetLearningRate * nth output value * nth gradient + Momentum * OldDeltaWeight), then assign the new weight as (OldWeight + NewDeltaWeight).
  • Repeat the process (a condensed sketch of these steps follows right after this list).
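
For concreteness, here is a condensed NumPy sketch of the steps above, with the 2-D layers flattened to 1-D (15 inputs -> 9 hidden -> 10 outputs). The flattening, the symmetric weight initialisation, the variable names, and the learning-rate/momentum values are simplifications for the sketch, not my exact implementation:

    import numpy as np

    rng = np.random.default_rng(0)

    def tanh_deriv(output):
        # derivative of tanh, expressed in terms of the neuron's output value
        return 1.0 - output ** 2

    # Flattened topology: 15 inputs (5x3 image) -> 9 hidden -> 10 outputs.
    # A bias unit with constant output 1.0 is appended to the input and hidden layers.
    n_in, n_hid, n_out = 15, 9, 10
    w_hid = rng.uniform(-0.5, 0.5, size=(n_in + 1, n_hid))   # symmetric range, unlike my 0..1 weights
    w_out = rng.uniform(-0.5, 0.5, size=(n_hid + 1, n_out))
    dw_hid = np.zeros_like(w_hid)                             # previous delta-weights, for momentum
    dw_out = np.zeros_like(w_out)

    eta, momentum = 0.15, 0.5                                 # NetLearningRate and Momentum

    def train_step(inputs, targets):
        global w_hid, w_out, dw_hid, dw_out
        # feed forward
        x = np.append(inputs, 1.0)                  # input layer outputs + bias
        hidden = np.tanh(x @ w_hid)                 # hidden layer outputs
        h = np.append(hidden, 1.0)                  # + hidden bias
        outputs = np.tanh(h @ w_out)                # output layer outputs

        # output layer gradients: (target - output) * tanh'(output)
        out_grad = (targets - outputs) * tanh_deriv(outputs)

        # hidden layer gradients: SUM(outgoing weight * destination gradient) * tanh'(output)
        hid_grad = (w_out[:-1] @ out_grad) * tanh_deriv(hidden)

        # weight updates: delta = eta * source output * gradient + momentum * old delta
        dw_out = eta * np.outer(h, out_grad) + momentum * dw_out
        w_out += dw_out
        dw_hid = eta * np.outer(x, hid_grad) + momentum * dw_hid
        w_hid += dw_hid

        # RMS error over the output layer
        return np.sqrt(np.mean((targets - outputs) ** 2))

Calling train_step repeatedly with the 15-value input vector and the 10-value one-hot target should drive the returned RMS error down if the gradients and updates are wired correctly.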

Here is my attempt for the digit seven. The outputs shown are neuron #0 and neuron #6. Neuron #6 should be carrying 1 and neuron #0 should be carrying 0. In my results, all neurons other than #6 are carrying the same value (#0 is shown as a sample).

Sorry for the long post. If you know this topic, you probably know how much there is to cover in a single post. Thank you in advance.

Solution

Softmax with log-loss is typically used as the output-layer activation function for multiclass problems. Your problem is multiclass/multinomial: the 10 possible digits make up the 10 classes.

So you can try changing your output-layer activation function to softmax:

http://en.wikipedia.org/wiki/Softmax_function

Artificial neural networks

In neural network simulations, the softmax function is often implemented at the final layer of a network used for classification. Such networks are then trained under a log loss (or cross-entropy) regime, giving a non-linear variant of multinomial logistic regression.
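
As a rough sketch of what that change could look like (the names and helpers below are illustrative, not code from this answer), the output layer would produce softmax probabilities and be trained against a cross-entropy loss:

    import numpy as np

    def softmax(z):
        # subtract the max before exponentiating, for numerical stability
        e = np.exp(z - np.max(z))
        return e / np.sum(e)

    def cross_entropy(probs, targets):
        # log-loss against a one-hot target vector
        return -np.sum(targets * np.log(probs + 1e-12))

    def output_gradient(probs, targets):
        # in the question's convention (gradient = target - output), the
        # softmax/cross-entropy combination gives simply (targets - probs):
        # the tanh-derivative factor disappears at the output layer
        return targets - probs

Only the output-layer activation and its gradient change; the hidden-layer gradients and the weight-update step described in the question can stay as they are.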

Let us know what effect that has.
