Implementing a perceptron with backpropagation algorithm

Question

I am trying to implement a two-layer perceptron with backpropagation to solve the parity problem. The network has 4 binary inputs, 4 hidden units in the first layer and 1 output in the second layer. I am using this for reference, but am having problems with convergence.

First, I will note that I am using a sigmoid function for activation, so the derivative is (from what I understand) sigmoid(v) * (1 - sigmoid(v)). That is what is used when calculating the delta values.
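
For reference, this is how those two pieces look in Java (a minimal sketch; the helper names are illustrative and not from the posted code):

    // Sigmoid activation (illustrative helper, not from the posted code)
    static double sigmoid(double v) {
        return 1.0 / (1.0 + Math.exp(-v));
    }

    // Derivative written in terms of the stored activation y = sigmoid(v),
    // which is why the delta formulas below use value * (1 - value)
    static double sigmoidDerivative(double y) {
        return y * (1.0 - y);
    }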

So, basically I set up the network and run for just a few epochs (go through each possible pattern -- in this case, 16 patterns of input). After the first epoch, the weights are changed slightly. After the second, the weights do not change and remain so no matter how many more epochs I run. I am using a learning rate of 0.1 and a bias of +1 for now.

The process of training the network is below in pseudocode (which I believe to be correct according to sources I've checked):

Forward pass:

v = SUM[weight connecting input to hidden * input value] + bias  
y = Sigmoid(v)  
set hidden.values to y  
v = SUM[weight connecting hidden to output * hidden value] + bias  
y = Sigmoid(v)  
set output value to y
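
For concreteness, a sketch of that forward pass in Java (array names such as inputWeights, hiddenWeights, and inputs are assumptions for illustration, not identifiers from the posted code):

    // Forward pass: inputs -> hidden layer -> single output unit
    double[] hiddenValues = new double[numHidden];
    for (int j = 0; j < numHidden; j++) {
        double v = bias;                          // bias input of +1
        for (int i = 0; i < numInputs; i++) {
            v += inputWeights[i][j] * inputs[i];  // weight connecting input i to hidden j
        }
        hiddenValues[j] = sigmoid(v);
    }

    double v = bias;
    for (int j = 0; j < numHidden; j++) {
        v += hiddenWeights[j] * hiddenValues[j];  // weight connecting hidden j to output
    }
    double outputValue = sigmoid(v);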

Backpropagation for the output layer:

error = desired - output.value  
outputDelta = error * output.value * (1 - output.value)

Backpropagation for the hidden layer:

for each hidden neuron h[i]:  
error = outputDelta * weight connecting h[i] to output  
hiddenDelta[i] = error * h[i].value * (1 - h[i].value)
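
Both delta computations together in Java, reusing the illustrative names from the forward-pass sketch above:

    // Output delta: error times the sigmoid derivative at the output
    double error = desired - outputValue;
    double outputDelta = error * outputValue * (1 - outputValue);

    // Hidden deltas: push the output delta back through each
    // hidden-to-output weight, then apply the sigmoid derivative
    double[] hiddenDelta = new double[numHidden];
    for (int j = 0; j < numHidden; j++) {
        double hiddenError = outputDelta * hiddenWeights[j];
        hiddenDelta[j] = hiddenError * hiddenValues[j] * (1 - hiddenValues[j]);
    }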

Weight updates:

for each hidden neuron h connected to the output layer  
h.weight connecting h to output = learningRate * outputDelta * h.value

for each input neuron x connected to the hidden layer  
x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value

This process is of course looped through the epochs and the weight changes persist. So, my question is, are there any reasons that the weights remain constant after the second epoch? If necessary I can post my code, but at the moment I am hoping for something obvious that I'm overlooking. Thanks all!

Here are the links to my code as suggested by sarnold:
MLP.java: http://codetidy.com/1903
Neuron.java: http://codetidy.com/1904
Pattern.java: http://codetidy.com/1905
input.txt: http://codetidy.com/1906

Answer

I think I spotted the problem; funny enough, what I found is visible in your high-level description, but I only found what looked odd in the code. First, the description:

for each hidden neuron h connected to the output layer
h.weight connecting h to output = learningRate * outputDelta * h.value

for each input neuron x connected to the hidden layer
x.weight connecting x to h[i] = learningRate * hiddenDelta[i] * x.value

I believe the h.weight should be updated with respect to the previous weight. Your update mechanism sets it based only on the learning rate, the output delta, and the value of the node. Similarly, the x.weight is also being set based on the learning rate, the hidden delta, and the value of the node:

    /*** Weight updates ***/

    // update weights connecting hidden neurons to output layer
    for (i = 0; i < output.size(); i++) {
        for (Neuron h : output.get(i).left) {
            h.weights[i] = learningRate * outputDelta[i] * h.value;
        }
    }

    // update weights connecting input neurons to hidden layer
    for (i = 0; i < hidden.size(); i++) {
        for (Neuron x : hidden.get(i).left) {
            x.weights[i] = learningRate * hiddenDelta[i] * x.value;
        }
    }

I do not know what the correct solution is; but I have two suggestions:

  1. Replace these lines:

        h.weights[i] = learningRate * outputDelta[i] * h.value;
        x.weights[i] = learningRate * hiddenDelta[i] * x.value;

with these lines:

        h.weights[i] += learningRate * outputDelta[i] * h.value;
        x.weights[i] += learningRate * hiddenDelta[i] * x.value;

(+= instead of =.)
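
Applied to the weight-update block quoted above, the first suggestion would read as follows (assuming the rest of MLP.java is unchanged). With +=, each step accumulates onto the previous weight, which is the usual gradient-descent update w = w + learningRate * delta * input; with plain =, every weight is overwritten by a value that depends only on the current delta and activation, which would explain the weights freezing after the second epoch:

    /*** Weight updates (accumulating instead of overwriting) ***/

    // update weights connecting hidden neurons to output layer
    for (i = 0; i < output.size(); i++) {
        for (Neuron h : output.get(i).left) {
            h.weights[i] += learningRate * outputDelta[i] * h.value;
        }
    }

    // update weights connecting input neurons to hidden layer
    for (i = 0; i < hidden.size(); i++) {
        for (Neuron x : hidden.get(i).left) {
            x.weights[i] += learningRate * hiddenDelta[i] * x.value;
        }
    }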

  2. Replace these lines:

        h.weights[i] = learningRate * outputDelta[i] * h.value;
        x.weights[i] = learningRate * hiddenDelta[i] * x.value;

with these lines:

        h.weights[i] *= learningRate * outputDelta[i];
        x.weights[i] *= learningRate * hiddenDelta[i];

(Ignore the value and simply scale the existing weight. The learning rate should be 1.05 instead of .05 for this change.)
