Resilient backpropagation neural network - question about gradient


Question

First I want to say that I'm really new to neural networks and I don't understand them very well ;)

I've made my first C# implementation of the backpropagation neural network. I've tested it on XOR and it looks like it works.

Now I would like to change my implementation to use resilient backpropagation (Rprop - http://en.wikipedia.org/wiki/Rprop).

The definition says: "Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each weight."

Could somebody tell me what the partial derivative over all patterns is? And how should I compute this partial derivative for a neuron in the hidden layer?

Thanks a lot

Update:

My implementation is based on this Java code: www_.dia.fi.upm.es/~jamartin/downloads/bpnn.java

My backPropagate method looks like this:

public double backPropagate(double[] targets)
    {
        double error, change;

        // calculate error terms for output
        double[] output_deltas = new double[outputsNumber];

        for (int k = 0; k < outputsNumber; k++)
        {

            error = targets[k] - activationsOutputs[k];
            output_deltas[k] = Dsigmoid(activationsOutputs[k]) * error;
        }

        // calculate error terms for hidden
        double[] hidden_deltas = new double[hiddenNumber];

        for (int j = 0; j < hiddenNumber; j++)
        {
            error = 0.0;

            for (int k = 0; k < outputsNumber; k++)
            {
                error = error + output_deltas[k] * weightsOutputs[j, k];
            }

            hidden_deltas[j] = Dsigmoid(activationsHidden[j]) * error;
        }

        //update output weights
        for (int j = 0; j < hiddenNumber; j++)
        {
            for (int k = 0; k < outputsNumber; k++)
            {
                change = output_deltas[k] * activationsHidden[j];
                weightsOutputs[j, k] = weightsOutputs[j, k] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumOutpus[j, k];
                lastChangeWeightsForMomentumOutpus[j, k] = change;

            }
        }

        // update input weights
        for (int i = 0; i < inputsNumber; i++)
        {
            for (int j = 0; j < hiddenNumber; j++)
            {
                change = hidden_deltas[j] * activationsInputs[i];
                weightsInputs[i, j] = weightsInputs[i, j] + learningRate * change + momentumFactor * lastChangeWeightsForMomentumInputs[i, j];
                lastChangeWeightsForMomentumInputs[i, j] = change;
            }
        }

        // calculate error
        error = 0.0;

        for (int k = 0; k < outputsNumber; k++)
        {
            error = error + 0.5 * (targets[k] - activationsOutputs[k]) * (targets[k] - activationsOutputs[k]);
        }

        return error;
    }

So can I use the change = hidden_deltas[j] * activationsInputs[i] variable as the gradient (partial derivative) for checking the sign?

Answer

I think "over all patterns" simply means "in every iteration"... take a look at the RPROP paper. (Rprop is a batch method: the per-pattern gradient contributions are summed over the whole training set, and only the sign of that sum is used.)

For the partial derivative: you've already implemented the normal backpropagation algorithm, which is a method for efficiently calculating the gradient. There you compute the δ values for the individual neurons; multiplied by the incoming activations, these give the negative ∂E/∂w values, i.e. the partial derivatives of the global error with respect to the weights.
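
Written out with the names from the code above (the signs follow from the error E = 0.5 · Σ (target − output)² computed at the end of backPropagate):

    ∂E/∂w[j,k] = −output_deltas[k] · activationsHidden[j]    (hidden → output weights)
    ∂E/∂w[i,j] = −hidden_deltas[j] · activationsInputs[i]    (input → hidden weights)

So the change variable computed in backPropagate is exactly the negative partial derivative for that weight, for a single pattern; summing it over all patterns gives the gradient whose sign Rprop inspects.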

So instead of scaling these gradient values by a learning rate, you keep a separate step size per weight and multiply it by one of two constants (η+ or η−), depending on whether the sign of the gradient has changed since the last epoch; the weight is then moved by that step size, in the direction given by the sign alone.
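
Below is a minimal sketch of that per-weight update, in the variant that simply zeroes the stored gradient when the sign flips (often called iRprop−); Rprop+ from the paper additionally reverts the previous weight step. It assumes gradients holds the change values (= −∂E/∂w) summed over all patterns for one weight matrix, and that stepSizes was initialised to some Δ0 such as 0.1. All names here are hypothetical, not part of the asker's class:

public void RpropUpdate(double[,] weights, double[,] gradients, double[,] previousGradients, double[,] stepSizes)
    {
        // standard Rprop constants: step growth/shrink factors and bounds
        const double EtaPlus = 1.2, EtaMinus = 0.5;
        const double StepMax = 50.0, StepMin = 1e-6;

        for (int i = 0; i < weights.GetLength(0); i++)
        {
            for (int j = 0; j < weights.GetLength(1); j++)
            {
                double signChange = gradients[i, j] * previousGradients[i, j];

                if (signChange > 0)
                {
                    // same direction as in the previous epoch: accelerate
                    stepSizes[i, j] = Math.Min(stepSizes[i, j] * EtaPlus, StepMax);
                }
                else if (signChange < 0)
                {
                    // the sign flipped, so a minimum was overshot: slow down
                    stepSizes[i, j] = Math.Max(stepSizes[i, j] * EtaMinus, StepMin);
                    gradients[i, j] = 0.0;  // skip this weight's update this epoch
                }

                // only the sign of the gradient is used, never its magnitude;
                // gradients holds -dE/dw (delta * activation), hence the +=
                weights[i, j] += Math.Sign(gradients[i, j]) * stepSizes[i, j];
                previousGradients[i, j] = gradients[i, j];
            }
        }
    }

This would replace the per-pattern weight updates currently done inside backPropagate: accumulate the change values over the whole training set, call RpropUpdate once per epoch for each weight matrix, and drop the learning rate and momentum terms entirely.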
