Weight Initialisation


Problem Description

I plan to use the Nguyen-Widrow algorithm for a neural network with multiple hidden layers. While researching it, I found a lot of ambiguities that I would like to clarify.

The following is pseudocode for the Nguyen-Widrow algorithm:

      Initialize all weights of the hidden layers with random values
      For each hidden layer{
          beta = 0.7 * Math.pow(hiddenNeurons, 1.0 / number of inputs);
          For each synapse{
             For each weight{
              Adjust the weight by dividing it by the norm of the weights for the neuron and multiplying by the beta value
            }
          }
      }

I just want to clarify whether the value of hiddenNeurons is the size of the particular hidden layer, or the size of all the hidden layers within the network. I got mixed up by looking at various sources.

In other words, if I have a network (3-2-2-2-3) (index 0 is the input layer, index 4 is the output layer), would the value of hiddenNeurons be:

NumberOfNeuronsInLayer(1) + NumberOfNeuronsInLayer(2) + NumberOfNeuronsInLayer(3)

or simply

NumberOfNeuronsInLayer(i), where i is the current layer I am at

So, would the hiddenNeurons value be the size of the current hidden layer, and the input value the size of the previous hidden layer?
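
For concreteness, the two readings give different beta values for the first hidden layer of the 3-2-2-2-3 network above (that layer has 3 inputs, so the exponent is 1.0/3):

      beta (per-layer reading)      = 0.7 * 2^(1/3)       ≈ 0.88
      beta (whole-network reading)  = 0.7 * (2+2+2)^(1/3) ≈ 1.27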

Answer

Sounds to me like you want more precise code. Here are some actual code lines from a project I'm participating in. I hope you read C. It's a bit abstracted and simplified. There is a struct nn that holds the neural-net data. You probably have your own abstract data type.

Code lines from my project (somewhat simplified):

#include <math.h>      /* powf, sqrtf */
#include <stdlib.h>    /* rand, RAND_MAX */

float *w = nn->the_weight_array;
float factor = 0.7f * powf( (float) nn->n_hidden, 1.0f / nn->n_input );

/* Step 1: fill every weight with a random value in [-factor, factor].
   In this simplified view there are n_input * n_hidden weights. */
unsigned int i, n_weights = nn->n_input * nn->n_hidden;
for( i = 0; i < n_weights; i++ )
    w[i] = random_range( -factor, factor );

/* Step 2: Nguyen/Widrow: rescale each group of n_hidden weights
   so that its norm becomes factor. */
w = nn->the_weight_array;
for( i = nn->n_input; i; i-- ){
    _scale_nguyen_widrow( factor, w, nn->n_hidden );
    w += nn->n_hidden;
}

The functions being called:

/* Scale one weight vector of `size` entries so that its
   Euclidean norm becomes `factor` (the Nguyen/Widrow step). */
static void _scale_nguyen_widrow( float factor, float *vec, unsigned int size )
{
    unsigned int i;
    float magnitude = 0.0f;

    /* magnitude = Euclidean norm of the vector */
    for ( i = 0; i < size; i++ )
        magnitude += vec[i] * vec[i];

    magnitude = sqrtf( magnitude );

    /* rescale each component; afterwards |vec| == factor */
    for ( i = 0; i < size; i++ )
         vec[i] *= factor / magnitude;
}

/* Uniform random float in [min, max]. */
static inline float random_range( float min, float max )
{
    float range = fabsf( max - min );
    return ((float) rand() / (float) RAND_MAX) * range + min;
}
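
As a quick sanity check (a minimal sketch; the struct nn layout below is a hypothetical reconstruction from the fields the snippets above actually use), you can verify after initialization that each scaled weight vector ends up with norm equal to factor:

#include <stdio.h>    /* printf */
#include <math.h>     /* sqrtf */

/* Hypothetical minimal layout, matching only the fields used above. */
struct nn {
    unsigned int n_input;
    unsigned int n_hidden;
    float *the_weight_array;    /* n_input * n_hidden weights */
};

/* Print the norm of each of the n_input weight vectors; after the
   Nguyen/Widrow scaling, every norm should equal factor. */
static void check_magnitudes( const struct nn *nn, float factor )
{
    const float *w = nn->the_weight_array;
    unsigned int i, j;

    for ( i = 0; i < nn->n_input; i++, w += nn->n_hidden ) {
        float m = 0.0f;
        for ( j = 0; j < nn->n_hidden; j++ )
            m += w[j] * w[j];
        printf( "vector %u: |w| = %f (expected %f)\n",
                i, sqrtf( m ), factor );
    }
}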

Tip:
After you've implemented the Nguyen/Widrow weight initialization, you can actually add a little code to the forward calculation that dumps each activation to a file. Then you can check how well the set of neurons hits the activation function: find the mean and standard deviation. You can even plot it with a plotting tool, e.g. gnuplot. (You need a plotting tool like gnuplot anyway for plotting error rates etc.) I did that for my implementation. The plots came out nice, and the initial learning became much faster using Nguyen/Widrow for my project.
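
A minimal sketch of that dump (the file name and the helper below are assumptions of mine; the call goes wherever your forward pass computes each activation):

#include <stdio.h>

/* Append one activation value to a file, one value per line.
   Opening per call is slow but keeps the sketch self-contained;
   a long-lived FILE* would be better in real code. */
static void dump_activation( float a )
{
    FILE *f = fopen( "activations.dat", "a" );   /* hypothetical file name */
    if ( f ) {
        fprintf( f, "%f\n", a );
        fclose( f );
    }
}

In gnuplot, stats 'activations.dat' then prints the mean and standard deviation of the dumped values directly.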

PS: I'm not sure my implementation is correct according to Nguyen and Widrow's intentions. I don't even think I care, as long as it does improve the initial learning.

Good luck,
-Øystein
