Why should weights of Neural Networks be initialized to random numbers?


Problem Description

I am trying to build a neural network from scratch. Across all AI literature there is a consensus that weights should be initialized to random numbers in order for the network to converge faster.

But why are the initial weights of neural networks initialized to random numbers?

I had read somewhere that this is done to "break the symmetry" and this makes the neural network learn faster. How does breaking the symmetry make it learn faster?

Wouldn't initializing the weights to 0 be a better idea? That way the weights would be able to find their values (whether positive or negative) faster?

Is there some other underlying philosophy behind randomizing the weights apart from hoping that they would be near their optimum values when initialized?

Recommended Answer

Breaking symmetry is essential here, and not for reasons of performance. Imagine the first 2 layers of a multilayer perceptron (input and hidden layers):

During forward propagation, each unit j in the hidden layer gets the signal

    signal_j = sum_i(w_ij * x_i)

That is, each hidden unit gets the sum of the inputs, each multiplied by its corresponding weight.
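As a minimal sketch of this step (the function and variable names are my own, not from the answer), the signal for all hidden units can be computed with one matrix product:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_signals(x, W):
    """Signal reaching hidden unit j: sum_i(w_ij * x_i)."""
    return x @ W

x = np.array([0.5, -1.0, 2.0])        # example input vector
W = np.random.randn(3, 4) * 0.01      # 3 inputs -> 4 hidden units
print(sigmoid(hidden_signals(x, W)))  # hidden-layer outputs
```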

Now imagine that you initialize all weights to the same value (e.g. zero or one). In this case, each hidden unit will get exactly the same signal. E.g. if all weights are initialized to 1, each unit gets a signal equal to the sum of the inputs (and outputs sigmoid(sum(inputs))). If all weights are zeros, which is even worse, every hidden unit will get a zero signal. No matter what the input was, if all weights are the same, all units in the hidden layer will be the same too. And because the units compute identical outputs, they also receive identical gradients during backpropagation, so every update leaves them equal: the network can never learn distinct features in that layer.
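This is easy to verify numerically. Below is a small sketch (a hypothetical one-hidden-layer network with a squared-error loss; all names are my own): with constant initialization every hidden unit receives exactly the same gradient, so gradient descent keeps the units identical forever, whereas random initialization makes the gradients differ from the very first step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_weight_grads(x, y, W1, w2):
    """Gradient of the squared error w.r.t. each hidden unit's
    incoming weights (one column of W1 per hidden unit)."""
    h = sigmoid(x @ W1)              # hidden activations
    y_hat = h @ w2                   # linear output unit
    delta = y_hat - y                # output error
    dh = delta * w2 * h * (1.0 - h)  # error reaching each hidden unit
    return np.outer(x, dh)           # one gradient column per unit

x, y = np.array([0.5, -1.0, 2.0]), 1.0
w2 = np.ones(4)

# Constant init: all gradient columns are equal, so the units stay clones.
g = hidden_weight_grads(x, y, np.ones((3, 4)), w2)
print(np.allclose(g, g[:, [0]]))     # True

# Random init breaks the tie: the columns differ, so the units diverge.
rng = np.random.default_rng(0)
g = hidden_weight_grads(x, y, rng.normal(scale=0.1, size=(3, 4)), w2)
print(np.allclose(g, g[:, [0]]))     # False
```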

This is the main issue with symmetry, and the reason why you should initialize weights randomly (or, at least, with different values). Note that this issue affects all architectures that use each-to-each (fully connected) connections.
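In practice, "randomly" usually means small random values. One widely used scheme (not prescribed by this answer, just a common choice) is Glorot/Xavier initialization, which scales the random range by the layer's fan-in and fan-out:

```python
import numpy as np

def xavier_init(n_in, n_out, rng):
    """Glorot/Xavier uniform initialization."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

rng = np.random.default_rng(0)
W1 = xavier_init(3, 4, rng)  # input -> hidden
W2 = xavier_init(4, 1, rng)  # hidden -> output
```

Any scheme that gives the units distinct starting weights breaks the symmetry; the scaling additionally keeps signals and gradients from shrinking or exploding as the network gets deeper.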
