What does it mean to "break symmetry" in the context of neural network programming?


Question


I have heard a lot about "breaking the symmetry" within the context of neural network programming and initialization. Can somebody please explain what this means? As far as I can tell, it has something to do with neurons performing similarly during forward and backward propagation if the weight matrix is filled with identical values during initialization. Asymmetrical behavior would be more clearly replicated with random initialization, i.e., not using identical values throughout the matrix.

Answer

Your understanding is correct.


When all initial values are identical, for example when every weight is initialized to 0, then during backpropagation all weights receive the same gradient and hence the same update. This is what is referred to as symmetry.
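A minimal NumPy sketch of this effect (a toy one-hidden-layer network with a tanh activation and mean-squared-error loss; the shapes and names are illustrative, not from the original post):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # 4 samples, 3 input features
y = rng.normal(size=(4, 1))        # regression targets

# Symmetric initialization: every weight has the same value.
W1 = np.full((3, 5), 0.5)          # input -> 5 hidden units
W2 = np.full((5, 1), 0.5)          # hidden -> output

# Forward pass with a tanh hidden layer.
h = np.tanh(x @ W1)
pred = h @ W2
err = pred - y

# Backward pass (gradients of 0.5 * mean squared error).
dW2 = h.T @ err / len(x)
dh = err @ W2.T * (1 - h ** 2)
dW1 = x.T @ dh / len(x)

# Every column of dW1 (one column per hidden unit) is identical,
# so every hidden unit receives exactly the same update.
print(np.allclose(dW1, dW1[:, :1]))  # True: all columns are equal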


Intuitively, that means all nodes will learn the same thing, and we don't want that, because we want the network to learn different kinds of features. This is achieved by random initialization, since the gradients will then differ and each node will grow to be more distinct from the other nodes, enabling diverse feature extraction. This is what is referred to as breaking the symmetry.
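Continuing the same toy sketch above, swapping the constant weights for small random values (one common way to break symmetry) makes the per-unit gradients differ:

# Random initialization: small random values instead of constants.
W1 = rng.normal(scale=0.1, size=(3, 5))
W2 = rng.normal(scale=0.1, size=(5, 1))

h = np.tanh(x @ W1)
err = h @ W2 - y
dh = err @ W2.T * (1 - h ** 2)
dW1 = x.T @ dh / len(x)

# The columns of dW1 now differ, so each hidden unit gets its own update
# and can learn a different feature.
print(np.allclose(dW1, dW1[:, :1]))  # False: symmetry is broken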

