Effects of randomizing the order of inputs to a neural network

Question

For my Advanced Algorithms and Data Structures class, my professor asked us to pick any topic that interested us. He also told us to research it and to try and implement a solution in it. I chose Neural Networks because it's something that I've wanted to learn for a long time.

I've been able to implement AND, OR, and XOR using a neural network whose neurons use a step function as the activator. After that I tried to implement a back-propagating neural network that learns to recognize the XOR operator (using a sigmoid function as the activator). I was able to get this to work 90% of the time by using a 3-3-1 network (one bias unit at the input and hidden layers, with weights initialized randomly). The rest of the time it seems to get stuck in what I think is a local minimum, but I am not sure (I've asked questions about this before and people have told me that there shouldn't be a local minimum).
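For reference, here is a minimal sketch (not the asker's actual code) of the kind of network described: sigmoid activations, a bias unit feeding the hidden and output layers, randomly initialized weights, and per-pattern updates. The exact layer sizes, initialization range, and learning rate are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# XOR training set: four input patterns and their targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Randomly initialized weights; the last row of each matrix is the bias.
W1 = rng.uniform(-1.0, 1.0, size=(3, 3))  # (2 inputs + bias) -> 3 hidden units
W2 = rng.uniform(-1.0, 1.0, size=(4, 1))  # (3 hidden + bias) -> 1 output unit

lr = 0.5  # learning rate (illustrative value)
for epoch in range(10000):
    for x, t in zip(X, y):  # fixed presentation order, one pass per epoch
        # Forward pass (append 1.0 to each layer's activations for the bias).
        a0 = np.append(x, 1.0)
        h = sigmoid(a0 @ W1)
        a1 = np.append(h, 1.0)
        o = sigmoid(a1 @ W2)

        # Backward pass; the derivative of sigmoid is s * (1 - s).
        delta_o = (o - t) * o * (1 - o)
        delta_h = (W2[:-1] @ delta_o) * h * (1 - h)

        # Per-pattern (stochastic) gradient-descent updates.
        W2 -= lr * np.outer(a1, delta_o)
        W1 -= lr * np.outer(a0, delta_h)
```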

The 90% of the time it was working, I was consistently presenting my inputs in this order: [0, 0], [0, 1], [1, 0], [1, 1], with the expected outputs set to [0, 1, 1, 0]. When I present the values in the same order consistently, the network eventually learns the pattern. It actually doesn't matter which order I present them in, as long as the order is exactly the same for every epoch.

I then implemented a randomization of the training set, so this time the order of inputs is sufficiently randomized. I've noticed now that my neural network gets stuck: the error is decreasing, but at a very small rate (one that gets smaller each epoch). After a while, the error starts oscillating around a value (so the error stops decreasing).
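For what it's worth, the usual way to randomize presentation order (a sketch reusing the arrays from the snippet above) is to shuffle an index array at the start of every epoch:

```python
indices = np.arange(len(X))
for epoch in range(10000):
    rng.shuffle(indices)  # a fresh random presentation order each epoch
    for i in indices:
        x, t = X[i], y[i]
        # ... same forward/backward pass and weight updates as above ...
```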

I'm new to this topic, and everything I know so far is self-taught (from tutorials, papers, etc.). Why does the order of presentation of inputs change the behavior of my network? Is it because the change in error is consistent from one input to the next (because the ordering is consistent), which makes it easy for the network to learn?

What can I do to fix this? I'm going over my backpropagation algorithm to make sure I've implemented it correctly; currently it is implemented with a learning rate and a momentum term. I'm considering looking at other enhancements, like an adaptive learning rate. However, the XOR network is often portrayed as a very simple network, so I'm thinking that I shouldn't need a sophisticated backpropagation algorithm.
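For reference, a gradient-descent-with-momentum update like the one described keeps a running velocity per weight matrix. A minimal sketch, where the function name and the `lr`/`mu` defaults are illustrative, not the asker's code:

```python
def momentum_step(W, v, grad, lr=0.5, mu=0.9):
    """One weight update with momentum: the velocity v accumulates a
    decaying sum of past gradients, which damps oscillation.
    Returns the updated (W, v) pair."""
    v = mu * v - lr * grad
    return W + v, v
```

In the training loop sketched earlier, one would keep `v_W1 = np.zeros_like(W1)` and replace the plain update with `W1, v_W1 = momentum_step(W1, v_W1, np.outer(a0, delta_h))`, and similarly for `W2`.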

Answer

The order in which you present the observations (input vectors) comprising your training set to the network matters in only one respect: a randomized arrangement of the observations with respect to the response variable is strongly preferred over an ordered arrangement.

For instance, suppose you have 150 observations comprising your training set, and for each one the response variable is one of three class labels (class I, II, or III), such that observations 1-50 are in class I, 51-100 in class II, and 101-150 in class III. What you do not want to do is present them to the network in that order. In other words, you do not want the network to see all 50 observations in class I, then all 50 in class II, then all 50 in class III.
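As a sketch of the fix (the array names and placeholder data are illustrative, not from the answer): apply one random permutation to the features and labels together, so the class blocks are broken up while each (input, response) pair stays intact.

```python
import numpy as np

rng = np.random.default_rng()

# 150 observations sorted by class: 1-50 class I, 51-100 class II, 101-150 class III.
labels = np.repeat([1, 2, 3], 50)
features = rng.normal(size=(150, 4))  # placeholder feature vectors

# One shared permutation shuffles both arrays consistently.
perm = rng.permutation(len(labels))
features, labels = features[perm], labels[perm]
```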

What happened during the training of your classifier? Well, initially you were presenting the four observations to your network with the response sequence [0, 1, 1, 0], i.e., not sorted by response value.

I wonder what the ordering of the input vectors was in those instances in which your network failed to converge. If it was [1, 1, 0, 0] or [0, 1, 1, 1], that would be consistent with the well-documented empirical rule I mentioned above.

On the other hand, I have to wonder whether this rule even applies in your case. The reason is that you have so few training instances that, even if the order is [1, 1, 0, 0], training over multiple epochs (which I am sure you must be doing) means this ordering looks more "randomized" than the example I mentioned above (i.e., [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0] is how the network would be presented with the training data over three epochs).

Some suggestions to diagnose the problem:

  1. As I mentioned above, look at the ordering of your input vectors in the non-convergence cases--are they sorted by response variable?

  2. In the non-convergence cases, look at your weight matrices (I assume you have two of them). Look for any values that are very large (e.g., 100x the others, or 100x the values they were initialized with). Large weights can cause overflow; a quick check along these lines is sketched below.
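A minimal sketch of such a check, assuming the two weight matrices are numpy arrays named `W1` and `W2` as in the earlier snippets:

```python
for name, W in (("W1", W1), ("W2", W2)):
    mag = np.abs(W)
    print(f"{name}: max |w| = {mag.max():.3f}, median |w| = {np.median(mag):.3f}")
    # Flag any weight far larger than the rest, e.g. 100x the median magnitude.
    if mag.max() > 100 * np.median(mag):
        print(f"  suspiciously large weights in {name}")
```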
