Neural Network training with PyBrain won't converge


Problem description


I have the following code, from the PyBrain tutorial:

from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import TanhLayer

ds = SupervisedDataSet(2, 1)
ds.addSample((0,0), (0,))
ds.addSample((0,1), (1,))
ds.addSample((1,0), (1,))
ds.addSample((1,1), (0,))

net     = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)
trainer = BackpropTrainer(net, ds)

for inp, tar in ds:
     print [net.activate(inp), tar]

errors  = trainer.trainUntilConvergence()

for inp, tar in ds:
     print [net.activate(inp), tar]

However, the result is a poorly trained neural network. Looking at the error output, the network does get trained properly at first, but then the 'continueEpochs' argument makes it train some more and the network's performance gets worse again. So the network is converging, but there is no way to get back the best trained network. The documentation of PyBrain implies that the best-trained network is returned, yet it actually returns a tuple of errors.

When setting continueEpochs to 0 I get an error (ValueError: max() arg is an empty sequence), so continueEpochs must be larger than 0.

Is PyBrain actually maintained? It seems there is a big difference between the documentation and the code.

Solution

After some more digging I found that the example in PyBrain's tutorial is completely out of place.

When we look at the method signature in the source code we find:

def trainUntilConvergence(self, dataset=None, maxEpochs=None, verbose=None, continueEpochs=10, validationProportion=0.25):

This means that 25% of the training set is used for validation. Although that is a perfectly valid method when training a network on data, you are not going to do this when you have the complete range of possibilities at your disposal, namely the 4-row XOR 2-in-1-out solution set. When you train an XOR set and remove one of the rows for validation, the immediate consequence is a very sparse training set in which one of the possible combinations is omitted, so the weights for that case are automatically never trained.
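To make this concrete, here is a small sketch in plain Python (not PyBrain's own splitting code) of what a default validationProportion=0.25 split does to the 4-row XOR dataset: exactly one full input pattern is always held out, so the network never sees it during training.

```python
import random

# The four XOR rows, as (input, target) pairs -- the complete solution space.
rows = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]

random.seed(1)                      # any shuffle gives the same row counts
random.shuffle(rows)

# Mimic a 75/25 train/validation split like validationProportion=0.25:
n_val = int(round(0.25 * len(rows)))
validation, training = rows[:n_val], rows[n_val:]

print(len(training), "rows left for training")     # 3
print("pattern never trained on:", validation[0][0])
```

Whichever row the shuffle picks, the training set is down to 3 of the 4 combinations, and the held-out corner of the input space is simply never presented to backprop.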

Normally, when you omit 25% of the data for validation, you assume that those 25% cover 'most' of the solution space the network has already more or less encountered. In this case that is not true: it covers 25% of the solution space that is completely unknown to the network, since you removed it for validation.

So the trainer was training the network correctly, but omitting 25% of the XOR problem results in a badly trained network.
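By contrast, training on the full truth table behaves as expected. The following is a minimal NumPy sketch, a stand-in for buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer) plus backprop rather than PyBrain's actual code, showing plain full-batch backpropagation over all four rows driving the error down:

```python
import numpy as np

rng = np.random.default_rng(0)

# The full XOR truth table -- nothing held out for validation.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# 2-3-1 network: tanh hidden layer, linear output, with biases
# (roughly what buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer) builds).
W1 = rng.normal(0.0, 1.0, (2, 3)); b1 = np.zeros(3)
W2 = rng.normal(0.0, 1.0, (3, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)            # hidden activations
    return h, h @ W2 + b2               # linear output layer

def mse():
    return float(np.mean((forward(X)[1] - y) ** 2))

loss_before = mse()
lr = 0.05
for _ in range(8000):                   # full-batch gradient descent
    h, out = forward(X)
    err = out - y                       # d(0.5 * SSE) / d(out)
    gW2 = h.T @ err;  gb2 = err.sum(axis=0)
    dh = (err @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
    gW1 = X.T @ dh;   gb1 = dh.sum(axis=0)
    W2 -= lr * gW2;   b2 -= lr * gb2
    W1 -= lr * gW1;   b1 -= lr * gb1

loss_after = mse()
print(loss_before, "->", loss_after)
```

With every combination present, the error typically falls toward zero; drop any one row from training, and the network's output on that input pattern stays wherever random initialisation left it, which is exactly the symptom in the question.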

A different quickstart example on the PyBrain website would be very handy, because this one is just plain wrong for the specific XOR case. You might wonder whether they tried the example themselves, because it just outputs randomly badly trained networks.
