Neural Network training with PyBrain won't converge


Problem description

I have the following code, from the PyBrain tutorial:

from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import TanhLayer

ds = SupervisedDataSet(2, 1)
ds.addSample((0,0), (0,))
ds.addSample((0,1), (1,))
ds.addSample((1,0), (1,))
ds.addSample((1,1), (0,))

net     = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)
trainer = BackpropTrainer(net, ds)

for inp, tar in ds:
    print([net.activate(inp), tar])

errors  = trainer.trainUntilConvergence()

for inp, tar in ds:
    print([net.activate(inp), tar])

However, the result is a neural network that is not trained well. Looking at the error output, the network does get trained properly at first, but then the 'continueEpochs' argument makes it train some more, and the network performs worse again. So the network is converging, but there is no way to get back the best-trained network. The documentation of PyBrain implies that the best-trained network is returned, yet it actually returns a tuple of errors.
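For reference, here is what that return value contains, continuing the snippet above (a minimal sketch; the variable names are mine, and per the PyBrain source the method returns the per-epoch training and validation error lists rather than a network):

# trainUntilConvergence() returns two lists of errors, not the best network.
train_errors, validation_errors = trainer.trainUntilConvergence()

print(train_errors[-1])        # error after the final training epoch
print(min(validation_errors))  # best validation error seen while training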

When setting continueEpochs to 0 I get an error (ValueError: max() arg is an empty sequence), so continueEpochs must be larger than 0.

Is PyBrain actually maintained? There seems to be a big difference between the documentation and the code.

Solution

After some more digging I found that the example in the PyBrain tutorial is completely out of place.

When we look at the method signature in the source code, we find:

def trainUntilConvergence(self, dataset=None, maxEpochs=None, verbose=None, continueEpochs=10, validationProportion=0.25):

This means that 25% of the training set is used for validation. Although that is a perfectly valid method when training a network on data, you are not going to do this when you have the complete range of possibilities at your disposal, namely the 4-row, 2-in-1-out XOR solution set. When you train an XOR set and remove one of the rows for validation, the immediate consequence is a very sparse training set in which one of the possible combinations is omitted, which automatically results in the corresponding weights not being trained.
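To see the split concretely, you can reproduce it yourself with SupervisedDataSet.splitWithProportion, which is the kind of split trainUntilConvergence performs internally (a sketch; the 0.75 mirrors the default validationProportion=0.25):

from pybrain.datasets import SupervisedDataSet

ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

# Hold out 25% for validation, as trainUntilConvergence does by default.
train_ds, val_ds = ds.splitWithProportion(0.75)

print(len(train_ds))  # 3 rows: one XOR combination is missing from training
print(len(val_ds))    # 1 row: the combination the network never sees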

Normally, when you hold out 25% of the data for validation, you do so on the assumption that those 25% cover 'most' of the solution space the network has already encountered, more or less. In this case that is not true: the held-out 25% covers a part of the solution space that is completely unknown to the network, precisely because you removed it for validation.

So the trainer was training the network correctly, but by omitting 25% of the XOR problem this results in a badly trained network.
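One way around this (not part of the original answer; a sketch with an arbitrary epoch budget) is to skip the hold-out split entirely and train on all four rows for a fixed number of epochs using BackpropTrainer.train(), which runs a single epoch over the full dataset:

from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import TanhLayer

ds = SupervisedDataSet(2, 1)
ds.addSample((0, 0), (0,))
ds.addSample((0, 1), (1,))
ds.addSample((1, 0), (1,))
ds.addSample((1, 1), (0,))

net = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)
trainer = BackpropTrainer(net, ds)

# Train on the complete dataset so no XOR combination is held out.
for epoch in range(1000):    # 1000 is an arbitrary budget, not a tuned value
    error = trainer.train()  # one epoch over all four rows; returns the error

for inp, tar in ds:
    print([net.activate(inp), tar])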

A different quickstart example on the PyBrain website would be very handy, because in this specific XOR case the given example is just plain wrong. You might wonder whether they tried the example themselves, because it just outputs random, badly trained networks.

