How to serialize/deserialize pybrain networks?


Question


PyBrain is a Python library that provides (among other things) easy-to-use artificial neural networks.

I fail to properly serialize/deserialize PyBrain networks using either pickle or cPickle.

See the following example:

from pybrain.datasets            import SupervisedDataSet
from pybrain.tools.shortcuts     import buildNetwork
from pybrain.supervised.trainers import BackpropTrainer
import cPickle as pickle
import numpy as np 

#generate some data
np.random.seed(93939393)
data = SupervisedDataSet(2, 1)
for x in xrange(10):
    y = x * 3
    z = x + y + 0.2 * np.random.randn()  
    data.addSample((x, y), (z,))

#build a network and train it    

net1 = buildNetwork( data.indim, 2, data.outdim )
trainer1 = BackpropTrainer(net1, dataset=data, verbose=True)
for i in xrange(4):
    trainer1.trainEpochs(1)
    print '\tvalue after %d epochs: %.2f' % (i, net1.activate((1, 4))[0])

This is the output of the above code:

Total error: 201.501998476
    value after 0 epochs: 2.79
Total error: 152.487616382
    value after 1 epochs: 5.44
Total error: 120.48092561
    value after 2 epochs: 7.56
Total error: 97.9884043452
    value after 3 epochs: 8.41

As you can see, the network's total error decreases as training progresses. You can also see that the predicted value approaches the expected value of 12.

Now we will do a similar exercise, but this time we will include serialization/deserialization:

print 'creating net2'
net2 = buildNetwork(data.indim, 2, data.outdim)
trainer2 = BackpropTrainer(net2, dataset=data, verbose=True)
trainer2.trainEpochs(1)
print '\tvalue after %d epochs: %.2f' % (1, net2.activate((1, 4))[0])

#So far, so good. Let's test pickle
pickle.dump(net2, open('testNetwork.dump', 'w'))
net2 = pickle.load(open('testNetwork.dump'))
trainer2 = BackpropTrainer(net2, dataset=data, verbose=True)
print 'loaded net2 using pickle, continue training'
for i in xrange(1, 4):
    trainer2.trainEpochs(1)
    print '\tvalue after %d epochs: %.2f' % (i, net2.activate((1, 4))[0])

This is the output of this block:

creating net2
Total error: 176.339378639
    value after 1 epochs: 5.45
loaded net2 using pickle, continue training
Total error: 123.392181859
    value after 1 epochs: 5.45
Total error: 94.2867637623
    value after 2 epochs: 5.45
Total error: 78.076711114
    value after 3 epochs: 5.45

As you can see, training appears to have some effect on the network (the reported total error keeps decreasing), yet the network's output value stays frozen at the value it reached after the first training iteration.

Is there any caching mechanism that I need to be aware of that causes this erroneous behaviour? Are there better ways to serialize/deserialize pybrain networks?

Relevant version numbers:

  • Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)]
  • Numpy 1.5.1
  • cPickle 1.71
  • pybrain 0.3

P.S. I have created a bug report on the project's site and will keep both SO and the bug tracker updated.

Answer

Cause

The mechanism that causes this behavior is the way PyBrain modules handle parameters (.params) and derivatives (.derivs): all network parameters are stored in one flat array, while the individual Module and Connection objects have access to "their own" .params, which, however, are just views on slices of the total array. This allows both local and network-wide writes and read-outs on the same data structure.

Apparently, this slice-view link gets lost during pickling and unpickling.
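The same loss can be demonstrated with plain numpy, independent of PyBrain (a minimal sketch): a slice of an array is a view into it, but pickle serializes each array's data separately, so the view relationship does not survive the round trip.

import cPickle as pickle
import numpy as np

total = np.zeros(4)   #stands in for the network-wide parameter array
local = total[1:3]    #a module's "own" params: a view, not a copy

total[:] = 1.0
print local           #[ 1.  1.] -- network-wide writes show through the view

#round-trip both arrays through one pickle, as pickling a network would
total2, local2 = pickle.loads(pickle.dumps((total, local)))
total2[:] = 2.0
print local2          #still [ 1.  1.] -- local2 is now an independent copy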

Solution

Insert

net2.sorted = False
net2.sortModules()

after loading from the file (re-sorting the modules recreates this sharing), and it should work.
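Putting the workaround together, here is a minimal sketch of save/load helpers (the helper names are illustrative, not part of the PyBrain API):

import cPickle as pickle

def save_network(net, path):
    #use binary mode; text mode can corrupt the pickle stream on Windows
    with open(path, 'wb') as f:
        pickle.dump(net, f)

def load_network(path):
    with open(path, 'rb') as f:
        net = pickle.load(f)
    net.sorted = False   #mark the network unsorted so sortModules() runs again
    net.sortModules()    #re-sorting rebuilds the shared parameter array and views
    return net

Using load_network() in place of the bare pickle.load() call in the question, net2 should again update its activate() output as it trains.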
