Understanding Keras LSTM (lstm_text_generation.py) - RAM memory issues


Problem description

I'm diving into LSTM RNNs with Keras on the Theano backend. While trying the LSTM example from the Keras repo (the whole code of lstm_text_generation.py is on GitHub), one thing isn't quite clear to me: the way it vectorizes the input data (the text characters):

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

# np is NumPy, imported at the top of the script as: import numpy as np
print('Vectorization...')
# One boolean per (sequence, timestep, character) triple - a one-hot
# encoding of every character of every sequence:
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Here, as you can see, they allocate arrays of zeros with NumPy and then set a '1' at the particular position in each array that encodes the corresponding input character.

The question is: why did they use that algorithm? Is it possible to optimize it somehow? Maybe the input data could be encoded in some other way, without these huge lists of lists? The problem is that this places severe limits on the input data: generating such vectors for more than 10 MB of text causes a Python MemoryError (dozens of GB of RAM would be needed to process it!).
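
For a rough sense of scale, here is a back-of-the-envelope estimate (a sketch with assumed numbers: an alphabet of ~60 distinct characters, one byte per NumPy boolean):

# rough memory estimate for the X array above (assumed 60-char alphabet)
text_len = 10 * 1024 ** 2                    # 10 MB of input text
maxlen, step, n_chars = 40, 3, 60
nb_sequences = (text_len - maxlen) // step   # ~3.5 million sequences
x_bytes = nb_sequences * maxlen * n_chars    # one byte per bool entry
print(x_bytes / 1024 ** 3)                   # ~7.8 GB for X alone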

Thanks, guys.

Recommended answer

There are at least two optimizations in Keras which you could use to decrease the amount of memory needed in this case:

  1. An Embedding layer, which makes it possible to accept a single integer instead of a full one-hot vector. Moreover, this layer can be pretrained before the final stage of network training, so you can inject some prior knowledge into your model (and even fine-tune it during network fitting). See the first sketch after this list.

  2. The fit_generator method, which makes it possible to train a network using a predefined generator that produces the (x, y) pairs needed for network fitting. You could, for example, save the whole dataset to disk and read it part by part through the generator interface. See the second sketch after this list.
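
A minimal sketch of the first point (variable names such as text, chars, char_indices, sentences, next_chars, maxlen and step are taken from the question; the layer sizes are my own assumptions): each character is stored as a single integer index instead of a one-hot row, which shrinks X by a factor of len(chars), and the Embedding layer expands the indices inside the network.

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation

# One integer index per character: shape (nb_sequences, maxlen)
# instead of (nb_sequences, maxlen, len(chars)).
X = np.zeros((len(sentences), maxlen), dtype=np.int32)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t] = char_indices[char]
    y[i, char_indices[next_chars[i]]] = 1

model = Sequential()
# The Embedding layer turns each integer index into a dense vector;
# output_dim=32 is an arbitrary assumed size.
model.add(Embedding(input_dim=len(chars), output_dim=32, input_length=maxlen))
model.add(LSTM(128))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')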
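
And a minimal sketch of the second point, keeping the original one-hot encoding and the model from the question's script: a generator vectorizes one small batch at a time, so only one batch of one-hot arrays lives in memory at any moment (batch_size is my own choice; the generator here re-scans the in-memory text, but the same interface lets you read from disk instead). The fit_generator argument names below follow the Keras 1.x API (samples_per_epoch, nb_epoch); in Keras 2 they became steps_per_epoch and epochs.

def batch_generator(text, maxlen, step, batch_size):
    # Yield (X, y) pairs vectorized on the fly.
    while True:  # a Keras generator must loop over the data forever
        X = np.zeros((batch_size, maxlen, len(chars)), dtype=np.bool)
        y = np.zeros((batch_size, len(chars)), dtype=np.bool)
        b = 0
        for i in range(0, len(text) - maxlen, step):
            for t, char in enumerate(text[i: i + maxlen]):
                X[b, t, char_indices[char]] = 1
            y[b, char_indices[text[i + maxlen]]] = 1
            b += 1
            if b == batch_size:
                yield X, y
                X.fill(0)
                y.fill(0)
                b = 0

batch_size = 128
# drop the trailing partial batch so epoch size matches the generator
nb_samples = ((len(text) - maxlen) // step // batch_size) * batch_size
model.fit_generator(batch_generator(text, maxlen, step, batch_size),
                    samples_per_epoch=nb_samples,
                    nb_epoch=1)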

Of course, both of these methods can be mixed. I think simplicity was the reason behind this kind of implementation in the example you provided.
