Understanding Keras LSTM (lstm_text_generation.py) - RAM memory issues


Problem description

I'm diving into LSTM RNNs with Keras on the Theano backend. While trying the LSTM example from the Keras repo (the whole code of lstm_text_generation.py is on GitHub), one thing isn't quite clear to me: the way it vectorizes the input data (the text characters):

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

# np is NumPy, imported at the top of the script as: import numpy as np
print('Vectorization...')
# One boolean per (sequence, timestep, character) triple - a one-hot
# encoding of every character of every sequence:
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1

Here, as you can see, they allocate arrays of zeros with NumPy and then set a '1' at the particular position in each array that encodes the corresponding input character.

The question is: why did they use that algorithm? Is it possible to optimize it somehow? Maybe the input data could be encoded in some other way, without these huge lists of lists? The problem is that this places severe limits on the input data: generating such vectors for more than 10 MB of text causes a Python MemoryError (dozens of GB of RAM would be needed to process it!).
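
For a rough sense of scale, here is a back-of-the-envelope estimate (a sketch with assumed numbers: an alphabet of ~60 distinct characters, one byte per NumPy boolean):

# rough memory estimate for the X array above (assumed 60-char alphabet)
text_len = 10 * 1024 ** 2                    # 10 MB of input text
maxlen, step, n_chars = 40, 3, 60
nb_sequences = (text_len - maxlen) // step   # ~3.5 million sequences
x_bytes = nb_sequences * maxlen * n_chars    # one byte per bool entry
print(x_bytes / 1024 ** 3)                   # ~7.8 GB for X alone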

Thanks, guys.

Recommended answer

There are at least two optimizations in Keras which you could use to decrease the amount of memory needed in this case:

  1. An Embedding layer, which makes it possible to accept a single integer instead of a full one-hot vector. Moreover, this layer can be pretrained before the final stage of network training, so you can inject some prior knowledge into your model (and even fine-tune it during network fitting). See the first sketch after this list.

  2. The fit_generator method, which makes it possible to train a network using a predefined generator that produces the (x, y) pairs needed for network fitting. You could, for example, save the whole dataset to disk and read it part by part through the generator interface. See the second sketch after this list.
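
A minimal sketch of the first point (variable names such as text, chars, char_indices, sentences, next_chars, maxlen and step are taken from the question; the layer sizes are my own assumptions): each character is stored as a single integer index instead of a one-hot row, which shrinks X by a factor of len(chars), and the Embedding layer expands the indices inside the network.

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, Activation

# One integer index per character: shape (nb_sequences, maxlen)
# instead of (nb_sequences, maxlen, len(chars)).
X = np.zeros((len(sentences), maxlen), dtype=np.int32)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t] = char_indices[char]
    y[i, char_indices[next_chars[i]]] = 1

model = Sequential()
# The Embedding layer turns each integer index into a dense vector;
# output_dim=32 is an arbitrary assumed size.
model.add(Embedding(input_dim=len(chars), output_dim=32, input_length=maxlen))
model.add(LSTM(128))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')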
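
And a minimal sketch of the second point, keeping the original one-hot encoding and the model from the question's script: a generator vectorizes one small batch at a time, so only one batch of one-hot arrays lives in memory at any moment (batch_size is my own choice; the generator here re-scans the in-memory text, but the same interface lets you read from disk instead). The fit_generator argument names below follow the Keras 1.x API (samples_per_epoch, nb_epoch); in Keras 2 they became steps_per_epoch and epochs.

def batch_generator(text, maxlen, step, batch_size):
    # Yield (X, y) pairs vectorized on the fly.
    while True:  # a Keras generator must loop over the data forever
        X = np.zeros((batch_size, maxlen, len(chars)), dtype=np.bool)
        y = np.zeros((batch_size, len(chars)), dtype=np.bool)
        b = 0
        for i in range(0, len(text) - maxlen, step):
            for t, char in enumerate(text[i: i + maxlen]):
                X[b, t, char_indices[char]] = 1
            y[b, char_indices[text[i + maxlen]]] = 1
            b += 1
            if b == batch_size:
                yield X, y
                X.fill(0)
                y.fill(0)
                b = 0

batch_size = 128
# drop the trailing partial batch so epoch size matches the generator
nb_samples = ((len(text) - maxlen) // step // batch_size) * batch_size
model.fit_generator(batch_generator(text, maxlen, step, batch_size),
                    samples_per_epoch=nb_samples,
                    nb_epoch=1)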

Of course, both of these methods can be mixed. I think simplicity was the reason behind this kind of implementation in the example you provided.
