Resource exhausted when training a neural network - Keras
Question

I have a dataset of 65668 files.
I am using Keras for a CNN, and these are my layers:
from keras.layers import Conv1D, Dense, Embedding, Flatten, Input, MaxPooling1D
from keras.models import Model

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)

sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
model = Model(sequence_input, preds)
The first embedding layer is initialized with GloVe.6B.100d weights. Fitting the data:
# fitting the data
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=128)
MAX_SEQUENCE_LENGTH is 500.

I am training on an Nvidia GeForce 940MX GPU, and I get the following error as part of the stack trace:
Resource exhausted: OOM when allocating tensor with shape[15318793,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
I tried reducing the batch size to 16, and even 8, but I still get the same error. What could the issue be?
Answer
The problem lies in your Embedding layer. It needs to allocate a matrix of size 15318793 * 100 * 4 bytes = 5.7 GB, which is definitely larger than the GeForce 940MX's memory. There are a few ways you could overcome this issue:
1. Decrease the vocabulary/corpus size: take e.g. the 1M most frequent words instead of the full word set. This will drastically decrease the size of the embedding matrix.
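A minimal sketch of this idea (the names `word_counts` and `embeddings_index` are hypothetical helpers, not from the question): rank words by frequency, keep only the top `max_words`, and build the smaller matrix for those alone.

```python
import numpy as np

def build_truncated_matrix(word_counts, embeddings_index, max_words, dim):
    # Keep only the `max_words` most frequent words.
    top = sorted(word_counts, key=word_counts.get, reverse=True)[:max_words]
    word_index = {w: i + 1 for i, w in enumerate(top)}  # index 0 is padding
    # The matrix now has max_words + 1 rows instead of the full vocabulary.
    matrix = np.zeros((len(word_index) + 1, dim), dtype='float32')
    for word, i in word_index.items():
        vec = embeddings_index.get(word)  # pre-loaded GloVe vectors
        if vec is not None:
            matrix[i] = vec
    return word_index, matrix
```

With a 1M-word cutoff the matrix above would need roughly 1e6 * 100 * 4 bytes = 0.4 GB instead of 5.7 GB.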
2. Use a generator instead of Embedding: you can use a generator to transform your sequences into sequences of word vectors, without an Embedding layer at all.
3. Use a linear transformation of Embedding instead of retraining your embedding - as you mentioned that the flag trainable=False made your algorithm work, you can keep it set to False and add:
Dense(new_embedding_size, activation='linear')(embedding)
to train a new embedding based on the existing one.
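Why this works: looking up rows of a frozen embedding matrix and then applying a linear Dense is the same as looking up rows of the projected matrix, so the Dense kernel effectively trains a new, smaller embedding. A quick NumPy sketch of this equivalence (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim, new_dim = 1000, 100, 32
embedding_matrix = rng.normal(size=(vocab, dim)).astype('float32')  # frozen GloVe
W = rng.normal(size=(dim, new_dim)).astype('float32')               # Dense kernel

seq = rng.integers(0, vocab, size=500)            # one padded input sequence
frozen_then_dense = embedding_matrix[seq] @ W     # Embedding -> Dense path
new_embedding = (embedding_matrix @ W)[seq]       # equivalent smaller embedding
```

Only the dim * new_dim kernel is trained, rather than a vocab * dim matrix.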
4. Change device - if you have a huge amount of RAM, you can try the following strategy:
import tensorflow as tf

with tf.device('/cpu:0'):
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
In this design, the computations of the Embedding layer are carried out on the CPU using main RAM. The downside is that the transfer between RAM and GPU might be really slow.