Resource Exhausted when training a neural network - keras
Question
I have a dataset of 65668 files.
I am using Keras for a CNN, and these are my layers:
embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)

sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
The first embedding layer is trained on GloVe.6B.100d. Fitting the data:
# fitting the data
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=128)
MAX_SEQUENCE_LENGTH is 500. I am training on the GPU, an Nvidia GeForce 940MX, and I get the following error as part of the stack trace:
Resource exhausted: OOM when allocating tensor with shape[15318793,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
I tried reducing the batch size to 16, and even to 8, but I still get the same error. What could the issue be?
Answer
The problem lies in your Embedding layer. It needs to allocate a matrix of size 15318793 * 100 * 4 bytes = 5.7 GB, which is definitely greater than the GeForce 940MX's memory. There are a few ways you could overcome this issue:
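For reference, the 5.7 GB figure follows directly from the tensor shape in the error message, since a float32 takes 4 bytes. A quick sanity check (the 2-4 GB VRAM range is typical for a 940MX):

# shape [15318793, 100] comes from the OOM message; float32 = 4 bytes per value
rows, cols, bytes_per_float = 15318793, 100, 4
size_gib = rows * cols * bytes_per_float / 2**30
print(round(size_gib, 2))  # ~5.71 GiB -- well beyond the 2-4 GB of VRAM on a 940MX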
Decrease the vocabulary/corpus size: try taking, e.g., the 1M most frequent words instead of the full word set. This will drastically decrease the size of the embedding matrix.
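A minimal sketch of that idea, assuming your corpus is a list of strings called texts and that word_index/embedding_matrix are rebuilt afterwards as in the question (MAX_NB_WORDS is a hypothetical cap you choose):

import numpy as np
from keras.preprocessing.text import Tokenizer

MAX_NB_WORDS = 1000000  # hypothetical cap: keep only the 1M most frequent words

tokenizer = Tokenizer(num_words=MAX_NB_WORDS)  # rarer words are dropped when encoding
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

# the embedding matrix now only needs rows for the words that were kept
num_words = min(MAX_NB_WORDS, len(tokenizer.word_index)) + 1
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))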
Use a generator instead of Embedding: instead of an Embedding layer, you can use a generator that transforms your sequences into word-vector sequences on the fly.
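A minimal sketch of the generator approach, with hypothetical names: it assumes embedding_matrix is the GloVe table from the question, and that the model's first layer is changed to Input(shape=(MAX_SEQUENCE_LENGTH, EMBEDDING_DIM)) since the lookup now happens outside the model. Note that the word vectors are then fixed; the generator only looks them up, it cannot train them.

import numpy as np

def embedded_batches(x, y, batch_size=128):
    # loop forever, embedding one batch of index sequences at a time on the CPU,
    # so the full lookup table never has to fit in GPU memory
    while True:
        for i in range(0, len(x), batch_size):
            idx = x[i:i + batch_size]  # (batch, MAX_SEQUENCE_LENGTH) word indices
            yield embedding_matrix[idx], y[i:i + batch_size]

model.fit_generator(embedded_batches(x_train, y_train),
                    steps_per_epoch=int(np.ceil(len(x_train) / 128)),
                    epochs=20,
                    validation_data=(embedding_matrix[x_val], y_val))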
Use a linear transformation of the Embedding instead of retraining your embedding: since you mentioned that the flag trainable=False made your algorithm work, you can keep it set to False and add:
Dense(new_embedding_size, activation='linear')(embedding)
to train a new embedding based on the existing one.
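Put together, a minimal sketch of this variant (new_embedding_size is a hypothetical dimension you pick; everything else reuses names from the question):

new_embedding_size = 50  # hypothetical reduced dimensionality

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)  # frozen GloVe vectors

sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedding = embedding_layer(sequence_input)
# trainable linear projection on top of the frozen vectors
x = Dense(new_embedding_size, activation='linear')(embedding)
x = Conv1D(128, 5, activation='relu')(x)
# ... rest of the model as in the question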
Change the device: if you have plenty of RAM, you can try the following strategy:
with tf.device('/cpu:0'):
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)

    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
In this design, the computations of the Embedding layer will be carried out on the CPU, using RAM. The downside is that the transfer between RAM and GPU memory might be really slow.