Resource Exhausted when training a neural network - keras


Problem description

I have a dataset of 65668 files.

I am using Keras for a CNN, and these are my layers:

from keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
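
The snippet stops before the model object used by fit below is created; a minimal completion, assuming one-hot labels and categorical cross-entropy (the optimizer choice is an assumption, not stated in the question):

from keras.models import Model

# Hypothetical completion: wrap the graph in a Model and compile it.
model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])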

The first embedding layer is trained on GloVe.6B.100d. Fitting the data:

# fitting the data
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=128)

MAX_SEQUENCE_LENGTH is 500. I am training on the GPU, an Nvidia GeForce 940MX, and I get the following error as part of the stack trace:

Resource exhausted: OOM when allocating tensor with shape[15318793,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

I tried reducing the batch size to 16, even 8, and I still get the same error. What could the issue be?

Recommended answer

The problem lies in your Embedding layer. It needs to allocate a matrix of size 15318793 * 100 * 4 bytes = 5.7 GB, which is definitely more than the GeForce 940MX's memory. The first dimension comes straight from your code: it is len(word_index) + 1, i.e. a vocabulary of over 15M words. There are a few ways to overcome this issue:
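
A quick back-of-the-envelope check of that figure (assuming float32 weights, 4 bytes each):

vocab_size = 15318793        # len(word_index) + 1, from the error message
EMBEDDING_DIM = 100
bytes_needed = vocab_size * EMBEDDING_DIM * 4   # 4 bytes per float32 weight
print(bytes_needed / 2**30)                     # ~5.7 GiB, far beyond a 940MX's 2-4 GB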

  1. Decrease the vocabulary/corpus size: try taking e.g. the 1M most frequent words instead of the full word set. This will drastically decrease the embedding matrix size.
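
For instance, if the texts were tokenized with Keras' Tokenizer, its num_words cap does exactly this (a minimal sketch; texts is assumed to hold your raw documents):

from keras.preprocessing.text import Tokenizer

MAX_NUM_WORDS = 1000000  # keep only the 1M most frequent words

tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)  # ids beyond the cap are dropped

# The embedding matrix then needs at most MAX_NUM_WORDS rows instead of 15M:
num_words = min(len(tokenizer.word_index) + 1, MAX_NUM_WORDS)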

  2. Use a generator instead of Embedding: rather than using an Embedding layer, you can use a generator that transforms your sequences of word ids into sequences of word vectors.
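
A rough sketch of that idea, assuming embedding_matrix is the precomputed GloVe matrix indexed by word id (the generator and the changed Input are illustrative, not code from the answer):

import numpy as np

def vector_batches(sequences, labels, batch_size=128):
    # Yield (word-vector batch, label batch) pairs forever; the lookup
    # happens in NumPy on the CPU, so the 5.7 GB matrix never hits the GPU.
    while True:
        for i in range(0, len(sequences), batch_size):
            ids = sequences[i:i + batch_size]   # (b, MAX_SEQUENCE_LENGTH) int ids
            yield embedding_matrix[ids], labels[i:i + batch_size]

# The model then takes float vectors directly, with no Embedding layer:
# sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH, EMBEDDING_DIM), dtype='float32')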

  3. Use a linear transformation of the Embedding instead of retraining your embedding: as you mentioned, the flag trainable=False made your algorithm work, so you can set it to False and add:

Dense(new_embedding_size, activation='linear')(embedding)

to train a new embedding based on the existing one.
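
Put together, that variant could look like this (a sketch; new_embedding_size is a hypothetical hyperparameter, e.g. 64):

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)  # frozen: no gradient/optimizer memory
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedding = embedding_layer(sequence_input)
# A small trainable projection stands in for retraining the huge matrix:
embedded_sequences = Dense(new_embedding_size, activation='linear')(embedding)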

  4. Change device: if you have plenty of RAM, you can try the following strategy:

import tensorflow as tf

# Pin the embedding lookup to host memory; only the looked-up vectors go to the GPU.
with tf.device('/cpu:0'):
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)

In this design, the computations of the Embedding layer are done on the CPU using host RAM. The downside is that transfers between RAM and the GPU might be really slow.
