Resource Exhausted when training a neural network - Keras

Question

I have a dataset of 65668 files.

I am using Keras for a CNN, and these are my layers:

from keras.layers import Embedding, Input, Conv1D, MaxPooling1D, Flatten, Dense

# Embedding layer initialised with the pre-trained GloVe matrix
embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)

sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
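
The snippet stops at preds; presumably the model is then assembled and compiled roughly as follows (a sketch: the optimizer, loss, and metric are assumptions, not taken from the question):

from keras.models import Model

model = Model(sequence_input, preds)
model.compile(loss='categorical_crossentropy',  # assumed multi-class setup
              optimizer='rmsprop',              # assumed optimizer
              metrics=['acc'])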

The first embedding layer is trained on GloVe.6B.100d. Fitting the data:

# fitting the data
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=128)

MAX_SEQUENCE_LENGTH is 500. I am training on an Nvidia GeForce 940MX GPU, and I get the following error as part of the stack trace:

Resource exhausted: OOM when allocating tensor with shape[15318793,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

I tried reducing the batch size to 16, and even to 8, but I still get the same error. What could the issue be?

Answer

The problem lies in your Embedding layer. It needs to allocate a matrix of size 15318793 * 100 * 4 bytes = 5.7 GB, which is definitely larger than the GeForce 940MX's memory.
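
As a quick sanity check, that figure can be reproduced from the shape in the OOM message (the 2-4 GB VRAM range for a 940MX is an assumption about the typical card):

# Size of the float32 tensor reported in the OOM message
vocab_size = 15318793                         # shape[0] = len(word_index) + 1
embedding_dim = 100                           # shape[1] = EMBEDDING_DIM
size_bytes = vocab_size * embedding_dim * 4   # 4 bytes per float32
print(size_bytes / 2**30)                     # ~5.71 GiB; a 940MX typically has 2-4 GB of VRAM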

There are a few ways you could overcome this issue:

  1. Decrease the vocabulary/corpus size: take e.g. the 1M most frequent words instead of the full word set. This will drastically decrease the embedding matrix size, as in the sketch below.
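
A minimal sketch of capping the vocabulary with Keras's Tokenizer; texts (the raw corpus) and glove_vectors (a hypothetical dict mapping words to GloVe vectors) are assumed to exist:

import numpy as np
from keras.preprocessing.text import Tokenizer

MAX_NUM_WORDS = 1000000  # illustrative cap: keep only the 1M most frequent words

tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
tokenizer.fit_on_texts(texts)                    # texts: your raw documents
sequences = tokenizer.texts_to_sequences(texts)  # indices beyond the cap are dropped

# Rebuild the embedding matrix with rows only for the kept words
num_words = min(MAX_NUM_WORDS, len(tokenizer.word_index) + 1)
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in tokenizer.word_index.items():
    if i < num_words:
        vector = glove_vectors.get(word)         # hypothetical word -> vector dict
        if vector is not None:
            embedding_matrix[i] = vector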

  2. Use a generator instead of Embedding: rather than an Embedding layer, a generator can transform your sequences into sequences of word vectors on the fly, as in the sketch below.
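
A rough sketch of that idea, assuming embedding_matrix holds the GloVe vectors and x_train/y_train are the padded sequences and labels from the question:

import numpy as np

def word_vector_generator(sequences, labels, batch_size=128):
    # Map integer word indices to GloVe vectors on the CPU, batch by batch,
    # so the full embedding matrix never has to live on the GPU.
    while True:
        for i in range(0, len(sequences), batch_size):
            batch = sequences[i:i + batch_size]
            x = np.zeros((len(batch), MAX_SEQUENCE_LENGTH, EMBEDDING_DIM),
                         dtype='float32')
            for j, seq in enumerate(batch):
                for k, word_idx in enumerate(seq[:MAX_SEQUENCE_LENGTH]):
                    x[j, k] = embedding_matrix[word_idx]
            yield x, labels[i:i + batch_size]

# The model then takes vectors directly instead of integer indices:
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH, EMBEDDING_DIM))
x = Conv1D(128, 5, activation='relu')(sequence_input)
# ... remaining layers as before, then:
# model.fit_generator(word_vector_generator(x_train, y_train),
#                     steps_per_epoch=len(x_train) // 128, epochs=20)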

  3. Use a linear transformation of the Embedding instead of retraining your embedding: since, as you mentioned, the flag trainable=False made your algorithm work, you can keep it set to False and add:

Dense(new_embedding_size, activation='linear')(embedding)

to train a new embedding based on the existing one.
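
Put together, this variant might look like the following sketch (NEW_EMBEDDING_DIM is a hypothetical size you would choose):

# Frozen GloVe embedding followed by a trainable linear projection
embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)               # not retrained
embedded_sequences = embedding_layer(sequence_input)
projected = Dense(NEW_EMBEDDING_DIM, activation='linear')(embedded_sequences)
x = Conv1D(128, 5, activation='relu')(projected)           # continue as before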

  4. Change device: if you have a lot of RAM, you can try the following strategy:

import tensorflow as tf

with tf.device('/cpu:0'):
    # The embedding lookup runs on the CPU, so its weights live in RAM
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)

In this design, the computations of the Embedding layer are made using the CPU and main RAM. The downside is that transfers between RAM and the GPU might be really slow.
