TensorFlow - Using class_weights in fit_generator causes memory leak


Problem Description

In TensorFlow, using class_weights in fit_generator causes the training process to consume more and more CPU RAM until it is depleted: memory usage steps up after each epoch. See below for a reproducible example. To keep the example small, I reduced the dataset and batch size, which still shows the trend of increasing memory. When training on my actual data, the full 128 GB of RAM is depleted within 70 epochs.

Has anyone run into this problem or have any suggestions? My data is imbalanced, so I have to use class_weights, but with this leak I cannot run the training for long.
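
As an aside, a common way to derive such a class-weight dictionary from an imbalanced label array is scikit-learn's compute_class_weight. A minimal sketch, assuming a hypothetical 1-D train_labels array of integer class ids (not part of the original question):

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical integer labels for an imbalanced 10-class training set.
train_labels = np.random.randint(low=0, high=10, size=1000)

classes = np.unique(train_labels)
weights = compute_class_weight(class_weight='balanced',
                               classes=classes, y=train_labels)
class_weights = dict(zip(classes, weights))  # e.g. {0: 0.97, 1: 1.04, ...}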

In the code sample below, if you comment out the class weights, the program trains without depleting memory.

The first image shows memory usage with class_weights; the second shows usage without class_weights.

import numpy as np
import tensorflow as tf
tf.enable_eager_execution()  # TF 1.x eager mode

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import CuDNNLSTM, Dense
from tensorflow.keras.optimizers import Adadelta


feature_count = 25
batch_size = 16
look_back = 5
target_groups = 10

def random_data_generator():
    # Random batch purely to reproduce the leak; the targets are random
    # integers rather than one-hot vectors, which is fine for this purpose.
    x_data_size = (batch_size, look_back, feature_count)  # batches, look back, features
    x_data = np.random.uniform(low=-1.0, high=5, size=x_data_size)

    y_data_size = (batch_size, target_groups)
    y_data = np.random.randint(low=1, high=21, size=y_data_size)

    return x_data, y_data

def get_simple_Dataset_generator():
    # Infinite generator, as expected by fit_generator.
    while True:
        yield random_data_generator()

def build_model():
    model = Sequential()
    model.add(CuDNNLSTM(feature_count,
                        batch_input_shape=(batch_size, look_back, feature_count),
                        stateful=False))
    model.add(Dense(target_groups, activation='softmax'))
    optimizer = Adadelta(learning_rate=1.0, epsilon=None)
    model.compile(loss='categorical_crossentropy', optimizer=optimizer)
    return model


def run_training():
    model = build_model()
    train_generator = get_simple_Dataset_generator()
    validation_generator = get_simple_Dataset_generator()
    class_weights = {0: 2, 1: 8, 2: 1, 3: 4, 4: 8, 5: 35, 6: 30, 7: 4, 8: 5, 9: 3}

    # Commenting out class_weight below makes the memory growth go away.
    model.fit_generator(generator=train_generator,
                        steps_per_epoch=1,
                        epochs=1000,
                        verbose=2,
                        validation_data=validation_generator,
                        validation_steps=20,
                        max_queue_size=10,
                        workers=0,  # run the generator on the main thread
                        use_multiprocessing=False,
                        class_weight=class_weights)

if __name__ == '__main__':
    run_training()
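
To make the stepped growth visible directly in the training log (rather than in an external monitor), a small callback using psutil can print the process's resident memory after each epoch. This is a sketch added for convenience, not part of the original post:

import os
import psutil
from tensorflow.keras.callbacks import Callback

class MemoryLogger(Callback):
    # Prints the process's resident set size at the end of every epoch,
    # which makes the per-epoch memory step easy to see.
    def on_epoch_end(self, epoch, logs=None):
        rss_mb = psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2
        print(f'epoch {epoch}: RSS {rss_mb:.1f} MB')

Passing callbacks=[MemoryLogger()] to fit_generator logs one line per epoch.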

Recommended Answer

For any future users: there appears to be a bug in the nightly build that was fixed in subsequent nightly builds. More details in the bug report:

https://github.com/tensorflow/tensorflow/issues/31253
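
If upgrading TensorFlow is not an option, one commonly suggested workaround (not from the original answer) is to drop the class_weight argument and have the generator yield per-sample weights instead; fit_generator accepts (inputs, targets, sample_weights) tuples. A sketch against the repro code above, assuming one-hot targets (the random targets in the repro are not one-hot, so the mapping is illustrative):

import numpy as np

def weighted_generator(class_weights):
    # Wraps random_data_generator() from the repro; yields a third element
    # that fit_generator applies as per-sample weights.
    while True:
        x_data, y_data = random_data_generator()
        class_ids = np.argmax(y_data, axis=-1)  # class index per sample (one-hot assumed)
        sample_weights = np.array([class_weights[c] for c in class_ids],
                                  dtype='float32')
        yield x_data, y_data, sample_weights

With this, call model.fit_generator(weighted_generator(class_weights), ...) and omit class_weight entirely.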
