Keras occupies an indefinitely increasing amount of memory for each epoch


Problem Description

I'm running a genetic hyperparameter search algorithm and it quickly saturates all available memory.

After a few tests it looks like the amount of memory required by Keras increases both between different epochs and when training different models. The problem becomes a lot worse as the minibatch size increases; a minibatch size of 1–5 at least gives me enough time to watch the memory usage rise very fast over the first few fits and then keep increasing slowly but steadily over time.

I already checked "keras predict memory swap increase indefinitely", "Keras: Out of memory when doing hyper parameter grid search", and "Keras (TensorFlow, CPU): Training Sequential models in loop eats memory", so I am already clearing the Keras session and resetting TensorFlow's graph after each iteration.

I also tried explicitly deleting the model and history object and running gc.collect() but to no avail.

I'm running Keras 2.2.4, tensorflow 1.12.0, Python 3.7.0 on CPU. Below is the code I'm running for each gene, together with the callback I'm using to measure the memory usage:

import gc
import resource

import tensorflow as tf
import keras as K


class MemoryCallback(K.callbacks.Callback):
    # Print the peak resident set size at the end of every epoch
    # (kilobytes on Linux, bytes on macOS)
    def on_epoch_end(self, epoch, log={}):
        print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)


def Rateme(self, loss, classnum, patience, epochs, DWIshape, Mapshape, lr,
           TRAINDATA, TESTDATA, TrueTrain, TrueTest, ModelBuilder, maxthreads):

    # Limit TensorFlow to the requested number of CPU threads
    K.backend.set_session(K.backend.tf.Session(config=K.backend.tf.ConfigProto(
        intra_op_parallelism_threads=maxthreads,
        inter_op_parallelism_threads=maxthreads)))

    # Early stopping
    STOP = K.callbacks.EarlyStopping(monitor='val_acc', min_delta=0.001,
                                     patience=patience, verbose=0, mode='max')
    # Build model
    Model = ModelBuilder(DWIshape, Mapshape, dropout=self.Dropout,
                         regularization=self.Regularization,
                         activ='relu', DWIconv=self.nDWI, DWIsize=self.sDWI,
                         classes=classnum, layers=self.nCNN,
                         filtersize=self.sCNN,
                         FClayers=self.FCL, last=self.Last)
    # Compile
    Model.compile(optimizer=K.optimizers.Adam(lr, decay=self.Decay),
                  loss=loss, metrics=['accuracy'])
    # Fit (original note: check verbose and callbacks)
    his = Model.fit(x=TRAINDATA, y=TrueTrain, epochs=epochs, batch_size=5,
                    shuffle=True, validation_data=(TESTDATA, TrueTest),
                    verbose=0, callbacks=[STOP, MemoryCallback()])
    # Extract the test accuracy
    S = Model.evaluate(x=TESTDATA, y=TrueTest, verbose=1)[1]

    # Attempt to release memory before the next gene is evaluated
    del his
    del Model
    del rateme  # 'rateme' is presumably defined in the enclosing scope
    K.backend.clear_session()
    tf.reset_default_graph()
    gc.collect()

    return S

Solution

In the end I just restarted the Python session between training runs with a bash script; I couldn't find a better way to avoid an exploding memory footprint.
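The same effect can be reproduced from Python itself by launching each training run in a fresh interpreter process, so that whatever memory Keras/TensorFlow fails to release is returned to the OS when the child process exits. Below is a minimal sketch of that idea using the standard subprocess module; the worker script name train_one_gene.py, the JSON handshake, and the example gene dictionaries are hypothetical and not part of the original answer.

import json
import subprocess
import sys

def rate_gene_in_subprocess(gene_config):
    """Train one gene in its own Python process and return its test accuracy."""
    proc = subprocess.run(
        [sys.executable, "train_one_gene.py"],  # hypothetical worker script that builds, fits and evaluates one model
        input=json.dumps(gene_config),          # hyperparameters passed to the worker on stdin
        capture_output=True,
        text=True,
        check=True,
    )
    # The worker is assumed to print a single JSON object such as {"accuracy": 0.93}
    return json.loads(proc.stdout)["accuracy"]

if __name__ == "__main__":
    genes = [{"lr": 1e-3, "dropout": 0.2},
             {"lr": 1e-4, "dropout": 0.5}]  # example genes only
    print([rate_gene_in_subprocess(g) for g in genes])

Because each gene is evaluated in a process that terminates afterwards, nothing can accumulate across fits, which is exactly what restarting the Python session from bash achieves.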
