Clearing Tensorflow GPU memory after model execution


Question


I've trained 3 models and am now running code that loads each of the 3 checkpoints in sequence and runs predictions using them. I'm using the GPU.


When the first model is loaded, it pre-allocates the entire GPU memory (which I want, for working through the first batch of data), but it doesn't release that memory when it's finished. When the second model is loaded, even using both tf.reset_default_graph() and with tf.Graph().as_default(), the GPU memory is still fully consumed by the first model, and the second model is then starved of memory.
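For concreteness, the loading pattern described above looks roughly like this (a minimal sketch; the checkpoint paths and the prediction step are hypothetical placeholders, not code from the question):

import tensorflow as tf

# Hypothetical checkpoint paths standing in for the 3 trained models
for ckpt in ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]:
    tf.reset_default_graph()
    with tf.Graph().as_default():
        saver = tf.train.import_meta_graph(ckpt + ".meta")
        with tf.Session() as sess:
            saver.restore(sess, ckpt)
            # ... run predictions with this model ...
# Even with the graph reset, the GPU memory allocated for the first
# model is never returned, and later models are starved of memory.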


Is there a way to resolve this, other than using Python subprocesses or multiprocessing to work around the problem (the only solution I've found via Google searches)?

Answer


A GitHub issue from June 2016 (https://github.com/tensorflow/tensorflow/issues/1727) indicates that there is the following problem:


currently the Allocator in the GPUDevice belongs to the ProcessState, which is essentially a global singleton. The first session using GPU initializes it, and frees itself when the process shuts down.


Thus the only workaround would be to use processes and shut them down after the computation.

Example code:

import tensorflow as tf
import multiprocessing
import numpy as np

def run_tensorflow():

    n_input = 10000
    n_classes = 1000

    # Create model
    def multilayer_perceptron(x, weight):
        # Single linear layer: a matmul with no activation
        layer_1 = tf.matmul(x, weight)
        return layer_1

    # Store the layer weights (no bias term is used)
    weights = tf.Variable(tf.random_normal([n_input, n_classes]))


    x = tf.placeholder("float", [None, n_input])
    y = tf.placeholder("float", [None, n_classes])
    pred = multilayer_perceptron(x, weights)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)

        # Train on random batches so the session allocates GPU memory
        for i in range(100):
            batch_x = np.random.rand(10, 10000)
            batch_y = np.random.rand(10, 1000)
            sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

    print "finished doing stuff with tensorflow!"


if __name__ == "__main__":

    # option 1: execute code with an extra process; GPU memory is
    # released when the process exits
    p = multiprocessing.Process(target=run_tensorflow)
    p.start()
    p.join()

    # wait until user presses enter key
    input()

    # option 2: just execute the function in the current process;
    # GPU memory stays allocated afterwards
    run_tensorflow()

    # wait until user presses enter key
    input()


So if you call the function run_tensorflow() within a process you created and then shut the process down (option 1), the memory is freed. If you just run run_tensorflow() in the current process (option 2), the memory is not freed after the function call.
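Applied to the scenario in the question, each checkpoint can be evaluated in its own process so that its GPU memory is returned when the process exits. A minimal sketch, assuming hypothetical checkpoint paths and a placeholder prediction step:

import multiprocessing

def predict_with_checkpoint(ckpt):
    # Build the graph and session inside the child process so all GPU
    # state belongs to it and is released when the process exits.
    import tensorflow as tf
    with tf.Graph().as_default():
        saver = tf.train.import_meta_graph(ckpt + ".meta")
        with tf.Session() as sess:
            saver.restore(sess, ckpt)
            # ... run predictions with this model ...

if __name__ == "__main__":
    # Hypothetical checkpoint paths standing in for the 3 trained models
    for ckpt in ["model_1.ckpt", "model_2.ckpt", "model_3.ckpt"]:
        p = multiprocessing.Process(target=predict_with_checkpoint, args=(ckpt,))
        p.start()
        p.join()  # GPU memory from this model is freed here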

