TensorFlow/Keras multi-threaded model fitting


Question

I'm attempting to train multiple keras models with different parameter values using multiple threads (and the tensorflow backend). I've seen a few examples of using the same model within multiple threads, but in this particular case, I run into various errors regarding conflicting graphs, etc. Here's a simple example of what I'd like to be able to do:

from concurrent.futures import ThreadPoolExecutor
import numpy as np
import tensorflow as tf
from keras import backend as K
from keras.layers import Dense
from keras.models import Sequential


sess = tf.Session()


def example_model(size):
    model = Sequential()
    model.add(Dense(size, input_shape=(5,)))
    model.add(Dense(1))
    model.compile(optimizer='sgd', loss='mse')
    return model


if __name__ == '__main__':
    K.set_session(sess)
    X = np.random.random((10, 5))
    y = np.random.random((10, 1))
    models = [example_model(i) for i in range(5, 10)]

    e = ThreadPoolExecutor(4)
    # Each fit runs in a worker thread, but all the models were built on
    # the shared default graph, which is what triggers the error below.
    res_list = [e.submit(model.fit, X, y) for model in models]

    for res in res_list:
        print(res.result())

The resulting error is ValueError: Tensor("Variable:0", shape=(5, 5), dtype=float32_ref) must be from the same graph as Tensor("Variable_2/read:0", shape=(), dtype=float32). I've also tried initializing the models within the threads, which fails in a similar way.

Any thoughts on the best way to go about this? I'm not at all attached to this exact structure, but I'd prefer to be able to use multiple threads rather than processes so all the models are trained within the same GPU memory allocation.

Answer

TensorFlow graphs are not thread-safe (see https://www.tensorflow.org/api_docs/python/tf/Graph), and a newly created TensorFlow Session uses the default graph unless you tell it otherwise.
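
The conflict is easy to reproduce in isolation: a Session is bound to exactly one graph, and asking it to run a tensor that lives in a different graph raises the same kind of ValueError as above. A minimal sketch, assuming the same TF 1.x API used in the question:

import tensorflow as tf

g1 = tf.Graph()
with g1.as_default():
    a = tf.constant(1.0)  # 'a' belongs to g1

with tf.Session(graph=tf.Graph()) as sess:  # bound to a different, fresh graph
    sess.run(a)  # ValueError: ... is not an element of this graph.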

You can get around this by creating a new session with a new graph in your parallelized function and constructing your keras model there.

Here is some code that creates and fits a model on each available GPU in parallel:

import concurrent.futures
import numpy as np

import keras.backend as K
from keras.layers import Dense
from keras.models import Sequential

import tensorflow as tf
from tensorflow.python.client import device_lib

def get_available_gpus():
    # Enumerate local devices and keep only the GPU names, e.g. '/gpu:0'.
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos if x.device_type == 'GPU']

xdata = np.random.randn(100, 8)
ytrue = np.random.randint(0, 2, 100)

def fit(gpu):
    # Build everything inside a fresh graph and session so each worker
    # thread stays isolated: TensorFlow graphs are not thread-safe.
    with tf.Session(graph=tf.Graph()) as sess:
        K.set_session(sess)
        with tf.device(gpu):
            model = Sequential()
            model.add(Dense(12, input_dim=8, activation='relu'))
            model.add(Dense(8, activation='relu'))
            model.add(Dense(1, activation='sigmoid'))

            model.compile(loss='binary_crossentropy', optimizer='adam')
            model.fit(xdata, ytrue, verbose=0)

            return model.evaluate(xdata, ytrue, verbose=0)

gpus = get_available_gpus()
with concurrent.futures.ThreadPoolExecutor(len(gpus)) as executor:
    results = list(executor.map(fit, gpus))
print('results: ', results)
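
Since the question asks for models with different parameter values rather than one identical model per GPU, the same pattern extends by mapping over (device, parameter) pairs. A hypothetical sketch reusing xdata, ytrue, and gpus from above (fit_size and the round-robin pairing are illustrative names, not part of the original answer):

def fit_size(args):
    gpu, size = args  # e.g. ('/gpu:0', 5); 'size' is the hidden layer width
    with tf.Session(graph=tf.Graph()) as sess:
        K.set_session(sess)
        with tf.device(gpu):
            model = Sequential()
            model.add(Dense(size, input_dim=8, activation='relu'))
            model.add(Dense(1, activation='sigmoid'))
            model.compile(loss='binary_crossentropy', optimizer='adam')
            model.fit(xdata, ytrue, verbose=0)
            return model.evaluate(xdata, ytrue, verbose=0)

# Round-robin the parameter values over the available GPUs.
pairs = [(gpus[i % len(gpus)], size) for i, size in enumerate(range(5, 10))]
with concurrent.futures.ThreadPoolExecutor(len(gpus)) as executor:
    size_results = list(executor.map(fit_size, pairs))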
