同时运行多个tensorflow会话 [英] Running multiple tensorflow sessions concurrently

查看：194 发布时间：2020/5/24 21:03:29 python parallel-processing python-multiprocessing tensorflow

本文介绍了同时运行多个tensorflow会话的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在尝试在具有64个CPU的CentOS 7计算机上同时运行多个TensorFlow会话.我的同事报告说，他可以使用以下两个代码块使用4个内核在其计算机上产生并行加速:

I am trying to run several sessions of TensorFlow concurrently on a CentOS 7 machine with 64 CPUs. My colleague reports that he can use the following two blocks of code to produce a parallel speedup on his machine using 4 cores:

mnist.py

import numpy as np
import input_data
from PIL import Image
import tensorflow as tf
import time


def main(randint):
    print 'Set new seed:', randint
    np.random.seed(randint)
    tf.set_random_seed(randint)
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    # Setting up the softmax architecture
    x = tf.placeholder("float", [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # Setting up the cost function
    y_ = tf.placeholder("float", [None, 10])
    cross_entropy = -tf.reduce_sum(y_*tf.log(y))
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

    # Initialization 
    init = tf.initialize_all_variables()
    sess = tf.Session(
        config=tf.ConfigProto(
            inter_op_parallelism_threads=1,
            intra_op_parallelism_threads=1
        )
    )
    sess.run(init)

    for i in range(1000):
        batch_xs, batch_ys = mnist.train.next_batch(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})

if __name__ == "__main__":
    t1 = time.time()
    main(0)
    t2 = time.time()
    print "time spent: {0:.2f}".format(t2 - t1)

parallel.py

import multiprocessing
import numpy as np

import mnist
import time

t1 = time.time()
p1 = multiprocessing.Process(target=mnist.main,args=(np.random.randint(10000000),))
p2 = multiprocessing.Process(target=mnist.main,args=(np.random.randint(10000000),))
p3 = multiprocessing.Process(target=mnist.main,args=(np.random.randint(10000000),))
p1.start()
p2.start()
p3.start()
p1.join()
p2.join()
p3.join()
t2 = time.time()
print "time spent: {0:.2f}".format(t2 - t1)

特别是，他说他遵守

Running a single process took: 39.54 seconds
Running three processes took: 54.16 seconds

但是，当我运行代码时:

However, when I run the code:

python mnist.py
==> Time spent: 5.14

python parallel.py 
==> Time spent: 37.65

如您所见，通过使用多处理程序，我的速度显着下降，而我的同事却没有.有谁知道为什么会发生这种情况以及如何解决它?

As you can see, I get a significant slowdown by using multiprocessing whereas my colleague does not. Does anyone have any insight as to why this could be occurring and what can be done to fix it?

编辑

这是一些示例输出.请注意，加载数据似乎是并行进行的，但是训练各个模型的结果在输出中会非常有序(并且可以在程序执行时通过查看top中的CPU使用率来验证)

Here is some example output. Notice that loading the data seems to occur in parallel, but training the individual models has a very sequential look in the output (and which can be verified by looking at CPU usage in top as the program executes)

#$ python parallel.py 
Set new seed: 9672406
Extracting MNIST_data/train-images-idx3-ubyte.gz
Set new seed: 4790824
Extracting MNIST_data/train-images-idx3-ubyte.gz
Set new seed: 8011659
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 1
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 1
0.9136
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 1
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 1
0.9149
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 1
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 1
0.8931
time spent: 41.36

另一个编辑

假设我们希望确认此问题似乎是由TensorFlow引起的，而不是由多处理引起的.我将mnist.py的内容替换为一个大循环，如下所示:

Suppose we wish to confirm that the issue is seemingly with TensorFlow and not with multiprocessing. I replaced the contents of mnist.py with a big loop as follows:

def main(randint):
    c = 0
    for i in xrange(100000000):
        c += i

对于输出:

#$ python mnist.py
==> time spent: 5.16
#$ python parallel.py 
==> time spent: 4.86

因此，我认为这里的问题不在于多重处理本身.

Hence I think the problem here is not with multiprocessing itself.

同时运行多个tensorflow会话 [英] Running multiple tensorflow sessions concurrently

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

同时运行多个tensorflow会话 [英] Running multiple tensorflow sessions concurrently

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭