Accessing epoch value across multiple threads using the input_producer/limit_epochs/epochs:0 local variable

Problem description

I tried to extract the current epoch number while reading data using multiple CPU threads. However, during a trial run I observed an output that did not make any sense. Consider the code below:

import threading
import tensorflow as tf

# trainimgs: list of image filename strings, defined elsewhere.
with tf.Session() as sess:
    # Queue that yields every filename once per epoch, for 4 epochs.
    train_filename_queue = tf.train.string_input_producer(trainimgs, num_epochs=4, shuffle=True)
    value = train_filename_queue.dequeue()
    init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
    sess.run(init_op)
    coord = tf.train.Coordinator()
    tf.train.start_queue_runners(coord=coord)
    # Name of the local counter variable maintained by string_input_producer.
    collections = [v.name for v in tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES,
                                                     scope='input_producer/limit_epochs/epochs:0')]
    print(collections)

    # 20 consumer threads, each dequeuing filenames until the queue is exhausted.
    threads = [threading.Thread(target=work, args=(coord, value, sess, collections))
               for i in range(20)]
    for t in threads:
        t.start()
    coord.join(threads)
    coord.request_stop()

The work function is defined as follows:

def work(coord, val, sess, collections):
    while not coord.should_stop():
        try:
            # Read the instantaneous value of the epochs:0 counter, then
            # dequeue one filename and print both.
            epoch = sess.run(collections[0])
            filename = sess.run(val).decode(encoding='UTF-8')
            print(filename + ' ' + str(epoch))
        except tf.errors.OutOfRangeError:
            # The queue is closed and empty: tell all threads to stop.
            coord.request_stop()
    return None

The output I get is as follows:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:84:00.0
Total memory: 11.92GiB
Free memory: 11.80GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:84:00.0)
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 20 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/platform_util.cc:58] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/platform_util.cc:58] platform Host present with 20 visible devices
I tensorflow/compiler/xla/service/service.cc:180] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/service.cc:187]   StreamExecutor device (0): GeForce GTX TITAN X, Compute Capability 5.2
['input_producer/limit_epochs/epochs:0']
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 2
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4

The last number in each line corresponds to the value of the input_producer/limit_epochs/epochs:0 local variable.

For a first trial, I kept only 10 images in the queue, meaning I should get a total of 40 lines of output, which I do.

  • However, I should get equal numbers of 1, 2, 3 and 4 as the last number in each line, since each filename should be extracted once in each of the 4 epochs.

Why am I getting the same number, 4, in all the lines?

More information

  • I tried with range(1) (i.e. a single thread), and the result was still the same.
  • Ignore the number "0": it is just the label of the corresponding file. I saved the image filenames in that format.

Recommended answer

I did a lot of experiments and finally concluded the following:

I used to believe the following:

tf.train.string_input_producer() enqueues the filename queue epoch-wise. That is, first one complete epoch is enqueued (in multiple stages if the capacity is less than the number of filenames), and only then are further epochs enqueued.

This is not the case.

When tf.train.start_queue_runners() is executed, all the epochs are enqueued together (in multiple stages if the capacity is less than the number of filenames). tf.train.string_input_producer uses the local variable epochs:0 to keep track of the epoch currently being enqueued. Once epochs:0 reaches num_epochs, it stays constant: no matter how many threads are dequeuing from the queue, it does not change.
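
To see this behaviour in isolation, here is a minimal sketch (mine, not from the original post; the three filenames are hypothetical stand-ins for trainimgs) that reads epochs:0 shortly after the queue runners start, before anything has been dequeued:

import time
import tensorflow as tf

filenames = ['a.jpg', 'b.jpg', 'c.jpg']  # hypothetical stand-ins
queue = tf.train.string_input_producer(filenames, num_epochs=4, shuffle=False)
value = queue.dequeue()

# Look up the enqueue-side epoch counter by name.
epochs_var = [v for v in tf.local_variables()
              if v.name == 'input_producer/limit_epochs/epochs:0'][0]

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    runner_threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    time.sleep(1.0)               # give the runner time to enqueue
    print(sess.run(epochs_var))   # expected to print 4 already
    coord.request_stop()
    coord.join(runner_threads)

With only 3 filenames and the default queue capacity of 32, the runner can enqueue all 4 epochs at once, so the counter is already at num_epochs before a single dequeue has happened.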

When you read the value of epochs:0, you get the instantaneous value of the epochs counter, which tells you which epoch of the dataset is being enqueued at that moment. It does not tell you which epoch of the dataset you are dequeuing.

So it is a bad idea to read the current epoch from the epochs:0 local variable.
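
If you do need the dequeue-side epoch, one workaround (my sketch, not part of the original answer) is to derive it yourself: each of the N filenames is dequeued exactly once per epoch, so a shared, thread-safe dequeue counter gives epoch = count // N. Note that under heavy concurrency the value can be off by one near epoch boundaries, because the counter increment and the actual dequeue are not performed atomically together:

import threading

class EpochTracker(object):
    """Derives the dequeue-side epoch from a shared dequeue count."""
    def __init__(self, num_files):
        self._lock = threading.Lock()
        self._count = 0
        self._num_files = num_files

    def next_epoch(self):
        # 0-based epoch of the next dequeue; advances the shared count.
        with self._lock:
            epoch = self._count // self._num_files
            self._count += 1
            return epoch

A single EpochTracker(len(trainimgs)) would be shared by all 20 worker threads, each calling tracker.next_epoch() alongside its sess.run(val).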
