Decouple dequeue operation from gradient/loss computation


Problem Description

I'm currently trying to move away from using feeds and start using queues in order to support larger datasets. Queues work fine with the built-in TensorFlow optimizers, since those evaluate the gradient only once per dequeue operation. However, I have implemented interfaces to other optimizers that perform line searches, and these need to evaluate not just the gradient but also the loss at multiple points for the same batch. Unfortunately, with the normal queueing system every loss evaluation triggers a dequeue, rather than reusing the same batch for several computations.

Is there a way to decouple the dequeuing operation from the gradient/loss computation in such a way that I can execute dequeue once and then execute the gradient/loss computation several times on the current batch?

Please note that the size of my input tensor is variable between batches. We work with molecular data, and each molecule has a different number of atoms. This is quite different from image data, where everything is typically scaled to have identical dimensions.
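To make the failure mode concrete, here is a TensorFlow-free sketch of the problem (all names and numbers are illustrative, not from the original code): when the loss depends directly on the dequeue op, two evaluations inside a single line-search step silently see different batches.

```python
from collections import deque

queue = deque([[1.0], [2.0, 2.0], [3.0]])  # variable-size batches

def loss_with_dequeue(alpha):
    # Mimics a loss op that depends directly on the dequeue op:
    # every evaluation pulls a fresh batch from the queue.
    batch = queue.popleft()
    return sum((x - alpha) ** 2 for x in batch)

# A line search wants the loss at two step sizes for the SAME batch,
# but each call consumed a different batch:
l0 = loss_with_dequeue(0.0)  # evaluated on [1.0]
l1 = loss_with_dequeue(0.5)  # evaluated on [2.0, 2.0], not on [1.0]
```

The two loss values are not comparable, which is exactly what breaks a line search.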

Recommended Answer

Decouple it by creating a variable that stores the dequeued value, and then make your computation depend on this variable instead of on the dequeue op. Advancing the queue then happens only when the assign op runs.
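The cache-and-advance idea can be sketched without TensorFlow: reads return a cached value, and only an explicit advance step pops the queue. The class and names below are purely illustrative.

```python
from collections import deque

class CachedDequeue:
    """Toy model of the pattern: a queue plus a cached current value."""

    def __init__(self, items):
        self._queue = deque(items)  # stands in for the FIFOQueue
        self._cached = None         # stands in for the tf.Variable

    def advance(self):
        # Analogue of running the assign op: dequeue exactly once,
        # then cache the value.
        self._cached = self._queue.popleft()
        return self._cached

    def value(self):
        # Analogue of reading the variable: no dequeue happens here,
        # so repeated reads see the same batch.
        return self._cached

batches = CachedDequeue([[1], [2, 2], [3, 3, 3]])
batches.advance()
print(batches.value(), batches.value())  # the same batch both times
```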

Solution #1: fixed size data, use Variables

# Read a fixed-size batch from the input pipeline.
(image_batch_live,) = tf.train.batch([image], batch_size=5, num_threads=1, capacity=614)

# Non-trainable variable that caches the most recently dequeued batch;
# this requires the batch shape to be fully known in advance.
image_batch = tf.Variable(
  tf.zeros((batch_size, image_size, image_size, color_channels)),
  trainable=False,
  name="input_values_cached")

# Running this op dequeues the next batch and stores it in the variable.
advance_batch = tf.assign(image_batch, image_batch_live)

Now image_batch gives the latest value from the queue without advancing it, and advance_batch advances the queue.
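A line-search loop would then run the advance step once per batch and evaluate the loss several times against the cached value. The following is a TensorFlow-free sketch of that control flow; the quadratic loss, the step sizes, and all names are invented for illustration.

```python
from collections import deque

queue = deque([[1.0, 2.0], [3.0], [4.0, 4.0, 4.0]])
cached_batch = None  # plays the role of the image_batch variable

def advance_batch():
    # Analogue of running the assign op: dequeue exactly once per batch.
    global cached_batch
    cached_batch = queue.popleft()

def loss(alpha):
    # Analogue of a loss op that reads the cached variable, not the queue,
    # so every call for a given batch sees the same data.
    return sum((x - alpha) ** 2 for x in cached_batch)

while queue:
    advance_batch()
    # Line search: many loss evaluations, one dequeue.
    best = min([0.0, 0.5, 1.0], key=loss)
```

Because every call to `loss` reads the cached batch, the candidate step sizes are compared on identical data, which is what a line search requires.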

Solution #2: variable size data, use persistent Tensors

Here we decouple the workflow by introducing dequeue_op and dequeue_op2. All computation depends on dequeue_op2, which is fed the saved value of dequeue_op. Using get_session_tensor/get_session_handle ensures that the actual data stays inside the TensorFlow runtime, and the value passed through feed_dict is just a short string identifier. The API is a little awkward because of dummy_handle; I've brought up this issue here.

import tensorflow as tf

def create_session():
    sess = tf.InteractiveSession(config=tf.ConfigProto(operation_timeout_in_ms=3000))
    return sess

tf.reset_default_graph()

sess = create_session()
dt = tf.int32
# Dummy handle, needed only to construct the get_session_tensor endpoints.
dummy_handle = sess.run(tf.get_session_handle(tf.constant(1)))
q = tf.FIFOQueue(capacity=20, dtypes=[dt])
enqueue_placeholder = tf.placeholder(dt, shape=[None])
enqueue_op = q.enqueue(enqueue_placeholder)
dequeue_op = q.dequeue()
size_op = q.size()

# Handle to the dequeued value; the data itself stays in the TF runtime.
dequeue_handle_op = tf.get_session_handle(dequeue_op)
dequeue_placeholder, dequeue_op2 = tf.get_session_tensor(dummy_handle, dt)
compute_op1 = tf.reduce_sum(dequeue_op2)
compute_op2 = tf.reduce_sum(dequeue_op2) + 1

# Fill the queue with variable-size data.
for i in range(10):
    sess.run(enqueue_op, feed_dict={enqueue_placeholder: [1] * (i + 1)})
sess.run(q.close())

try:
    while True:
        dequeue_handle = sess.run(dequeue_handle_op)  # advance the queue once
        # Evaluate two different computations on the SAME dequeued batch.
        val1 = sess.run(compute_op1, feed_dict={dequeue_placeholder: dequeue_handle.handle})
        val2 = sess.run(compute_op2, feed_dict={dequeue_placeholder: dequeue_handle.handle})
        size = sess.run(size_op)
        print("val1 %d, val2 %d, queue size %d" % (val1, val2, size))
except tf.errors.OutOfRangeError:
    print("Done")

When you run it, you should see something like the following:

val1 1, val2 2, queue size 9
val1 2, val2 3, queue size 8
val1 3, val2 4, queue size 7
val1 4, val2 5, queue size 6
val1 5, val2 6, queue size 5
val1 6, val2 7, queue size 4
val1 7, val2 8, queue size 3
val1 8, val2 9, queue size 2
val1 9, val2 10, queue size 1
val1 10, val2 11, queue size 0
Done
