TF slice_input_producer not keeping tensors in sync
Question
I'm reading images into my TF network, but I also need the associated labels along with them. So I tried to follow this answer, but the labels that come out don't actually match the images that I'm getting in each batch.
The names of my images are in the format dir/3.jpg, so I just extract the label from the image file name.
truth_filenames_np = ...
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
# *** This line should make sure both input tensors are synced (from my limited understanding)
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer([truth_filenames_tf, labels_tf], shuffle=False)
truth_image_value = tf.read_file(truth_image_name)
truth_image = tf.image.decode_jpeg(truth_image_value)
truth_image.set_shape([IMAGE_DIM, IMAGE_DIM, 3])
truth_image = tf.cast(truth_image, tf.float32)
truth_image = truth_image/255.0
# Another key step, where I batch them together
truth_images_batch, truth_label_batch = tf.train.batch([truth_image, truth_label], batch_size=mb_size)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_images_batch.eval()
        X_label_batch = truth_label_batch.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what
        # is printed by X_label_batch!
        print(X_label_batch)
        plot_batch(X_truth_batch)
    coord.request_stop()
    coord.join(threads)
Am I doing something wrong, or does slice_input_producer not actually ensure that its input tensors stay in sync?
Aside:
I also noticed that when I get a batch from tf.train.batch, the elements in the batch are adjacent to each other in the original list I gave it, but the batch order isn't the original order. Example: if my data is ["dir/1.jpg", "dir/2.jpg", "dir/3.jpg", "dir/4.jpg", "dir/5.jpg", "dir/6.jpg"], then I may get the batch (with batch_size=2) ["dir/3.jpg", "dir/4.jpg"], then batch ["dir/1.jpg", "dir/2.jpg"], and then the last one. So this makes it hard to even just use a FIFO queue for the labels, since the order won't match the batch order.
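A plain-Python toy model of this behavior (no TensorFlow; the queue contents, batch size, and helper names below are invented for illustration) shows how elements can stay adjacent within a batch while two separate .eval() calls still consume different batches:

```python
from collections import deque

# Toy model of the tf.train.batch queue: each slot holds one complete
# (image_batch, label_batch) pair, with batch_size=2, in order.
batch_queue = deque(
    (["dir/%d.jpg" % j for j in (k, k + 1)],  # images of one batch
     ["%d.jpg" % j for j in (k, k + 1)])      # labels of the same batch
    for k in range(0, 6, 2))

def eval_images_batch():
    # Models truth_images_batch.eval(): dequeues one whole batch.
    return batch_queue.popleft()[0]

def eval_labels_batch():
    # Models truth_label_batch.eval(): dequeues ANOTHER whole batch.
    return batch_queue.popleft()[1]

images = eval_images_batch()  # elements adjacent within the batch...
labels = eval_labels_batch()  # ...but taken from a different batch
print(images)  # ['dir/0.jpg', 'dir/1.jpg']
print(labels)  # ['2.jpg', '3.jpg']
```

Each call dequeues a fresh batch, so the images and labels you look at can come from different (though internally ordered) batches.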
Answer
Here is a complete runnable example that reproduces the problem:
import tensorflow as tf
truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
# get the labels
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
# My list is also already shuffled, so I set shuffle=False
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)
# # Another key step, where I batch them together
# truth_images_batch, truth_label_batch = tf.train.batch(
#     [truth_image_name, truth_label], batch_size=11)
epochs = 7
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch = truth_image_name.eval()
        X_label_batch = truth_label.eval()
        # Here I display all the images in this batch, and then I check
        # which file numbers they actually are.
        # BUT, the images that are displayed don't correspond with what is
        # printed by X_label_batch!
        print(X_truth_batch)
        print(X_label_batch)
    coord.request_stop()
    coord.join(threads)
This prints:
Epoch 0
b'dir/0.jpg'
b'1.jpg'
Epoch 1
b'dir/2.jpg'
b'3.jpg'
Epoch 2
b'dir/4.jpg'
b'5.jpg'
Epoch 3
b'dir/6.jpg'
b'7.jpg'
Epoch 4
b'dir/8.jpg'
b'9.jpg'
Epoch 5
b'dir/10.jpg'
b'11.jpg'
Epoch 6
b'dir/12.jpg'
b'13.jpg'
So basically each eval call runs the operation another time! Adding the batching does not change that: it just prints batches instead (the first 11 filenames, followed by the next 11 labels, and so on).
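The dequeue semantics behind this can be sketched in plain Python (no TensorFlow; the deque stand-in and helper names below are made up for illustration):

```python
from collections import deque

# Stand-in for the slice_input_producer queue: one (filename, label)
# pair per slice, kept in order.
slices = deque(("dir/%d.jpg" % j, "%d.jpg" % j) for j in range(66))

def eval_filename():
    # Models truth_image_name.eval(): consumes a fresh slice.
    return slices.popleft()[0]

def eval_label():
    # Models truth_label.eval(): ALSO consumes a fresh slice.
    return slices.popleft()[1]

print(eval_filename())  # dir/0.jpg
print(eval_label())     # 1.jpg -- one slice later, as in the output above
```

Because each helper dequeues independently, the filename and the label never come from the same slice, which is exactly the off-by-one pattern printed above.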
The workaround I see is:
for i in range(epochs):
    print("Epoch ", i)
    pair = tf.convert_to_tensor([truth_image_name, truth_label]).eval()
    print(pair[0])
    print(pair[1])
which correctly prints:
Epoch 0
b'dir/0.jpg'
b'0.jpg'
Epoch 1
b'dir/1.jpg'
b'1.jpg'
# ...
but it does nothing for the violation of the principle of least surprise.
Edit: Another way of doing it:
import tensorflow as tf
truth_filenames_np = ['dir/%d.jpg' % j for j in range(66)]
truth_filenames_tf = tf.convert_to_tensor(truth_filenames_np)
labels = [f.rsplit("/", 1)[1] for f in truth_filenames_np]
labels_tf = tf.convert_to_tensor(labels)
truth_image_name, truth_label = tf.train.slice_input_producer(
    [truth_filenames_tf, labels_tf], shuffle=False)
epochs = 7
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.start_queue_runners(sess=sess)
    for i in range(epochs):
        print("Epoch ", i)
        X_truth_batch, X_label_batch = sess.run(
            [truth_image_name, truth_label])
        print(X_truth_batch)
        print(X_label_batch)
That's a much better way, as tf.convert_to_tensor and co. only accept tensors of the same type/shape etc.
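In the same spirit, a plain-Python sketch (made-up names, no TensorFlow) of why a single sess.run([...]) fetch stays paired: one fetch corresponds to one dequeue, and both values are taken from the same slice:

```python
from collections import deque

# Stand-in queue: one (filename, label) pair per slice, in order.
slices = deque(("dir/%d.jpg" % j, "%d.jpg" % j) for j in range(66))

def run_both():
    # Models sess.run([truth_image_name, truth_label]):
    # ONE dequeue, both values taken from the same slice.
    filename, label = slices.popleft()
    return filename, label

print(run_both())  # ('dir/0.jpg', '0.jpg') -- correctly paired
```

Repeated calls keep consuming one slice at a time, so each fetched pair stays matched.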
Note that I removed the coordinator for simplicity; this, however, results in a warning:
W c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\kernels\queue_base.cc:294] _0_input_producer/input_producer/fraction_of_32_full/fraction_of_32_full: Skipping cancelled enqueue attempt with queue not closed
See this