Loading Images in a Directory as a TensorFlow Dataset


Question

I'm relatively new to ML and very new to TensorFlow. I've spent quite a bit of time on the TensorFlow MNIST tutorial as well as https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/how_tos/reading_data trying to figure out how to read my own data, but I'm getting a bit confused.

I have a bunch of images (.png) in a directory /images/0_Non/. I'm trying to make these into a TensorFlow dataset so that I can basically run the stuff from the MNIST tutorial on it as a first pass.

import tensorflow as tf

# Make a queue of file names including all the PNG image files in the relative image directory.
filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once("../images/0_Non/*.png"))

image_reader = tf.WholeFileReader()

# Read a whole file from the queue, the first returned value in the tuple is the filename which we are ignoring.
_, image_file = image_reader.read(filename_queue)

image = tf.image.decode_png(image_file)

# Start a new session to show example output.
with tf.Session() as sess:
    # Required to get the filename matching to run.
    tf.initialize_all_variables().run()

    # Coordinate the loading of image files.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Get an image tensor and print its value.
    image_tensor = sess.run([image])
    print(image_tensor)

    # Finish off the filename queue coordinator.
    coord.request_stop()
    coord.join(threads)

I'm having a bit of trouble understanding what's going on here. So it seems like image is a tensor and image_tensor is a NumPy array?

How do I get my images into a dataset? I also tried following along with the Iris example, which is for a CSV, and that brought me here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/base.py, but I wasn't sure how to get this to work for my case, where I have a bunch of PNGs.

Thanks!

Recommended answer

The recently added tf.data API makes it easier to do this:

import tensorflow as tf

# Make a Dataset of file names including all the PNG image files in
# the relative image directory.
filename_dataset = tf.data.Dataset.list_files("../images/0_Non/*.png")

# Make a Dataset of image tensors by reading and decoding the files.
image_dataset = filename_dataset.map(lambda x: tf.image.decode_png(tf.read_file(x)))

# NOTE: You can add additional transformations, like 
# `image_dataset.batch(BATCH_SIZE)` or `image_dataset.repeat(NUM_EPOCHS)`
# in here.

iterator = image_dataset.make_one_shot_iterator()
next_image = iterator.get_next()

# Start a new session to show example output.
with tf.Session() as sess:

  try:

    while True:
      # Get an image tensor and print its value.
      image_array = sess.run([next_image])
      print(image_array)

  except tf.errors.OutOfRangeError:
    # We have reached the end of `image_dataset`.
    pass
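
The NOTE in the code above mentions batch() and repeat(). Here is a rough sketch of how those transformations might be chained. It is not part of the original answer: it assumes TensorFlow 2.x with eager execution (so there is no Session), it uses a synthetic dataset of blank images in place of the decoded PNGs, and BATCH_SIZE, NUM_EPOCHS and the 4x4 target size are made-up values. Images of different sizes cannot be stacked into one batch, so the sketch resizes them first.

```python
import tensorflow as tf  # assumes TensorFlow 2.x (eager execution)

BATCH_SIZE = 2   # hypothetical value
NUM_EPOCHS = 2   # hypothetical value

# Stand-in for the decoded images: five 8x8 single-channel uint8 images.
image_dataset = tf.data.Dataset.from_tensor_slices(
    tf.zeros([5, 8, 8, 1], dtype=tf.uint8))

# Resize every image to one common size so that elements can be stacked.
# Note that tf.image.resize returns float32 tensors.
resized = image_dataset.map(lambda img: tf.image.resize(img, [4, 4]))

# Repeat the data NUM_EPOCHS times, then group it into batches.
batched = resized.repeat(NUM_EPOCHS).batch(BATCH_SIZE)

for batch in batched:
    print(batch.shape)
```

With real files you would apply the same map/repeat/batch chain to the image_dataset built from list_files.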

Note that for training you will need to get labels from somewhere. The Dataset.zip() transformation is one possible way to combine image_dataset with a dataset of labels from a different source.
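
As an illustration of the Dataset.zip() idea, here is a minimal sketch, not the original author's code. It assumes TensorFlow 2.x with eager execution, writes three tiny PNGs to a temporary directory as a stand-in for ../images/0_Non/, and assumes every file in that directory belongs to class 0. The names NUM_IMAGES, image_dir and label_dataset are hypothetical.

```python
import os
import tempfile

import tensorflow as tf  # assumes TensorFlow 2.x (eager execution)

NUM_IMAGES = 3  # hypothetical number of files

# Hypothetical setup so the sketch runs anywhere: write a few tiny PNGs
# into a temporary directory that stands in for ../images/0_Non/.
image_dir = tempfile.mkdtemp()
for i in range(NUM_IMAGES):
    pixels = tf.fill([4, 4, 1], tf.cast(i, tf.uint8))
    tf.io.write_file(os.path.join(image_dir, "img_%d.png" % i),
                     tf.io.encode_png(pixels))

# shuffle=False keeps the file order deterministic, so that images and
# labels stay aligned when they are zipped together.
filename_dataset = tf.data.Dataset.list_files(
    os.path.join(image_dir, "*.png"), shuffle=False)
image_dataset = filename_dataset.map(
    lambda path: tf.io.decode_png(tf.io.read_file(path)))

# Assumption: every image in this directory has label 0.
label_dataset = tf.data.Dataset.from_tensor_slices(
    tf.zeros([NUM_IMAGES], dtype=tf.int64))

# Pair each image with its label.
labeled_dataset = tf.data.Dataset.zip((image_dataset, label_dataset))

for image, label in labeled_dataset:
    print(image.shape, label.numpy())
```

If the labels come from another source (for example a CSV of file names and classes), the important part is that both datasets yield elements in the same order before zipping.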
