TensorFlow tfrecords: tostring() changes dimension of image

Question

I have built a model to train a convolutional autoencoder in TensorFlow. I followed the instructions on Reading Data from the TF documentation to read in my own images of size 233 x 233 x 3. Here is my convert_to() function adapted from those instructions:

import os
import tensorflow as tf

# Standard helper wrappers from the TF "Reading Data" guide (referenced below):
def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def convert_to(images, name):
  """Converts a dataset to tfrecords."""
  num_examples = images.shape[0]
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.tmp_dir, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    print(images[index].size)        # number of elements: 233 * 233 * 3 = 162867
    image_raw = images[index].tostring()
    print(len(image_raw))            # number of bytes in the serialized array
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  writer.close()

When I print the size of the image at the start of the for loop, the size is 162867, but when I print after the .tostring() line, the size is 1302936. This causes problems down the road because the model thinks my input is 8x what it should be. Is it better to change the 'image_raw' entry in the Example to _int64_feature(image_raw) or to change the way I convert it to a string?
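
For reference, here is a quick sanity check (a sketch; it assumes images is the NumPy array passed to convert_to()) that I could add to compare element count, bytes per element, and serialized length:

# Sketch: compare element count, bytes per element, and serialized byte length.
print(images.dtype, images.itemsize)   # dtype and bytes per element
print(images[0].size)                  # number of array elements
print(len(images[0].tostring()))       # should equal size * itemsize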

Alternatively, the problem could be in my read_and_decode() function, e.g. the string is not properly being decoded or the example not being parsed...?

def read_and_decode(self, filename_queue):
    reader = tf.TFRecordReader()

    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'depth': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string)
      })

    # Convert from a scalar string tensor to a uint8 tensor
    image = tf.decode_raw(features['image_raw'], tf.uint8)

    # Reshape into a 233 x 233 x 3 image and apply distortions
    image = tf.reshape(image, (self.input_rows, self.input_cols, self.num_filters))

    image = data_sets.normalize(image)
    image = data_sets.apply_augmentation(image)

    return image

Thanks!

Answer

I may have some answers to your problem.

First, it's perfectly normal that your image is 8x longer after the .tostring() call. That method converts your array to raw bytes; the name is misleading because in Python 3 bytes are not the same as strings (they are in Python 2). I'd guess your image array is stored as int64 by default, so each element is encoded with 8 bytes (64 bits). In your example, the 162867 values of your image are encoded as 1302936 bytes.
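
As a minimal sketch (plain NumPy, independent of TFRecords) of how the dtype alone determines the serialized length:

import numpy as np

# A 233 x 233 x 3 image always has 162867 elements; the byte length depends on the dtype.
img_int64 = np.zeros((233, 233, 3), dtype=np.int64)
img_uint8 = np.zeros((233, 233, 3), dtype=np.uint8)

print(img_int64.size)               # 162867 elements
print(len(img_int64.tostring()))    # 1302936 bytes (8 bytes per element)
print(len(img_uint8.tostring()))    # 162867 bytes (1 byte per element)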

Concerning the error during parsing, I think it comes from the fact that you write your data as int64 (integers encoded with 64 bits, so 8 bytes each) and read them back as uint8 (unsigned integers encoded with 8 bits, so 1 byte each). The same integer has a different byte sequence depending on whether it's stored as int64 or uint8. Writing your images as bytes is good practice with tfrecord files, but you need to read them back with the same type they were written in.
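
For instance, the same value serializes to different byte sequences depending on the dtype (the exact bytes shown assume a little-endian machine):

import numpy as np

print(np.array([255], dtype=np.int64).tostring())   # b'\xff\x00\x00\x00\x00\x00\x00\x00'
print(np.array([255], dtype=np.uint8).tostring())   # b'\xff'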

For your code, try image = tf.decode_raw(features['image_raw'], tf.int64) instead.
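
A minimal sketch of how that fix would slot into read_and_decode() (the tf.cast to float32 is my assumption about what the normalization step expects):

# Read the raw bytes back with the dtype they were written in (int64, 8 bytes per value).
image = tf.decode_raw(features['image_raw'], tf.int64)
image = tf.reshape(image, (self.input_rows, self.input_cols, self.num_filters))
image = tf.cast(image, tf.float32)   # assumption: normalize/augment expect float values

Alternatively, casting the images to uint8 before serializing in convert_to() (e.g. images[index].astype(np.uint8).tostring(), assuming pixel values fit in 0-255) keeps the records 8x smaller and makes the existing tf.uint8 decode correct as written.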
