TensorFlow tfrecords: tostring() changes dimension of image


Problem description


I have built a model to train a convolutional autoencoder in TensorFlow. I followed the instructions on Reading Data from the TF documentation to read in my own images of size 233 x 233 x 3. Here is my convert_to() function adapted from those instructions:

import os
import tensorflow as tf

# Helper functions from the TF "Reading Data" instructions referenced above.
def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def convert_to(images, name):
  """Converts a dataset to tfrecords."""
  num_examples = images.shape[0]
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.tmp_dir, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    print(images[index].size)
    image_raw = images[index].tostring()
    print(len(image_raw))
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  writer.close()


When I print the size of the image at the start of the for loop, the size is 162867, but when I print after the .tostring() line, the size is 1302936. This causes problems down the road because the model thinks my input is 8x what it should be. Is it better to change the 'image_raw' entry in the Example to _int64_feature(image_raw) or to change the way I convert it to a string?


Alternatively, the problem could be in my read_and_decode() function, e.g. the string is not properly being decoded or the example not being parsed...?

def read_and_decode(self, filename_queue):
    reader = tf.TFRecordReader()

    _, serialized_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        serialized_example,
        features={
            'height': tf.FixedLenFeature([], tf.int64),
            'width': tf.FixedLenFeature([], tf.int64),
            'depth': tf.FixedLenFeature([], tf.int64),
            'image_raw': tf.FixedLenFeature([], tf.string)
      })

    # Convert from a scalar string tensor to a uint8 tensor
    image = tf.decode_raw(features['image_raw'], tf.uint8)

    # Reshape into a 233 x 233 x 3 image and apply distortions
    image = tf.reshape(image, (self.input_rows, self.input_cols, self.num_filters))

    image = data_sets.normalize(image)
    image = data_sets.apply_augmentation(image)

    return image

Thanks!

Answer


I may have some answers to your problem.


First, it's perfectly normal that your image is 8x longer after the .tostring() method, which converts your array to raw bytes. The name is misleading because in Python 3 bytes and strings are distinct types (they were the same in Python 2). By default, I'd guess your image array has dtype int64, so each element is encoded with 8 bytes (64 bits). In your example, the 162867 elements of your image are encoded as 1302936 bytes...
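You can check this 8x factor with NumPy alone, independent of TensorFlow. The sketch below assumes an all-zeros array with the question's shape (233 x 233 x 3); tobytes() is the modern name for the tostring() method used in the question and produces identical output:

```python
import numpy as np

# Hypothetical image with the question's shape: 233 x 233 x 3.
img = np.zeros((233, 233, 3), dtype=np.int64)

print(img.size)                # 162867 elements
print(len(img.tobytes()))      # 1302936 bytes: 8 bytes per int64 element

# The same array cast to uint8 serializes to exactly one byte per element.
print(len(img.astype(np.uint8).tobytes()))  # 162867 bytes
```

So the 1302936 the question reports is just 162867 * 8, not a change in the image's dimensions.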


Concerning the error during parsing, I think it comes from the fact that you write your data as int64 (integers encoded with 64 bits, so 8 bytes) but read them back as uint8 (unsigned integers encoded with 8 bits, so 1 byte). The same integer has a different byte sequence depending on whether it's stored as int64 or uint8. Writing your image as bytes is good practice when using tfrecord files, but you need to read those bytes back with the same dtype they were written with.


For your code, try image = tf.decode_raw(features['image_raw'], tf.int64) instead.
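A minimal NumPy sketch of why the read dtype must match the write dtype; np.frombuffer plays the role of tf.decode_raw here, and the small 2 x 2 x 3 array is a stand-in for the real image:

```python
import numpy as np

img = np.arange(12, dtype=np.int64).reshape(2, 2, 3)
raw = img.tobytes()  # what _bytes_feature stores in the Example

# Decoding with the matching dtype recovers the original array.
decoded = np.frombuffer(raw, dtype=np.int64).reshape(2, 2, 3)
assert (decoded == img).all()

# Decoding the same bytes as uint8 yields 8x as many elements,
# so the subsequent reshape to the image dimensions would fail.
wrong = np.frombuffer(raw, dtype=np.uint8)
print(wrong.size)  # 96, not 12
```

The alternative fix is to cast the images to uint8 before calling tostring() in convert_to(), which keeps decode_raw(..., tf.uint8) correct and also makes the tfrecord file 8x smaller.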
