tensorflow image_resize mess up image on unknown image size


Problem description

I have a list of variable size image and wish to standardise them into 256x256 size. I used the following code

import tensorflow as tf
import matplotlib.pyplot as plt

file_contents = tf.read_file('image.jpg')
im = tf.image.decode_jpeg(file_contents)
im = tf.image.resize_images(im, 256, 256)

sess = tf.Session()
sess.run(tf.initialize_all_variables())

img = sess.run(im)

plt.imshow(img)
plt.show()

However, tf.image.resize_images() tends to mess up the image. Applying tf.reshape() first, however, seems to let resize_images() work correctly.

TensorFlow version: 0.8.0

Original image:

Resized image:

I know the skimage package can handle what I need, but I wish to use tf.train.shuffle_batch(). I want to avoid maintaining two copies of the same dataset (one with a fixed image size), since Caffe seems to have no problem handling variable-size images.

Answer

This happens because image_resize() performs interpolation between adjacent pixels and returns floats instead of integers in the 0-255 range. That is why NEAREST_NEIGHBOR does work: it takes the value of one of the nearby pixels without doing further math. Suppose two adjacent pixels have values 240 and 241. NEAREST_NEIGHBOR will return either 240 or 241, but with any other method the value could be something like 240.5, and it is returned without rounding, presumably intentionally so that you can decide what is better for you (floor, round up, etc.). plt.imshow(), on the other hand, expects float images to be scaled between 0.0 and 1.0, so the out-of-range float values are not displayed as intended. To make the above code work, one possible solution is:

import numpy as np
plt.imshow(img.astype(np.uint8))  # cast floats back to integers in the 0-255 range
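The effect described above can be reproduced with plain NumPy, independent of any TensorFlow version. The pixel values 240 and 241 below are the hypothetical neighbors from the answer's example; the midpoint stands in for what a bilinear resize would produce between them, and the uint8 cast shows how the fix maps such floats back into the integer range imshow handles correctly:

```python
import numpy as np

# Two adjacent pixel values, as in the answer's example.
a, b = 240.0, 241.0

# An interpolating resize method can land between them, producing a
# non-integer value that imshow's 0.0-1.0 float convention misreads.
mid = (a + b) / 2.0
print(mid)  # 240.5

# Casting to uint8 truncates the floats back into the 0-255 integer
# range, which imshow displays as expected.
img = np.array([[a, mid, b]])
print(img.astype(np.uint8))  # [[240 240 241]]
```

Note that astype(np.uint8) truncates toward zero rather than rounding; if rounding matters for your use case, apply np.round() before the cast.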
