TensorFlow equivalent of PyTorch's transforms.Normalize()


Problem description


I'm trying to run inference with a TFLite model that was originally built in PyTorch. I have been following along the lines of the PyTorch implementation and have to preprocess images along the RGB channels. I found the closest TensorFlow equivalent of transforms.Normalize() to be tf.image.per_image_standardization() (documentation). Although this is a pretty good match, tf.image.per_image_standardization() computes a single mean and std over the whole image (all pixels and channels) and applies them globally, rather than per channel. Here's the full implementation from the TensorFlow source:

def per_image_standardization(image):
  """Linearly scales `image` to have zero mean and unit norm.
  This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
  of all values in image, and
  `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.
  `stddev` is the standard deviation of all values in `image`. It is capped
  away from zero to protect against division by 0 when handling uniform images.
  Args:
    image: 3-D tensor of shape `[height, width, channels]`.
  Returns:
    The standardized image with same shape as `image`.
  Raises:
    ValueError: if the shape of 'image' is incompatible with this function.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  num_pixels = math_ops.reduce_prod(array_ops.shape(image))

  image = math_ops.cast(image, dtype=dtypes.float32)
  image_mean = math_ops.reduce_mean(image)

  variance = (math_ops.reduce_mean(math_ops.square(image)) -
              math_ops.square(image_mean))
  variance = gen_nn_ops.relu(variance)
  stddev = math_ops.sqrt(variance)

  # Apply a minimum normalization that protects us against uniform images.
  min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float32))
  pixel_value_scale = math_ops.maximum(stddev, min_stddev)
  pixel_value_offset = image_mean

  image = math_ops.subtract(image, pixel_value_offset)
  image = math_ops.div(image, pixel_value_scale)
  return image

whereas PyTorch's transforms.Normalize() allows us to specify the mean and std to be applied to each channel, like below.

# transformation
pose_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

What would be a way to get this functionality in TensorFlow 2.x?

Edit: I created a quick botch that seems to solve this by defining a function as such:

def normalize_image(image, mean, std):
    # normalize each RGB channel in place: (x - mean_c) / std_c
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel])/std[channel]

    return image

I'm not sure how efficient this is, but it seems to get the job done. I still have to convert the output to a tensor before inputting it to the model.
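A vectorized sketch of the same idea, assuming the image is an HWC float array already scaled to [0, 1] (the helper name is mine), which avoids the Python loop:

import numpy as np
import tensorflow as tf

# hypothetical vectorized helper: broadcasts the per-channel mean/std over H and W
def normalize_image_vectorized(image, mean, std):
    mean = tf.constant(mean, dtype=tf.float32)  # shape (3,)
    std = tf.constant(std, dtype=tf.float32)    # shape (3,)
    return (tf.cast(image, tf.float32) - mean) / std

# e.g. norm = normalize_image_vectorized(np.array(img) / 255.0,
#                                        [0.485, 0.456, 0.406],
#                                        [0.229, 0.224, 0.225])

This also returns a tf.Tensor directly, so no extra conversion is needed before feeding the model.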

Solution

The workaround that you mentioned seems OK. But using a for loop to normalize each RGB channel of a single image can be a bit problematic when you deal with a large dataset in a data pipeline (generator or tf.data). Still, it works. Here is a demonstration of your approach, followed by a small tf.data sketch and then two possible alternatives that might work for you easily.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.pyplot import imshow, subplot, title, hist

# load image (RGB)
img = Image.open('/content/9.jpg')

def normalize_image(image, mean, std):
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel]) / std[channel]
    return image

OP_approach = normalize_image(np.array(img) / 255.0,
                              mean=[0.485, 0.456, 0.406],
                              std=[0.229, 0.224, 0.225])

Now, let's look at the properties of the normalized image.

plt.figure(figsize=(25,10))
subplot(121); imshow(OP_approach)
title(f'Normalized Image \n min-px: {OP_approach.min()} \n max-px: {OP_approach.max()}')
subplot(122); hist(OP_approach.ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

The minimum and maximum pixel values after normalization are (-2.1179039301310043, 2.6399999999999997), respectively.
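As a side note on the pipeline point above: when images come through tf.data, the same per-channel math can be applied inside a map call instead of a Python loop. A minimal sketch, assuming a dataset of decoded uint8 HWC images (the dataset construction itself is not shown):

import tensorflow as tf

IMAGENET_MEAN = tf.constant([0.485, 0.456, 0.406])
IMAGENET_STD  = tf.constant([0.229, 0.224, 0.225])

def preprocess(image):
    # uint8 HWC image -> float32 in [0, 1], then per-channel (x - mean) / std
    image = tf.cast(image, tf.float32) / 255.0
    return (image - IMAGENET_MEAN) / IMAGENET_STD

# hypothetical usage with an existing dataset `ds` of images:
# ds = ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE).batch(32)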

Option 2

We can use the Keras Normalization preprocessing layer to do the same. It takes two important arguments: mean and variance (the square of the std).

from tensorflow.keras.layers.experimental.preprocessing import Normalization

input_data = np.array(img) / 255
layer = Normalization(mean=[0.485, 0.456, 0.406],
                      variance=[np.square(0.299),  # note: the ImageNet R-channel std is 0.229, not 0.299
                                np.square(0.224),
                                np.square(0.225)])

plt.figure(figsize=(25,10))
subplot(121); imshow(layer(input_data).numpy())
title(f'Normalized Image \n min-px: {layer(input_data).numpy().min()} \n max-px: {layer(input_data).numpy().max()}')
subplot(122); hist(layer(input_data).numpy().ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

The minimum and maximum pixel values after normalization are (-2.0357144, 2.64), respectively. (The small gap from the first approach comes from the 0.299 noted in the code above; with 0.229 the minimum would match -2.117904.)
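A nice property of this option is that the Normalization layer can sit inside the model itself, so at inference time the input only needs to be scaled to [0, 1]. A minimal sketch (the architecture around the layer is purely hypothetical):

import tensorflow as tf

# hypothetical model that bakes the normalization in; `layer` is the
# Normalization layer defined above
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(224, 224, 3)),
    layer,                                    # per-channel (x - mean) / std
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])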

Option 3

This is more like subtracting the average of the channel means and dividing by the average of the channel stds: 0.449 is roughly the mean of (0.485, 0.456, 0.406) and 0.226 roughly the mean of (0.229, 0.224, 0.225).

norm_img = ((tf.cast(np.array(img), tf.float32) / 255.0) - 0.449) / 0.226

plt.figure(figsize=(25,10))
subplot(121); imshow(norm_img.numpy())
title(f'Normalized Image \n min-px: {norm_img.numpy().min()} \n max-px: {norm_img.numpy().max()}')
subplot(122); hist(norm_img.numpy().ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

The minimum and maximum pixel values after normalization are (-1.9867257, 2.4380531), respectively. Lastly, if we compare with the PyTorch way, there is not much difference among these approaches.

import torchvision.transforms as transforms

transform_norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
norm_pt = transform_norm(img)

plt.figure(figsize=(25,10))
subplot(121); imshow(np.array(norm_pt).transpose(1, 2, 0))
title(f'Normalized Image \n min-px: {np.array(norm_pt).min()} \n max-px: {np.array(norm_pt).max()}')
subplot(122); hist(np.array(norm_pt).ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

The minimum and maximum pixel values after normalization are (-2.117904, 2.64), respectively.
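As a final sanity check (assuming the same img and the norm_pt tensor from above), the per-channel NumPy computation and the PyTorch output can be compared directly; they should agree up to float32 rounding:

import numpy as np

# per-channel normalization in NumPy vs. the PyTorch output (CHW -> HWC)
norm_np = (np.array(img, dtype=np.float32) / 255.0
           - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
norm_pt_hwc = np.array(norm_pt).transpose(1, 2, 0)

print(np.allclose(norm_np, norm_pt_hwc, atol=1e-5))  # should print True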
