Difference between Keras and TensorFlow Hub Version of MobileNetV2


Problem Description

I am working on a transfer learning approach and got very different results when using the MobileNetV2 from keras.applications and the one available on TensorFlow Hub. This seems strange to me, as both versions claim here and here to extract their weights from the same checkpoint mobilenet_v2_1.0_224. This is how the differences can be reproduced; you can find the Colab Notebook here:

!pip install tensorflow-gpu==2.1.0
import tensorflow as tf
import numpy as np
import tensorflow_hub as hub
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2

def create_model_keras():
  # MobileNetV2 from keras.applications, including the classification head.
  image_input = tf.keras.Input(shape=(224, 224, 3))
  out = MobileNetV2(input_shape=(224, 224, 3),
                    include_top=True)(image_input)
  model = tf.keras.models.Model(inputs=image_input, outputs=out)
  model.compile(optimizer='adam', loss=["categorical_crossentropy"])
  return model

def create_model_tf():
  # MobileNetV2 classification model from TensorFlow Hub.
  image_input = tf.keras.Input(shape=(224, 224, 3))
  out = hub.KerasLayer("https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4",
                       input_shape=(224, 224, 3))(image_input)
  model = tf.keras.models.Model(inputs=image_input, outputs=out)
  model.compile(optimizer='adam', loss=["categorical_crossentropy"])
  return model

When I try to predict on a random batch, the results are not equal:

keras_model = create_model_keras()
tf_model = create_model_tf()
np.random.seed(42)
data = np.random.rand(32, 224, 224, 3)
out_keras = keras_model.predict_on_batch(data)
out_tf = tf_model.predict_on_batch(data)
np.array_equal(out_keras, out_tf)  # -> False

The output of the version from keras.applications sums to 1, but the version from TensorFlow Hub does not. The output shapes of the two versions also differ: TensorFlow Hub has 1001 labels, keras.applications has 1000.

np.sum(out_keras[0]), np.sum(out_tf[0])

This prints (1.0000001, -14.166359).
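
The shape difference can be checked the same way; the expected result follows directly from the label counts above:

out_keras.shape, out_tf.shape
# -> ((32, 1000), (32, 1001))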

What is the reason for these differences? Am I missing something?

Edit 18.02.2020

As Szymon Maszke pointed out, the TFHub version returns logits. That's why I added a Softmax layer to create_model_tf as follows: out = tf.keras.layers.Softmax()(x)
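
A minimal sketch of the modified create_model_tf (unchanged from the original definition except for the added Softmax layer):

def create_model_tf():
  image_input = tf.keras.Input(shape=(224, 224, 3))
  x = hub.KerasLayer("https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4",
                     input_shape=(224, 224, 3))(image_input)
  out = tf.keras.layers.Softmax()(x)  # turn the TF Hub logits into probabilities
  model = tf.keras.models.Model(inputs=image_input, outputs=out)
  model.compile(optimizer='adam', loss=["categorical_crossentropy"])
  return model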

arnoegw mentioned that the TfHub version requires an image normalized to [0,1], whereas the keras version requires normalization to [-1,1]. When I use the following preprocessing on a test image:

# Preprocessing for the keras.applications model (preprocess_input maps pixels to [-1, 1])
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
img = tf.keras.preprocessing.image.load_img("/content/panda.jpeg", target_size=(224, 224))
img = tf.keras.preprocessing.image.img_to_array(img)
img = preprocess_input(img)

# Preprocessing for the TF Hub model (float32 pixels in [0, 1])
img = tf.io.read_file("/content/panda.jpeg")
img = tf.image.decode_jpeg(img)
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img, (224, 224))

Both correctly predict the same label and the following condition is true: np.allclose(out_keras, out_tf[:,1:], rtol=0.8)
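
A minimal sketch of that comparison, assuming the two preprocessed images are kept in separate variables (img_keras and img_tf, hypothetical names) and that tf_model includes the Softmax layer from the edit above:

out_keras = keras_model.predict_on_batch(img_keras[np.newaxis, ...])
out_tf = tf_model.predict_on_batch(img_tf[tf.newaxis, ...])
np.allclose(out_keras, out_tf[:, 1:], rtol=0.8)  # drop the extra class 0 from the TF Hub output; -> True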

Edit 2 18.02.2020

Earlier I wrote that it was not possible to convert the formats into each other; this was caused by a bug.

Answer

There are some documented differences; a short sketch tying them together follows the list:

  • Like Szymon said, the TF Hub version returns logits (before the softmax function that turns them into probabilities), which is a common practice, because the cross-entropy loss can be computed with greater numerical stability from the logits.

  • The TF Hub model assumes float32 inputs in the range of [0,1], which is what you get from tf.image.decode_jpeg(...) followed by tf.image.convert_image_dtype(..., tf.float32). The Keras code uses a model-specific range (for MobileNetV2, [-1,+1]).

  • The TF Hub model reflects the original SLIM checkpoint more completely in returning all its 1001 output classes. As stated in the ImageNetLabels.txt linked from the documentation, the added class 0 is "background" (a.k.a. "stuff"). That is what object detection uses to indicate image background, as opposed to an object of any known class.
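
A minimal sketch (not part of the original answer) of how these three differences can be reconciled, reusing keras_model and the original, Softmax-free tf_model from the question, plus a hypothetical float32 batch batch_0_1 already scaled to [0, 1]:

logits_tf = tf_model.predict_on_batch(batch_0_1)        # shape (N, 1001), raw logits
probs_tf = tf.nn.softmax(logits_tf)[:, 1:]              # softmax, then drop class 0 ("background")

probs_keras = keras_model.predict_on_batch(batch_0_1 * 2.0 - 1.0)  # rescale [0, 1] -> [-1, +1]

# For training, the cross-entropy loss can be computed directly from the logits:
loss_fn = tf.keras.losses.CategoricalCrossentropy(from_logits=True)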
