Keras VGG16 preprocess_input modes


Problem Description

I'm using the Keras VGG16 model.

I've seen that there is a preprocess_input method to use in conjunction with the VGG16 model. This method appears to call the preprocess_input method in imagenet_utils.py, which (depending on the case) calls the _preprocess_numpy_input method in imagenet_utils.py.

preprocess_input has a mode argument which expects "caffe", "tf", or "torch". If I'm using the model in Keras with the TensorFlow backend, should I absolutely use mode="tf"?

If yes, is this because the VGG16 model loaded by Keras was trained on images which underwent the same preprocessing (i.e. changing the input image's range from [0, 255] to [-1, 1])?

Also, should the input images at test time undergo this same preprocessing? I'm confident the answer to that last question is yes, but I would like some reassurance.

I would expect Francois Chollet to have done it correctly, but looking at https://github.com/fchollet/deep-learning-models/blob/master/vgg16.py, either he or I am wrong about using mode="tf".

Update

@FalconUA directed me to the VGG page at Oxford, which has a Models section with links for the 16-layer model. The information about the preprocessing_input mode argument (tf scaling to [-1, 1] and caffe subtracting mean values) is found by following the 16-layer model's information page link in that Models section. In the Description section it says:

"In the paper, the model is denoted as the configuration D trained with scale jittering. The input images should be zero-centered by mean pixel (rather than mean image) subtraction. Namely, the following BGR values should be subtracted: [103.939, 116.779, 123.68]."

Recommended Answer

The mode here is not about the backend, but rather about the framework the model was trained in and ported from. In the Keras documentation for VGG16, it is stated that:

These weights are ported from the ones released by VGG at Oxford.

So the VGG16 and VGG19 models were trained in Caffe and ported to TensorFlow, hence mode == 'caffe' here (values stay in the range 0 to 255, and the mean pixel [103.939, 116.779, 123.68] is then subtracted).
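As a minimal sketch of what 'caffe' mode does per pixel (the function and constant names here are hypothetical; the actual implementation lives in keras.applications.imagenet_utils):

```python
# Hypothetical sketch of 'caffe'-mode preprocessing: convert an RGB
# pixel to BGR and subtract the ImageNet mean pixel per channel.
# Mean BGR values come from the Oxford VGG model description quoted above.
IMAGENET_MEAN_BGR = [103.939, 116.779, 123.68]

def preprocess_caffe_pixel(rgb):
    """rgb: [R, G, B] values in the range 0-255."""
    bgr = [rgb[2], rgb[1], rgb[0]]  # RGB -> BGR channel reorder
    return [c - m for c, m in zip(bgr, IMAGENET_MEAN_BGR)]

print(preprocess_caffe_pixel([123.68, 116.779, 103.939]))  # -> [0.0, 0.0, 0.0]
```

Note that the output is not rescaled: zero-centered values can still span roughly [-124, 152], unlike 'tf' mode.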

Newer networks, like MobileNet and ShuffleNet, were trained in TensorFlow, so mode is 'tf' for them and the inputs are zero-centered in the range from -1 to 1.
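A minimal sketch of the 'tf'-mode scaling, under the assumption that it simply maps [0, 255] linearly onto [-1, 1] (function name is hypothetical):

```python
# Hypothetical sketch of 'tf'-mode preprocessing: linearly rescale a
# pixel value from [0, 255] to [-1, 1]. No channel reorder, no mean pixel.
def preprocess_tf_pixel(x):
    """x: a pixel value in the range 0-255."""
    return x / 127.5 - 1.0

print(preprocess_tf_pixel(0))      # -> -1.0
print(preprocess_tf_pixel(255))    # -> 1.0
print(preprocess_tf_pixel(127.5))  # -> 0.0
```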

