Getting error on ML-Engine predict but local predict works fine


Question


I have searched a lot here but unfortunately could not find an answer.


I am running TensorFlow 1.3 (installed via pip on macOS) on my local machine, and have created a model using the provided "ssd_mobilenet_v1_coco" checkpoints.


I managed to train both locally and on the ML-Engine (Runtime 1.2), and successfully deployed my SavedModel to the ML-Engine.


Local predictions (command below) work fine and I get the model results:

gcloud ml-engine local predict --model-dir=... --json-instances=request.json

 FILE request.json: {"inputs": [[[242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 239], [242, 240, 23]]]}
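(Not from the original post, but for reference: a minimal Python sketch of how such a request.json can be generated. The "inputs" key is an assumption taken from the example above and must match the input tensor name in the model's serving signature.)

```python
import json

def make_request_json(pixels, path="request.json"):
    # gcloud's --json-instances flag expects one JSON object per line,
    # each object being a single prediction instance.
    instance = {"inputs": pixels}
    with open(path, "w") as f:
        f.write(json.dumps(instance) + "\n")
    return instance

# A tiny 1x5 RGB "image", mirroring the example request above.
make_request_json([[[242, 240, 239]] * 5])
```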


However, when deploying the model and trying to run remote predictions on the ML-Engine with the command below:

gcloud ml-engine predict --model "testModel" --json-instances request.json (SAME JSON FILE AS BEFORE)

I get this error:

{
  "error": "Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'data_format' not in Op<name=DepthwiseConv2dNative; signature=input:T, filter:T -> output:T; attr=T:type,allowed=[DT_FLOAT, DT_DOUBLE]; attr=strides:list(int); attr=padding:string,allowed=[\"SAME\", \"VALID\"]>; NodeDef: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)\n\t [[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_1_depthwise/depthwise = DepthwiseConv2dNative[T=DT_FLOAT, _output_shapes=[[-1,150,150,32]], data_format=\"NHWC\", padding=\"SAME\", strides=[1, 1, 1, 1], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_0/Relu6, FeatureExtractor/MobilenetV1/Conv2d_1_depthwise/depthwise_weights/read)]]\")"
}


I saw something similar here: https://github.com/tensorflow/models/issues/1581


There the problem was with the 'data_format' parameter. But unfortunately I could not use that solution, since I am already on TensorFlow 1.3.


It also seems that it might be a problem with MobilenetV1: https://github.com/tensorflow/models/issues/2153

Any ideas?

Answer


I had a similar issue. It was due to a mismatch between the TensorFlow versions used for training and inference. I solved it by using TensorFlow 1.4 for both training and inference.

Please refer to this answer.
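As a rough illustration of the failure mode (a hypothetical helper, not part of the answer): a graph exported by a newer TensorFlow can record op attributes, such as the 'data_format' attr in the error above, that an older serving runtime's op registry does not recognize, so the training version should not exceed the serving runtime version.

```python
def version_tuple(v):
    """'1.3.0' -> (1, 3): compare only major/minor, as ML-Engine
    runtime versions are given that way (e.g. '1.2')."""
    return tuple(int(p) for p in v.split(".")[:2])

def compatible(train_version, runtime_version):
    # A graph exported by a newer TF may use attrs an older runtime
    # does not know, so require train <= runtime.
    return version_tuple(train_version) <= version_tuple(runtime_version)

print(compatible("1.3.0", "1.2"))  # False: the mismatch in the question
print(compatible("1.4.0", "1.4"))  # True: train and serve both on 1.4
```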
