Retrained inception_v3 model deployed in Cloud ML Engine always outputs the same predictions

Question

I followed the codelab TensorFlow For Poets for transfer learning using inception_v3. It generates retrained_graph.pb and retrained_labels.txt files, which can be used to make predictions locally (running label_image.py).

Then, I wanted to deploy this model to Cloud ML Engine, so that I could make online predictions. For that, I had to export the retrained_graph.pb to SavedModel format. I managed to do it by following the indications in this answer from Google's @rhaertel80 and this python file from the Flowers Cloud ML Engine Tutorial. Here is my code:

import tensorflow as tf
from tensorflow.contrib import layers

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

def build_signature(inputs, outputs):
    signature_inputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in inputs.items() }
    signature_outputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in outputs.items() }

    signature_def = signature_def_utils.build_signature_def(
        signature_inputs,
        signature_outputs,
        signature_constants.PREDICT_METHOD_NAME
    )

    return signature_def

class GraphReferences(object):
  def __init__(self):
    self.examples = None
    self.train = None
    self.global_step = None
    self.metric_updates = []
    self.metric_values = []
    self.keys = None
    self.predictions = []
    self.input_jpeg = None

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def build_image_str_tensor(self):
        image_str_tensor = tf.placeholder(tf.string, shape=[None])

        def decode_and_resize(image_str_tensor):
            return image_str_tensor

        image = tf.map_fn(
            decode_and_resize,
            image_str_tensor,
            back_prop=False,
            dtype=tf.string
        )

        return image_str_tensor

    def build_prediction_graph(self, g):
        tensors = GraphReferences()
        tensors.examples = tf.placeholder(tf.string, name='input', shape=(None,))
        tensors.input_jpeg = self.build_image_str_tensor()

        keys_placeholder = tf.placeholder(tf.string, shape=[None])
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }

        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }

        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, name="")

            g = tf.get_default_graph()
            inputs, outputs = self.build_prediction_graph(g)

            signature_def = build_signature(inputs=inputs, outputs=outputs)
            signature_def_map = {
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
            }

            builder = saved_model_builder.SavedModelBuilder(output_dir)
            builder.add_meta_graph_and_variables(
                sess,
                tags=[tag_constants.SERVING],
                signature_def_map=signature_def_map
            )
            builder.save()

model = Model(label_count)
model.export(export_dir)

This code generates a saved_model.pb file, which I then used to create the Cloud ML Engine model. I can get predictions from this model using gcloud ml-engine predict --model my_model_name --json-instances request.json, where the contents of request.json are:

{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }

However, no matter which jpeg I encode in the request, I always get the exact same wrong predictions:

[prediction output screenshot]

I guess the problem is in the way the CloudML Prediction API passes the base64-encoded image bytes to the input tensor "DecodeJpeg/contents:0" of inception_v3 (the build_image_str_tensor() method in the previous code). Any clue on how I can solve this issue and have my locally retrained model serve correct predictions on Cloud ML Engine?

(Just to make it clear, the problem is not in retrained_graph.pb, as it makes correct predictions when I run it locally; nor is it in request.json, because the same request file worked without problems when following the Flowers Cloud ML Engine tutorial mentioned above.)

Answer

First, a general warning. The TensorFlow for Poets codelab was not written in a way that is very amenable to production serving (partly manifested by the workarounds you are having to implement). You would normally export a prediction-specific graph that doesn't contain all of the extra training ops. So while we can try and hack something together that works, extra work may be needed to productionize this graph.

The approach of your code appears to be to import one graph, add some placeholders, and then export the result. This is generally fine. However, in the code shown in the question, you are adding input placeholders without actually connecting them to anything in the imported graph. You end up with a graph containing multiple disconnected subgraphs, something like (excuse the crude diagram):

image_str_tensor [input=image_bytes] -> <nothing>
keys_placeholder [input=key]  -> identity [output=key]
inception_subgraph -> final_graph [output=prediction]

By inception_subgraph I mean all of the ops that you are importing.

So image_bytes is effectively a no-op and is ignored; key gets passed through; and prediction contains the result of running the inception_subgraph. Since it's not using the input you are passing, it returns the same result every time (though I admit I actually expected an error here).

To address this problem, we would need to connect the placeholder you've created to the one that already exists in inception_subgraph to create a graph more or less like this:

image_str_tensor [input=image_bytes] -> inception_subgraph -> final_graph [output=prediction]
keys_placeholder [input=key]  -> identity [output=key]   
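
To find the name of that existing placeholder (the one the import needs to be wired to), a quick inspection along these lines can help; this is a sketch using the same TF 1.x API and file path as the rest of this post:

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('../tf_files/retrained_graph2.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as g:
    tf.import_graph_def(graph_def, name='')
    # Placeholders are the graph's dangling inputs; one of them is the
    # JPEG input that import_graph_def's input_map must target.
    print([op.name for op in g.get_operations() if op.type == 'Placeholder'])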

Note that image_str_tensor is going to be a batch of images, as required by the prediction service, but the inception graph's input is actually a single image. In the interest of simplicity, we're going to address this in a hacky way: we'll assume we'll be sending images one-by-one. If we ever send more than one image per request, we'll get errors. Also, batch prediction will never work.

The main change you need is the import statement, which connects the placeholder we've added to the existing input in the graph (you'll also see the code for changing the shape of the input):
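
In isolation, the key lines look like this (excerpted from the full listing below):

image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
coerced = tf.squeeze(image_bytes)
tf.import_graph_def(graph_def, input_map={'DecodeJPGInput:0': coerced}, name="")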

Putting it all together, we get something like:

import tensorflow as tf

export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            # This will be our input that accepts a batch of inputs
            image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
            # Force it to be a single input; will raise an error if we send a batch.
            coerced = tf.squeeze(image_bytes)
            # When we import the graph, we'll connect `coerced` to `DecodeJPGInput:0`
            input_map = {'DecodeJPGInput:0': coerced}

            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, input_map=input_map, name="")

            keys_placeholder = tf.placeholder(tf.string, shape=[None])

            inputs = {'image_bytes': image_bytes, 'key': keys_placeholder}

            keys = tf.identity(keys_placeholder)
            outputs = {
                'key': keys,
                'prediction': tf.get_default_graph().get_tensor_by_name('final_result:0')
            }

            tf.saved_model.simple_save(sess, output_dir, inputs, outputs)

model = Model(label_count)
model.export(export_dir)
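
Before redeploying, it's worth smoke-testing the export locally. A minimal sketch, where the test image path is hypothetical and the tensor names come from the code above:

import tensorflow as tf

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], '../tf_files/saved7')
    with tf.gfile.GFile('test_image.jpg', 'rb') as f:  # hypothetical test image
        jpeg_bytes = f.read()
    # Feed a batch of one, matching the 'input' placeholder defined in export().
    preds = sess.run('final_result:0', feed_dict={'input:0': [jpeg_bytes]})
    print(preds)  # should now differ from image to image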
