Retrained inception_v3 model deployed in Cloud ML Engine always outputs the same predictions

Problem description

I followed the codelab TensorFlow For Poets for transfer learning using inception_v3. It generates retrained_graph.pb and retrained_labels.txt files, which can be used to make predictions locally (by running label_image.py).
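For context, local prediction with those two files amounts to loading the frozen graph and feeding the raw JPEG bytes into it. A minimal sketch, assuming the usual inception_v3 tensor names from the codelab (DecodeJpeg/contents:0 in, final_result:0 out) and a placeholder image path:

import tensorflow as tf

graph_def = tf.GraphDef()
with tf.gfile.GFile('retrained_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Session() as sess:
    tf.import_graph_def(graph_def, name='')
    # Hypothetical test image; any local JPEG works.
    with open('some_image.jpg', 'rb') as img:
        scores = sess.run('final_result:0',
                          feed_dict={'DecodeJpeg/contents:0': img.read()})
    print(scores)  # one score per class in retrained_labels.txt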

Then, I wanted to deploy this model to Cloud ML Engine, so that I could make online predictions. For that, I had to export the retrained_graph.pb to SavedModel format. I managed to do it by following the indications in this answer from Google's @rhaertel80 and this python file from the Flowers Cloud ML Engine Tutorial. Here is my code:

import tensorflow as tf
from tensorflow.contrib import layers

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

def build_signature(inputs, outputs):
    signature_inputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in inputs.items() }
    signature_outputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in outputs.items() }

    signature_def = signature_def_utils.build_signature_def(
        signature_inputs,
        signature_outputs,
        signature_constants.PREDICT_METHOD_NAME
    )

    return signature_def

class GraphReferences(object):
  def __init__(self):
    self.examples = None
    self.train = None
    self.global_step = None
    self.metric_updates = []
    self.metric_values = []
    self.keys = None
    self.predictions = []
    self.input_jpeg = None

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def build_image_str_tensor(self):
        image_str_tensor = tf.placeholder(tf.string, shape=[None])

        def decode_and_resize(image_str_tensor):
            return image_str_tensor

        image = tf.map_fn(
            decode_and_resize,
            image_str_tensor,
            back_prop=False,
            dtype=tf.string
        )

        return image_str_tensor

    def build_prediction_graph(self, g):
        tensors = GraphReferences()
        tensors.examples = tf.placeholder(tf.string, name='input', shape=(None,))
        tensors.input_jpeg = self.build_image_str_tensor()

        keys_placeholder = tf.placeholder(tf.string, shape=[None])
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }

        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }

        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, name="")

            g = tf.get_default_graph()
            inputs, outputs = self.build_prediction_graph(g)

            signature_def = build_signature(inputs=inputs, outputs=outputs)
            signature_def_map = {
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
            }

            builder = saved_model_builder.SavedModelBuilder(output_dir)
            builder.add_meta_graph_and_variables(
                sess,
                tags=[tag_constants.SERVING],
                signature_def_map=signature_def_map
            )
            builder.save()

model = Model(label_count)
model.export(export_dir)

This code generates a saved_model.pb file, which I then used to create the Cloud ML Engine model. I can get predictions from this model using gcloud ml-engine predict --model my_model_name --json-instances request.json, where the contents of request.json are:

{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }

However, no matter which jpeg I encode in the request, I always get the exact same wrong predictions:

[prediction output screenshot]

I guess the problem is in the way the Cloud ML Prediction API passes the base64-encoded image bytes to the input tensor "DecodeJpeg/contents:0" of inception_v3 (the build_image_str_tensor() method in the code above). Any clue on how I can solve this issue and have my locally retrained model serve correct predictions on Cloud ML Engine?

(Just to be clear, the problem is not in retrained_graph.pb, as it makes correct predictions when I run it locally; nor is it in request.json, because the same request file worked without problems when following the Flowers Cloud ML Engine tutorial linked above.)

Answer

First, a general warning. The TensorFlow for Poets codelab was not written in a way that is very amenable to production serving (partly manifested by the workarounds you are having to implement). You would normally export a prediction-specific graph that doesn't contain all of the extra training ops. So while we can try and hack something together that works, extra work may be needed to productionize this graph.

The approach of your code appears to be to import one graph, add some placeholders, and then export the result. This is generally fine. However, in the code shown in the question, you are adding input placeholders without actually connecting them to anything in the imported graph. You end up with a graph containing multiple disconnected subgraphs, something like (excuse the crude diagram):

image_str_tensor [input=image_bytes] -> <nothing>
keys_placeholder [input=key]  -> identity [output=key]
inception_subgraph -> final_graph [output=prediction]

By inception_subgraph I mean all of the ops that you are importing.

So image_bytes is effectively a no-op and is ignored; key gets passed through; and prediction contains the result of running the inception_subgraph; since it's not using the input you are passing, it returns the same result every time (though I admit I actually expected an error here).

To address this problem, we would need to connect the placeholder you've created to the one that already exists in inception_subgraph to create a graph more or less like this:

image_str_tensor [input=image_bytes] -> inception_subgraph -> final_graph [output=prediction]
keys_placeholder [input=key]  -> identity [output=key]   

Note that image_str_tensor is going to be a batch of images, as required by the prediction service, but the inception graph's input is actually a single image. In the interest of simplicity, we're going to address this in a hacky way: we'll assume we'll be sending images one-by-one. If we ever send more than one image per request, we'll get errors. Also, batch prediction will never work.

The main change you need is in the tf.import_graph_def call, which connects the placeholder we've added to the existing input in the graph (you'll also see the code for changing the shape of the input):
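A sketch of just that change, excerpted from the full listing below:

# Our new batched input placeholder.
image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
# Coerce the batch of size one down to a single image string.
coerced = tf.squeeze(image_bytes)
# input_map rewires the graph's existing JPEG input to our placeholder;
# graph_def is the GraphDef parsed from retrained_graph.pb, as below.
tf.import_graph_def(graph_def, input_map={'DecodeJPGInput:0': coerced}, name='')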

Putting it all together, we get something like:

import tensorflow as tf


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            # This will be our input that accepts a batch of inputs
            image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
            # Force it to be a single input; will raise an error if we send a batch.
            coerced = tf.squeeze(image_bytes)
            # When we import the graph, we'll connect `coerced` to the graph's
            # JPEG input tensor. The name must match the decode op in *your*
            # retrained graph: 'DecodeJPGInput:0' for TF for Poets 2 graphs,
            # 'DecodeJpeg/contents:0' for the older inception_v3 tutorial graph.
            input_map = {'DecodeJPGInput:0': coerced}

            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, input_map=input_map, name="")

            keys_placeholder = tf.placeholder(tf.string, shape=[None])

            inputs = {'image_bytes': image_bytes, 'key': keys_placeholder}

            keys = tf.identity(keys_placeholder)
            outputs = {
                'key': keys,
                'prediction': tf.get_default_graph().get_tensor_by_name('final_result:0')}    
            }

            tf.simple_save(sess, output_dir, inputs, outputs)

model = Model(label_count)
model.export(export_dir)
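To sanity-check the export before deploying, you can load the SavedModel back into a fresh session and run it on a couple of different images; each should now produce different scores. A minimal sketch, with a placeholder image path:

import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
    g = sess.graph
    # Hypothetical test image.
    with open('some_image.jpg', 'rb') as f:
        jpeg_bytes = f.read()
    scores = sess.run(g.get_tensor_by_name('final_result:0'),
                      feed_dict={g.get_tensor_by_name('input:0'): [jpeg_bytes]})
    print(scores)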
