Deploying Keras model to Google Cloud ML for serving predictions


Question

I need to understand how to deploy models on Google Cloud ML. My first task is to deploy a very simple text classifier on the service. I do it in the following steps (could perhaps be shortened to fewer steps, if so, feel free to let me know):

  1. Define the model in Keras and export it to YAML
  2. Load the YAML and export it as a TensorFlow SavedModel
  3. Upload the model to Google Cloud Storage
  4. Deploy the model from storage to Google Cloud ML
  5. Set the uploaded model version as the default on the model's website.
  6. Run the model with a sample input

I've finally made steps 1-5 work, but now I get the strange error seen below when running the model. Can anyone help? Details on the steps are below. Hopefully it can also help others who are stuck on one of the previous steps. My model works fine locally.

I've seen Deploying Keras Models via Google Cloud ML and Export a basic Tensorflow model to Google Cloud ML, but they seem to be stuck on other steps of the process.

Error

Prediction failed: Exception during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="In[0] is not a matrix
         [[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[-1,64]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](Mean, softmax_W/read)]]")

Step 1

# imports for the Keras functional API used below
from keras.layers import Input, Embedding, GlobalAveragePooling1D, Dense
from keras.models import Model
model_input = Input(shape=(maxlen,), dtype='int32')
embed = Embedding(input_dim=nb_tokens,
                  output_dim=256,
                  mask_zero=False,
                  input_length=maxlen,
                  name='embedding')
x = embed(model_input)
x = GlobalAveragePooling1D()(x)
outputs = [Dense(nb_classes, activation='softmax', name='softmax')(x)]
model = Model(input=[model_input], output=outputs, name="fasttext")
# export to YAML..
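
For reference, the YAML export hinted at by the final comment could be done roughly like this (the file names are illustrative):

with open('fasttext.yaml', 'w') as f:
    f.write(model.to_yaml())
model.save_weights('fasttext_weights.h5')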

Step 2

from __future__ import print_function

import sys
import os

import tensorflow as tf
from tensorflow.contrib.session_bundle import exporter
import keras
from keras import backend as K
from keras.models import model_from_config, model_from_yaml
from optparse import OptionParser

EXPORT_VERSION = 1 # for us to keep track of different model versions (integer)

def export_model(model_def, model_weights, export_path):

    with tf.Session() as sess:
        init_op = tf.global_variables_initializer()
        sess.run(init_op)

        K.set_learning_phase(0)  # all new operations will be in test mode from now on

        yaml_file = open(model_def, 'r')
        yaml_string = yaml_file.read()
        yaml_file.close()

        model = model_from_yaml(yaml_string)

        # force initialization
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam') 
        Wsave = model.get_weights()
        model.set_weights(Wsave)

        # weights are not loaded as I'm just testing, not really deploying
        # model.load_weights(model_weights)   

        print(model.input)
        print(model.output)

        pred_node_names = output_node_names = 'Softmax:0'
        num_output = 1

        export_path_base = export_path
        export_path = os.path.join(
            tf.compat.as_bytes(export_path_base),
            tf.compat.as_bytes('initial'))
        builder = tf.saved_model.builder.SavedModelBuilder(export_path)

        # Build the signature_def_map.
        x = model.input
        y = model.output

        values, indices = tf.nn.top_k(y, 5)
        table = tf.contrib.lookup.index_to_string_table_from_tensor(tf.constant([str(i) for i in xrange(5)]))
        prediction_classes = table.lookup(tf.to_int64(indices))

        classification_inputs = tf.saved_model.utils.build_tensor_info(model.input)
        classification_outputs_classes = tf.saved_model.utils.build_tensor_info(prediction_classes)
        classification_outputs_scores = tf.saved_model.utils.build_tensor_info(values)
        classification_signature = (
        tf.saved_model.signature_def_utils.build_signature_def(inputs={tf.saved_model.signature_constants.CLASSIFY_INPUTS: classification_inputs},
          outputs={tf.saved_model.signature_constants.CLASSIFY_OUTPUT_CLASSES: classification_outputs_classes, tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: classification_outputs_scores},
          method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME))

        tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
        tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

        prediction_signature = (tf.saved_model.signature_def_utils.build_signature_def(
            inputs={'images': tensor_info_x},
            outputs={'scores': tensor_info_y},
            method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

        legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')
        builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={'predict_images': prediction_signature,
               tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: classification_signature,},
            legacy_init_op=legacy_init_op)

        builder.save()
        print('Done exporting!')

        raise SystemExit

if __name__ == '__main__':
    usage = "usage: %prog [options] arg"
    parser = OptionParser(usage)
    (options, args) = parser.parse_args()

    if len(args) < 3:   
        raise ValueError("Too few arguments!")

    model_def = args[0]
    model_weights = args[1]
    export_path = args[2]
    export_model(model_def, model_weights, export_path)
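
Assuming the script above is saved as export_model.py (the name is illustrative), it would be invoked roughly like this, which writes the SavedModel into fasttext_cloud/initial:

python export_model.py fasttext.yaml fasttext_weights.h5 fasttext_cloud/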

Step 3

gsutil cp -r fasttext_cloud/ gs://quiet-notch-xyz.appspot.com
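
Before deploying, it may be worth verifying that the saved_model.pb and variables files actually landed in the bucket, e.g. with:

gsutil ls -r gs://quiet-notch-xyz.appspot.com/fasttext_cloud/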

Step 4

from __future__ import print_function

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
from googleapiclient import errors
import time

projectID = 'projects/{}'.format('quiet-notch-xyz')
modelName = 'fasttext'
modelID = '{}/models/{}'.format(projectID, modelName)
versionName = 'Initial'
versionDescription = 'Initial release.'
trainedModelLocation = 'gs://quiet-notch-xyz.appspot.com/fasttext/'

credentials = GoogleCredentials.get_application_default()
ml = discovery.build('ml', 'v1', credentials=credentials)

# Create a dictionary with the fields from the request body.
requestDict = {'name': modelName, 'description': 'Online predictions.'}

# Create a request to call projects.models.create.
request = ml.projects().models().create(parent=projectID, body=requestDict)

# Make the call.
try:
    response = request.execute()
except errors.HttpError as err: 
    # Something went wrong, print out some information.
    print('There was an error creating the model.' +
        ' Check the details:')
    print(err._get_reason())

    # Clear the response for next time.
    response = None
    raise


time.sleep(10)

requestDict = {'name': versionName,
               'description': versionDescription,
               'deploymentUri': trainedModelLocation}

# Create a request to call projects.models.versions.create
request = ml.projects().models().versions().create(parent=modelID,
              body=requestDict)

# Make the call.
try:
    print("Creating model setup..", end=' ')
    response = request.execute()

    # Get the operation name.
    operationID = response['name']
    print('Done.')

except errors.HttpError as err:
    # Something went wrong, print out some information.
    print('There was an error creating the version.' +
          ' Check the details:')
    print(err._get_reason())
    raise

done = False
request = ml.projects().operations().get(name=operationID)
print("Adding model from storage..", end=' ')

while (not done):
    response = None

    # Wait for 10000 milliseconds.
    time.sleep(10)

    # Make the next call.
    try:
        response = request.execute()

        # Check for finish.
        done = True # response.get('done', False)

    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error getting the operation.' +
              'Check the details:')
        print(err._get_reason())
        done = True
        raise

print("Done.")

Step 5

Using the website.
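
Alternatively, setting the default version can presumably also be done from the CLI instead of the console, using the version name from the step 4 script:

gcloud ml-engine versions set-default Initial --model fasttext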

Step 6

import googleapiclient.discovery

def predict_json(instances, project='quiet-notch-xyz', model='fasttext', version=None):
    """Send json data to a deployed model for prediction.

    Args:
        project (str): project where the Cloud ML Engine Model is deployed.
        model (str): model name.
        instances ([Mapping[str: Any]]): Keys should be the names of Tensors
            your deployed model expects as inputs. Values should be datatypes
            convertible to Tensors, or (potentially nested) lists of datatypes
            convertible to tensors.
        version: str, version of the model to target.
    Returns:
        Mapping[str: any]: dictionary of prediction results defined by the
            model.
    """
    # Create the ML Engine service object.
    # To authenticate set the environment variable
    # GOOGLE_APPLICATION_CREDENTIALS=<path_to_service_account_file>
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])

    return response['predictions']

Then run the function with a test input: predict_json({'inputs':[[18, 87, 13, 589, 0]]})

Answer

There is now a sample demonstrating the use of Keras on CloudML engine, including prediction. You can find the sample here:

https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/keras

I would suggest comparing your code to that code.

Some additional suggestions that will still be relevant:

CloudML Engine currently only supports using a single signature (the default signature). Looking at your code, I think prediction_signature is more likely to lead to success, but you haven't made that the default signature. I suggest the following:

builder.add_meta_graph_and_variables(
            sess, [tf.saved_model.tag_constants.SERVING],
            signature_def_map={tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: prediction_signature,},
            legacy_init_op=legacy_init_op)

If you are deploying to the service, then you would invoke prediction like so:

predict_json({'images':[[18, 87, 13, 589, 0]]})

If you are testing locally using gcloud ml-engine local predict --json-instances the input data is slightly different (matches that of the batch prediction service). Each newline-separated line looks like this (showing a file with two lines):

{"images": [[18, 87, 13, 589, 0]]}
{"images": [[21, 85, 13, 100, 1]]}
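
Assuming those two lines are saved in a file such as instances.json and the export from step 2 lives under fasttext_cloud/initial, the local test would look roughly like:

gcloud ml-engine local predict --model-dir=fasttext_cloud/initial --json-instances=instances.json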

I don't actually know enough about the shape of model.x to ensure the data being sent is correct for your model.

By way of explanation, it may be insightful to consider the difference between the Classification and Prediction methods in SavedModel. One difference is that tensorflow_serving is based on gRPC, which is strongly typed, so Classification provides a strongly-typed signature that most classifiers can use. Then you can reuse the same client on any classifier.

That's not overly useful when using JSON since JSON isn't strongly typed.

One other difference is that, when using tensorflow_serving, Prediction accepts column-based inputs (a map from feature name to every value for that feature in the whole batch) whereas Classification accepts row based inputs (each input instance/example is a row).
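
As an illustration (the feature values here are made up), the same two-example batch would be encoded roughly like this in each style:

# column-based (tensorflow_serving Prediction): feature name -> all values in the batch
{'images': [[18, 87, 13, 589, 0], [21, 85, 13, 100, 1]]}

# row-based (Classification / CloudML instances): one map per example
[{'images': [18, 87, 13, 589, 0]}, {'images': [21, 85, 13, 100, 1]}]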

CloudML abstracts that away a bit and always requires row-based inputs (a list of instances). Even though we only officially support Prediction, Classification should work as well.
