Getting batch predictions for TFRecords via CloudML

Problem description

I followed this great tutorial and successfully trained a model (on CloudML). My code also makes predictions offline, but now I am trying to use Cloud ML to make predictions and am running into some problems.

To deploy my model I followed this tutorial. Now I have code that generates TFRecords via apache_beam.io.WriteToTFRecord, and I want to get batch predictions for those TFRecords.
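For context, the records are written roughly like this (a minimal sketch; the feature names, values, and output path are illustrative, not the actual pipeline):

import apache_beam as beam
import tensorflow as tf

def to_serialized_example(features):
    # Build one serialized tf.train.Example per input dict (assumed float features).
    return tf.train.Example(features=tf.train.Features(feature={
        name: tf.train.Feature(float_list=tf.train.FloatList(value=[value]))
        for name, value in features.items()
    })).SerializeToString()

with beam.Pipeline() as p:
    (p
     | 'Create' >> beam.Create([{'age': 42.0}, {'age': 17.0}])  # placeholder data
     | 'ToExample' >> beam.Map(to_serialized_example)
     # A .gz suffix makes WriteToTFRecord emit gzip-compressed files,
     # matching --data-format TF_RECORD_GZIP below.
     | 'Write' >> beam.io.WriteToTFRecord('gs://my-bucket/examples',
                                          file_name_suffix='.tfrecord.gz'))

To make the predictions I am following this article; my command looks like this: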

gcloud ml-engine jobs submit prediction $JOB_ID --model $MODEL --input-paths gs://"$FILE_INPUT".gz --output-path gs://"$OUTPUT"/predictions --region us-west1 --data-format TF_RECORD_GZIP

But all I get is this error: 'Exception during running the graph: Expected serialized to be a scalar, got shape: [64]'

It seems like it expects the data in a different format. I found the format specs for JSON here, but couldn't find how to do it with TFRecords.

Update: this is the output of saved_model_cli show --all --dir

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['prediction']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
        dtype: DT_STRING
        shape: unknown_rank
        name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, 1)
        name: probability:0
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
        dtype: DT_STRING
        shape: unknown_rank
        name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
        dtype: DT_FLOAT
        shape: (1, 1)
        name: probability:0
  Method name is: tensorflow/serving/predict

Recommended answer

When you export your model, you need to make sure that it is "batchable", i.e., the outer dimension of the input placeholder has shape=[None], e.g.,

input = tf.placeholder(dtype=tf.string, shape=[None])
...

That may require reworking the graph a bit. For instance, I see that the shape of your output is hard-coded to [1, 1]. The outermost dimension should be None; this may happen automatically once you fix the placeholder, or it may require other changes.
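A batchable export might look roughly like this (a sketch assuming TF 1.x and an illustrative single-feature schema, not the asker's actual training code). Note that tf.parse_example, unlike tf.parse_single_example, accepts a batch of serialized protos, which is what the "Expected serialized to be a scalar" error points at:

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # Outer dimension None: a batch of serialized tf.Example protos.
    serialized = tf.placeholder(dtype=tf.string, shape=[None], name='input')

    # tf.parse_example handles a batch; tf.parse_single_example would
    # fail with "Expected serialized to be a scalar" for shape [64].
    feature_spec = {'age': tf.FixedLenFeature([1], tf.float32)}  # assumed schema
    features = tf.parse_example(serialized, feature_spec)

    # Stand-in for the real model; the output shape is [None, 1].
    logits = tf.layers.dense(features['age'], 1)
    probability = tf.nn.sigmoid(logits, name='probability')

    with tf.Session(graph=graph) as sess:
        sess.run(tf.global_variables_initializer())
        tf.saved_model.simple_save(
            sess, 'export_dir',
            inputs={'example_proto': serialized},
            outputs={'probability': probability})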

Given that the name of the output is probability, I would also expect the innermost dimension to be >1, i.e. the number of classes being predicted, so something like [None, NUM_CLASSES].
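A quick way to confirm the export is batchable before submitting a job (a sketch, reusing the assumed schema and names from the export above): load the SavedModel locally and feed a batch of two serialized Examples; the output should come back with a leading dimension of 2.

import tensorflow as tf

def make_example(age):
    # Hypothetical single-feature Example matching the assumed schema.
    return tf.train.Example(features=tf.train.Features(feature={
        'age': tf.train.Feature(float_list=tf.train.FloatList(value=[age]))
    })).SerializeToString()

with tf.Session(graph=tf.Graph()) as sess:
    tf.saved_model.loader.load(sess, ['serve'], 'export_dir')
    probs = sess.run('probability:0',
                     feed_dict={'input:0': [make_example(42.0),
                                            make_example(17.0)]})
    print(probs.shape)  # expect (2, 1), or (2, NUM_CLASSES) for multi-class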
