gcloud ml-engine returns error on large files

Question

I have a trained model that takes a somewhat large input. I generally pass this in as a numpy array of shape (1, 473, 473, 3). When I write that out as JSON I end up with about a 9.2 MB file. Even if I convert it to a base64 encoding for the JSON file, the input is still rather large.
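For a rough sense of scale, here is a minimal sketch (using a random stand-in array of the same shape; the exact request envelope the service expects is omitted) of how large such a payload gets as plain JSON versus base64:

import base64
import json

import numpy as np

# Stand-in for the real input; the shape comes from the question.
arr = np.random.rand(1, 473, 473, 3).astype(np.float32)

# Plain JSON: every float becomes a decimal string inside nested lists.
plain = json.dumps({'input': arr.tolist()})

# Base64: encode the raw bytes instead; more compact, but still megabytes.
b64 = json.dumps({'input': {'b64': base64.b64encode(arr.tobytes()).decode()}})

print(len(plain), len(b64))  # both far exceed the 1572864-byte limit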

ml-engine predict rejects my request when sending the JSON file with the following error:

(gcloud.ml-engine.predict) HTTP request failed. Response: {
  "error": {
    "code": 400,
    "message": "Request payload size exceeds the limit: 1572864 bytes.",
    "status": "INVALID_ARGUMENT"
  }
}

It looks like I can't send anything over about 1.5 MB in size to ML Engine. Is this really a hard limit? How do others get around this when doing online predictions on large inputs? Do I have to spin up a Compute Engine instance, or will I run into the same issue there?

I am starting from a Keras model and trying to export it for TensorFlow Serving. I load my Keras model into a variable named 'model' and have a defined directory "export_path". I build the TensorFlow Serving model like this:

from keras import backend as K
from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants, tag_constants
from tensorflow.python.saved_model.signature_def_utils_impl import predict_signature_def

sess = K.get_session()

signature = predict_signature_def(inputs={'input': model.input},
                                  outputs={'output': model.output})
builder = saved_model_builder.SavedModelBuilder(export_path)
builder.add_meta_graph_and_variables(
    sess=sess,
    tags=[tag_constants.SERVING],
    signature_def_map={
        signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
    }
)
builder.save()

How would the input look for this signature_def? Would the JSON just be something like {'input': 'https://storage.googleapis.com/projectid/bucket/filename'} where the file is the (1,473,473,3) numpy array?
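For what it's worth, with the signature exported above (whose only input is the image tensor itself), my understanding is that each instance has to carry the pixel values directly rather than a URL. A hypothetical instance file for gcloud's --json-instances format might be built like this (one JSON object per line, with the batch dimension left to the service):

import json

import numpy as np

arr = np.random.rand(1, 473, 473, 3).astype(np.float32)

# Hypothetical instance file: the key matches the 'input' name in the signature.
with open('instance.json', 'w') as f:
    f.write(json.dumps({'input': arr[0].tolist()}) + '\n')

A file like that is exactly the multi-megabyte payload that gets rejected above, which is what motivates the GCS-based approach in the answer below.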

Edit 2: Looking at the code posted by Lak Lakshmanan, I have tried a few different variations, without success, to read an image URL and parse the file that way. I tried the following without success:

# Accept a GCS/image URL string and build the image tensor from it.
inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
filename = tf.squeeze(inputs['imageurl'])
image = read_and_preprocess(filename)  # custom preprocessing function
image = tf.placeholder_with_default(image, shape=[None, HEIGHT, WIDTH, NUM_CHANNELS])
features = {'image': image}
inputs.update(features)
signature = predict_signature_def(inputs=inputs,
                                  outputs={'output': model.output})


with K.get_session() as session:
    # Convert the Keras HDF5 model into a TensorFlow SavedModel.
    builder = saved_model_builder.SavedModelBuilder(export_path)
    builder.add_meta_graph_and_variables(
        sess=session,
        tags=[tag_constants.SERVING],
        signature_def_map={
            signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
        }
    )
    builder.save()

I believe the problem is in mapping from the imageurl placeholder to the features that feed the model. Any thoughts on what I am doing wrong?

Answer

What I typically do is have the JSON refer to a file in Google Cloud Storage. See, for example, the serving input function here:

Users would first upload their file to GCS and then invoke prediction. This approach also has other advantages, since the storage utilities allow for parallel and multithreaded uploads.
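The linked sample isn't reproduced here, but the pattern is roughly: export a serving signature whose input is the GCS path of the image, and do the file reading and preprocessing inside the graph. Below is a sketch under TF 1.x assumptions; read_and_preprocess, HEIGHT, WIDTH and NUM_CHANNELS mirror the names the question already uses, and the JPEG decoding and Estimator-style export are illustrative rather than the exact code from the sample:

import tensorflow as tf

HEIGHT, WIDTH, NUM_CHANNELS = 473, 473, 3  # illustrative values

def read_and_preprocess(filename):
    # tf.read_file understands gs:// paths, so the image bytes never
    # travel through the JSON request body.
    image_bytes = tf.read_file(filename)
    image = tf.image.decode_jpeg(image_bytes, channels=NUM_CHANNELS)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.image.resize_images(image, [HEIGHT, WIDTH])
    return image

def serving_input_fn():
    # The request carries only a batch of GCS paths.
    inputs = {'imageurl': tf.placeholder(tf.string, shape=[None])}
    images = tf.map_fn(read_and_preprocess, inputs['imageurl'], dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver({'image': images}, inputs)

A prediction request then only contains the path, e.g. {'imageurl': 'gs://bucket/path/to/image.jpg'}, which stays well under the payload limit.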
