More efficient way to send a request than JSON to deployed tensorflow model in Sagemaker?


Question

I have trained a tf.estimator-based TensorFlow model in SageMaker and deployed it, and it works fine.

But I can only send requests to it in JSON format. I need to send some big input tensors, which is very inefficient and also quickly hits InvokeEndpoint's 5 MB request limit.
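
As a rough, hypothetical illustration of the overhead (the tensor shape mirrors the request below; exact numbers vary with the values), JSON encodes every float as a decimal string plus separators, while the raw float32 buffer costs exactly 4 bytes per element:

import json
import numpy as np

# Tensor with the same shape as the request in this question.
x = np.random.rand(4, 36, 64).astype(np.float32)

# JSON payload: each float becomes a long decimal string plus separators.
json_size = len(json.dumps({'instances': x.tolist()}))

# Raw binary payload: exactly 4 bytes per float32 element.
raw_size = len(x.tobytes())

print(json_size, raw_size)  # the JSON payload is typically several times larger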

Is it possible to use a more efficient format with the TensorFlow Serving based endpoint?

I tried sending a protobuf-based request:

import numpy as np
import tensorflow as tf

from sagemaker.tensorflow.serving import Model
from sagemaker.tensorflow.tensorflow_serving.apis import predict_pb2
from sagemaker.predictor import RealTimePredictor
from sagemaker.tensorflow.predictor import tf_serializer, tf_deserializer

role = 'xxx'

model = Model('s3://xxx/tmp/artifacts/sagemaker-tensorflow-scriptmode-xxx/output/model.tar.gz', role)
predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', endpoint_name='test-endpoint')

# this predictor has a JSON serializer, so make a new one that uses the
# protobuf serializer/deserializer instead
predictor = RealTimePredictor('test-endpoint', serializer=tf_serializer, deserializer=tf_deserializer)

req = predict_pb2.PredictRequest()
req.inputs['instances'].CopyFrom(tf.make_tensor_proto(np.zeros((4, 36, 64)), shape=(4, 36, 64)))

predictor.predict(req)

This results in the following error:

---------------------------------------------------------------------------
ModelError                                Traceback (most recent call last)
<ipython-input-40-5ba7f281bd0d> in <module>()
----> 1 predictor.predict(req)

~/anaconda3/envs/default/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args)
     76 
     77         request_args = self._create_request_args(data, initial_args)
---> 78         response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
     79         return self._handle_response(response)
     80 

~/anaconda3/envs/default/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358 
    359         _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/default/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    659             error_code = parsed_response.get("Error", {}).get("Code")
    660             error_class = self.exceptions.from_code(error_code)
--> 661             raise error_class(parsed_response, operation_name)
    662         else:
    663             return parsed_response

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "{"error": "Unsupported Media Type: application/octet-stream"}".

Is JSON the only available query format for deployed TensorFlow models?

Answer

Have you looked at batch transform? If you don't actually need an HTTPS endpoint, this might solve your problem:

https://docs.aws.amazon.com/sagemaker/latest/dg/ex1-batch-transform.html
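
For reference, a minimal sketch of such a batch transform job with the SageMaker Python SDK; the model name, S3 paths, and JSON Lines input file are hypothetical placeholders, and it assumes the serving container accepts JSON Lines input:

from sagemaker.transformer import Transformer

# Hypothetical: reuse the model created by model.deploy() above.
transformer = Transformer(
    model_name='test-model',
    instance_count=1,
    instance_type='ml.c5.xlarge',
    output_path='s3://xxx/tmp/transform-output',
)

# With split_type='Line', each line of the input file is one record and the
# service packs records into requests up to MaxPayloadInMB, so the whole
# dataset is not subject to a single per-request size limit.
transformer.transform(
    data='s3://xxx/tmp/transform-input/instances.jsonl',
    content_type='application/jsonlines',
    split_type='Line',
)
transformer.wait()

# Predictions are written as files under output_path.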

