An error occurred (InternalFailure) when calling the InvokeEndpoint operation: An exception occurred while sending request to model

Problem Description

I am trying to host an XGBoost model that I have trained locally on an AWS Sagemaker endpoint, but I am receiving the following error when invoking the endpoint:

An error occurred (InternalFailure) when calling the InvokeEndpoint operation (reached max retries: 4): An exception occurred while sending request to model. Please contact customer support regarding request.

The model works as expected locally, and I save it using the following before uploading to S3:

model.fit(args)
model.save_model(model_save_loc)
model_tar_loc = model_save_loc + '.tar.gz'
# Jupyter shell magic: tar stores the path as given, so the model file ends up
# under that same path inside the archive
!tar czvf $model_tar_loc $model_save_loc
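
The upload step itself is not shown in the question; a minimal sketch of pushing the tarball to S3 under the prefix the MultiDataModel reads from might look like this (the bucket and prefix names are hypothetical, not from the question):

import sagemaker

# Hypothetical bucket/prefix; MultiDataModel later serves tarballs from this prefix.
sagemaker_session = sagemaker.Session()

model_data_prefix = "s3://my-bucket/mme-models/"   # assumed value
sagemaker_session.upload_data(
    path=model_tar_loc,        # local model_1.tar.gz produced above
    bucket="my-bucket",        # assumed bucket
    key_prefix="mme-models",   # must match model_data_prefix
)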

I am hosting the model through the MultiDataModel class:

from sagemaker.image_uris import retrieve
from sagemaker.multidatamodel import MultiDataModel

container = retrieve("xgboost", region, "1.3-1")
mme = MultiDataModel(
    name=model_name,
    role=role,
    model_data_prefix=model_data_prefix,
    image_uri=container,
    sagemaker_session=sagemaker_session,
)

predictor = mme.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name=model_name,
)
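
For reference, artifacts can also be registered against the prefix explicitly; a sketch under the assumption that add_model is used with a local tarball (the target name here is hypothetical):

# Copies the artifact under model_data_prefix; the relative path becomes
# the TargetModel name used when invoking the endpoint.
mme.add_model(
    model_data_source=model_tar_loc,    # local path or s3:// URI
    model_data_path="model_1.tar.gz",   # hypothetical target name
)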

The MultiDataModel deploy works as expected with no errors, and if I do:

list(mme.list_models())

It returns the expected list of models:

model_1.tar.gz
model_2.tar.gz
etc..

I invoke the model using the following:

import boto3

runtime_client = boto3.client("runtime.sagemaker")

# TargetModel is the tarball name, relative to model_data_prefix
response = runtime_client.invoke_endpoint(
    EndpointName="model_name", ContentType="text/csv", Body=payload, TargetModel='model_1.tar.gz'
)
result = response["Body"].read().decode("ascii")

I have experimented with various ways of creating the payload, but none of them change the error message.
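
For the built-in XGBoost container with ContentType="text/csv", the body is typically a headerless, comma-separated row (or rows) of features with no label column; a minimal sketch of one way to build such a payload (the feature values are made up):

import io
import numpy as np

# Hypothetical feature vector; the CSV body has no header and no label column.
features = np.array([[0.5, 1.2, 3.4, 0.0]])
buf = io.StringIO()
np.savetxt(buf, features, delimiter=",", fmt="%g")
payload = buf.getvalue()   # e.g. "0.5,1.2,3.4,0\n"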

The local XGBoost model was trained using XGBoost version 1.3.1 (the same version as the container image).
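
A quick sanity check (not shown in the question) is to compare the local library version against the container tag before debugging further:

import xgboost

# Local training version should match the "1.3-1" container retrieved above.
print(xgboost.__version__)   # expected: 1.3.1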

CloudWatch provides only the following:

2021-06-26 10:48:36,865 [INFO ] pool-1-thread-1 ACCESS_LOG - /10.32.0.2:37106 "GET /ping HTTP/1.1" 200 0

There is no way to contact customer support on the basic support plan, as the error message advises.

Recommended Answer

I solved this issue by hosting each model individually on its own endpoint instead of using MultiDataModel, which surfaced more detailed error logs in CloudWatch.
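
A sketch of that single-model debugging approach, reusing the container, role, and session from the question (the artifact URI is hypothetical):

from sagemaker.model import Model

# Deploy one artifact on its own endpoint; failures then surface as
# per-model errors in that endpoint's CloudWatch log stream.
single_model = Model(
    image_uri=container,
    model_data="s3://my-bucket/mme-models/model_1.tar.gz",  # hypothetical URI
    role=role,
    sagemaker_session=sagemaker_session,
)
single_model.deploy(initial_instance_count=1, instance_type=instance_type)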

For me, the error was that my models were saved as:

model-1.tar.gz -> models/model-1

By default, the XGBoost container looks for the model file at the root of the extracted model-1.tar.gz archive, whereas my model ended up inside a subfolder. Moving it up a level solved the issue.
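
A minimal sketch of repackaging so the serialized model sits at the root of the archive instead of in a subfolder (paths follow the example above):

import tarfile

# Store models/model-1 as a top-level entry so the container finds it
# directly under /opt/ml/model after extraction.
with tarfile.open("model-1.tar.gz", "w:gz") as tar:
    tar.add("models/model-1", arcname="model-1")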
