AWS Sagemaker CustomerError: Encoding Mismatch when monitoring input


Question

I've deployed a Pipeline model in AWS and am now trying to use ModelMonitor to assess incoming data behavior, but it fails when generating the monitoring report.

The pipeline consists of a preprocessing step followed by a regular XGBoost container. The model is invoked with Content-type: application/json.

I set up monitoring as described in the docs, but it fails with the following error:

Exception in thread "main" com.amazonaws.sagemaker.dataanalyzer.exception.CustomerError: Error: Encoding mismatch: Encoding is JSON for endpointInput, but Encoding is CSV for endpointOutput. We currently only support the same type of input and output encoding at the moment.

I found this issue on GitHub, but it didn't help me.

Digging deeper into how XGBoost produces its output, I found that it is CSV encoded, so the error makes sense. However, even deploying the model with the serializers enforced fails (code in the section below).
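One way to confirm the mismatch is to inspect the captured records directly. The following is a minimal sketch, assuming each line of a data capture file is a JSON object with a `captureData` entry holding `endpointInput` and `endpointOutput`, each with an `encoding` field. The helper name `find_encoding_mismatches` and the sample record are made up for illustration.

```python
import json


def find_encoding_mismatches(jsonl_text):
    """Return (line_number, input_encoding, output_encoding) for mismatched records."""
    mismatches = []
    for line_number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        capture = json.loads(line)["captureData"]
        input_encoding = capture["endpointInput"]["encoding"]
        output_encoding = capture["endpointOutput"]["encoding"]
        if input_encoding != output_encoding:
            mismatches.append((line_number, input_encoding, output_encoding))
    return mismatches


# Hypothetical capture record shaped like the situation in the error message.
sample = json.dumps({
    "captureData": {
        "endpointInput": {"mode": "INPUT", "data": "{\"x\": 1}", "encoding": "JSON"},
        "endpointOutput": {"mode": "OUTPUT", "data": "0.42", "encoding": "CSV"},
    }
})
print(find_encoding_mismatches(sample))
```

Running this over a real capture file (downloaded from the data capture S3 location) should show whether every record pairs a JSON input with a CSV output.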

I'm configuring the schedule as recommended by AWS. The only change is the location of my constraints (I had to adjust them manually).

---> Tried so far (all attempts fail with the exact same error)

  1. As described in the issue, but since I'm expecting a JSON payload, I used

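from sagemaker.model_monitor import DataCaptureConfig

# Capture every request and treat application/json payloads as JSON
data_capture_config = DataCaptureConfig(
    enable_capture=True,
    sampling_percentage=100,
    json_content_types=['application/json'],
    destination_s3_uri=MY_BUCKET)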

  2. Tried to enforce the predictor's (de)serializer (I'm not sure this makes sense)

import sagemaker
from sagemaker.predictor import Predictor

predictor = Predictor(
    endpoint_name=MY_ENDPOINT,
    # Hoping that I could force the output to be a JSON
    deserializer=sagemaker.deserializers.JSONDeserializer())

and later

predictor = Predictor(
    endpoint_name=MY_ENDPOINT,
    # Hoping that I could force the input to be a CSV
    serializer=sagemaker.serializers.CSVSerializer())

  3. Setting the (de)serializers during deployment

p_model = pipeline_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge',
    endpoint_name=MY_ENDPOINT,
    serializer=sagemaker.serializers.JSONSerializer(),
    deserializer=sagemaker.deserializers.JSONDeserializer(),
    wait=True)

Recommended answer

I came across a similar issue earlier while invoking the endpoint using the boto3 SageMaker runtime client. Try adding the 'Accept' parameter to the invoke_endpoint call with the value 'application/json'.

For more help, see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_runtime_InvokeEndpoint.html#API_runtime_InvokeEndpoint_RequestSyntax
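The suggestion can be sketched roughly as follows. This is an illustration under assumptions, not a confirmed fix: the helper names `build_invoke_kwargs` and `invoke` and the payload shape are hypothetical, and whether the container actually returns JSON when asked depends on the container serving the last step of the pipeline.

```python
import json


def build_invoke_kwargs(endpoint_name, payload):
    """Build keyword arguments for the sagemaker-runtime invoke_endpoint call.

    ContentType describes the request body; Accept asks the endpoint for a
    JSON response so that input and output encodings can match.
    """
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Accept": "application/json",
        "Body": json.dumps(payload).encode("utf-8"),
    }


def invoke(endpoint_name, payload):
    # boto3 is imported here so the helper above stays usable without it.
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_kwargs(endpoint_name, payload))
    # The response body is a stream; read and decode it to text.
    return response["Body"].read().decode("utf-8")
```

Calling `invoke(MY_ENDPOINT, {...})` with your own payload and then inspecting a newly captured record should show whether endpointOutput is now recorded as JSON.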
