How can I preprocess input data before making predictions in SageMaker?

Question

I am calling a SageMaker endpoint using the Java SageMaker SDK. The data I send needs a little cleaning before the model can use it for prediction. How can I do that in SageMaker?

I have a preprocessing function in my Jupyter notebook instance that cleans the training data before it is passed to the model for training. Can I use that function when calling the endpoint, or is it already being applied? I can show my code if anyone wants to see it.

EDIT 1: Basically, the preprocessing step does label encoding. Here is my preprocessing function:

from sklearn import preprocessing

def preprocess_data(data):
    print("entering preprocess fn")
    # Convert document id and type to integer labels
    le1 = preprocessing.LabelEncoder()
    le1.fit(data["documentId"])
    data["documentId"] = le1.transform(data["documentId"])
    le2 = preprocessing.LabelEncoder()
    le2.fit(data["documentType"])
    data["documentType"] = le2.transform(data["documentType"])
    print("exiting preprocess fn")
    # Return the fitted encoders so the same mapping can be reused later
    return data, le1, le2

Here, 'data' is a pandas DataFrame.

Now I want to use the fitted encoders le1 and le2 when calling the endpoint. I want this preprocessing to happen in SageMaker itself, not in my Java code.

Recommended answer

SageMaker has a feature called inference pipelines. It lets you build a linear sequence of containers (two to five when this answer was written; the current documentation allows up to fifteen) that pre-process and post-process requests. The whole pipeline is then deployed on a single endpoint.

https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html
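Whichever container does the preprocessing, it has to apply the same label mapping that was fitted during training, not fit a new one on the incoming request. Below is a minimal, dependency-free sketch of that idea. It is not the SageMaker API: the function names fit_label_mapping and preprocess_request, the sample values, and the CSV request format are all assumptions made for illustration. The mapping assigns integers in sorted label order, which is intended to mirror what LabelEncoder does for string labels. In the setup from the question, the fitted le1 and le2 objects could instead be serialized after training and loaded inside the preprocessing container.

```python
import csv
import io
import json


def fit_label_mapping(values):
    """Map each distinct label to an integer, assigned in sorted order."""
    return {label: index for index, label in enumerate(sorted(set(values)))}


def preprocess_request(csv_body, mappings):
    """Encode the mapped columns of a CSV request body (header row expected)."""
    reader = csv.DictReader(io.StringIO(csv_body))
    rows = []
    for row in reader:
        for column, mapping in mappings.items():
            label = row[column]
            if label not in mapping:
                # A label never seen during training has no integer code.
                raise ValueError(f"unseen label {label!r} in column {column!r}")
            row[column] = mapping[label]
        rows.append(row)
    return rows


# Training time: fit the mappings once and persist them.
# (Hypothetical sample data; a JSON string stands in for a saved file.)
training = {
    "documentId": ["doc-b", "doc-a", "doc-b"],
    "documentType": ["invoice", "receipt", "invoice"],
}
saved = json.dumps({col: fit_label_mapping(vals) for col, vals in training.items()})

# Inference time: reload the saved mappings and apply them to a request.
mappings = json.loads(saved)
encoded = preprocess_request("documentId,documentType\ndoc-a,receipt\n", mappings)
print(encoded)
```

One design point worth noting: labels that did not appear in the training data cannot be encoded, so the sketch raises an error for them. A real preprocessing step would need an explicit policy for that case.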
