How to parse a Step Functions execution ID into a SageMaker batch transform job name?


Problem description

I have created a Step Functions state machine. Its definition (step-function.json) is used in Terraform and follows the syntax on this page: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTransformJob.html

The first time I execute this state machine, it creates a SageMaker batch transform job named example-jobname. However, I need to execute the state machine every day, and every later run fails with "error": "SageMaker.ResourceInUseException", "cause": "Job name must be unique within an AWS account and region, and a job with this name already exists."

The cause is that the job name is hard-coded as example-jobname. Since job names must be unique, the task fails on every execution after the first. I would like to know how to append a string (something like the execution ID) to the end of the job name. Here is what I have tried:

  1. I added "executionId.$": "States.Format('somestring {}', $$.Execution.Id)" to the Parameters section of the JSON file, but when I executed the task I got the error "error": "States.Runtime", "cause": "An error occurred while executing the state 'SageMaker CreateTransformJob' (entered at the event id #2). The Parameters '{\"BatchStrategy\":\"SingleRecord\",..............\"executionId\":\"somestring arn:aws:states:us-east-1:xxxxx:execution:xxxxx-state-machine:xxxxxxxx72950\"}' could not be used to start the Task: [The field \"executionId\" is not supported by Step Functions]"}

  2. I changed the job name in the JSON file to "TransformJobName": "example-jobname-States.Format('somestring {}', $$.Execution.Id)". When I executed the state machine, it gave me the error "error": "SageMaker.AmazonSageMakerException", "cause": "2 validation errors detected: Value 'example-jobname-States.Format('somestring {}', $$.Execution.Id)' at 'transformJobName' failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}; Value 'example-jobname-States.Format('somestring {}', $$.Execution.Id)' at 'transformJobName' failed to satisfy constraint: Member must have length less than or equal to 63

I have really run out of ideas. Can someone help, please? Many thanks.

Recommended answer

As per the documentation, a parameter whose value comes from the context object or the state input should be passed in the following format, with the key ending in .$:

        "Parameters": {
            "ModelName.$": "$$.Execution.Name",  
            ....
        },

If you look closely, this is what is missing from your definition: without the .$ suffix on the key, the value is passed as a literal string instead of being evaluated. So your step function definition should contain something like the following.

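Either

      "TransformJobName.$": "$$.Execution.Id",

or

      "TransformJobName.$": "States.Format('mytransformjob{}', $$.Execution.Id)"

Note that $$.Execution.Id is the full execution ARN. It contains colons, so it does not satisfy the job name pattern quoted in the question and has to be parsed first, as the Lambda in the full definition does. $$.Execution.Name, the last segment of that ARN, is an alternative.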

Full state machine definition:

    {
        "Comment": "Defines the statemachine.",
        "StartAt": "Generate Random String",
        "States": {
            "Generate Random String": {
                "Type": "Task",
                "Resource": "arn:aws:lambda:eu-central-1:1234567890:function:randomstring",
                "ResultPath": "$.executionid",
                "Parameters": {
                    "executionId.$": "$$.Execution.Id"
                },
                "Next": "SageMaker CreateTransformJob"
            },
            "SageMaker CreateTransformJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
                "Parameters": {
                    "BatchStrategy": "SingleRecord",
                    "DataProcessing": {
                        "InputFilter": "$",
                        "JoinSource": "Input",
                        "OutputFilter": "xxx"
                    },
                    "Environment": {
                        "SAGEMAKER_MODEL_SERVER_TIMEOUT": "300"
                    },
                    "MaxConcurrentTransforms": 100,
                    "MaxPayloadInMB": 1,
                    "ModelName": "${model_name}",
                    "TransformInput": {
                        "DataSource": {
                            "S3DataSource": {
                                "S3DataType": "S3Prefix",
                                "S3Uri": "${s3_input_path}"
                            }
                        },
                        "ContentType": "application/jsonlines",
                        "CompressionType": "Gzip",
                        "SplitType": "Line"
                    },
                    "TransformJobName.$": "$.executionid",
                    "TransformOutput": {
                        "S3OutputPath": "${s3_output_path}",
                        "Accept": "application/jsonlines",
                        "AssembleWith": "Line"
                    },
                    "TransformResources": {
                        "InstanceType": "xxx",
                        "InstanceCount": 1
                    }
                },
                "End": true
            }
        }
    }

In the above definition, the Lambda could be a function that parses the execution ID ARN, which I am passing via the Parameters section:

    def lambda_handler(event, context):
        # The execution ARN ends with ":<execution name>"; keep only that part.
        return event.get('executionId').split(':')[-1]
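Because the parsed value still has to satisfy the pattern and the 63-character limit quoted in the validation error from the question, the Lambda could also add a prefix, sanitize, and truncate. The following is only a minimal sketch under stated assumptions: the helper names `is_valid_job_name` and `build_job_name`, the default prefix, and the choice to replace disallowed characters with hyphens are all illustrative and not part of the original answer. It also assumes the prefix is well under 63 characters.

```python
import re

# Constraint quoted in the SageMaker validation error from the question.
JOB_NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
MAX_JOB_NAME_LENGTH = 63


def is_valid_job_name(name):
    """Return True if name satisfies the quoted pattern and length limit."""
    return (
        len(name) <= MAX_JOB_NAME_LENGTH
        and JOB_NAME_PATTERN.fullmatch(name) is not None
    )


def build_job_name(execution_arn, prefix="example-jobname"):
    """Hypothetical helper: derive a job name from an execution ARN."""
    # The execution name is the last colon-separated segment of the ARN.
    execution_name = execution_arn.split(":")[-1]
    # Replace anything other than letters, digits, and hyphens with a hyphen.
    cleaned = re.sub(r"[^a-zA-Z0-9-]", "-", execution_name)
    # Keep the tail of the execution name so prefix + "-" + suffix fits.
    room = MAX_JOB_NAME_LENGTH - len(prefix) - 1
    suffix = cleaned[-room:].strip("-")
    return f"{prefix}-{suffix}" if suffix else prefix


def lambda_handler(event, context):
    # "executionId" is the key set in the state's Parameters section.
    return build_job_name(event["executionId"])
```

Checking `is_valid_job_name` against a raw execution ARN returns False because of the colons, which is why the ARN cannot be used as the job name directly.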

Or, if you don't want to pass the execution ID, it can simply return a random string, for example:

    import random
    import string

    def lambda_handler(event, context):
        # Draw 12 random characters from A-Z and 0-9.
        return ''.join(random.choices(string.ascii_uppercase + string.digits, k=12))

You can generate any kind of random string, or anything else, in the Lambda and pass the result to the transform job name.
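Since the state machine in the question runs daily, one possible variant is a name that carries the run date plus a short random suffix. This is a sketch only: the function name `dated_job_name`, the default prefix, and the name layout are assumptions made for illustration, not something the original answer prescribes.

```python
import uuid
from datetime import datetime, timezone


def dated_job_name(prefix="example-jobname", now=None):
    """Hypothetical helper: prefix + UTC date + 8 random hex characters."""
    now = now or datetime.now(timezone.utc)
    # uuid4().hex contains only 0-9 and a-f, so the result stays within
    # the letters, digits, and hyphens allowed in a job name.
    return f"{prefix}-{now:%Y%m%d}-{uuid.uuid4().hex[:8]}"


def lambda_handler(event, context):
    return dated_job_name()
```

With the default prefix the result is 33 characters long, well under the 63-character limit, and the date makes it easier to match a job to the day it ran.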
