Sagemaker processing job with PySpark and Step Functions


Question

This is my problem: I have to run a Sagemaker processing job using custom code written in PySpark. I've used the Sagemaker SDK by running these commands:

import sagemaker

# role_arn, bucket_name and file_path are defined elsewhere in the script
spark_processor = sagemaker.spark.processing.PySparkProcessor(
    base_job_name="spark-preprocessor",
    framework_version="2.4",
    role=role_arn,
    instance_count=2,
    instance_type="ml.m5.xlarge",
    max_runtime_in_seconds=1800,
)

spark_processor.run(
    submit_app="processing.py",
    arguments=["s3_input_bucket", bucket_name,
               "s3_input_file_path", file_path],
)
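For context, processing.py itself is not shown in the question. Below is a minimal sketch of what such a PySpark entry point could look like, assuming the arguments are read positionally from sys.argv and the input is CSV (both are assumptions; only the argument names above come from the question):

# processing.py - illustrative sketch only
import sys
from pyspark.sql import SparkSession

def main():
    # Arguments arrive in the same order they were passed to spark_processor.run(),
    # i.e. ['s3_input_bucket', <bucket>, 's3_input_file_path', <path>]
    args = dict(zip(sys.argv[1::2], sys.argv[2::2]))
    bucket = args["s3_input_bucket"]
    file_path = args["s3_input_file_path"]

    spark = SparkSession.builder.appName("spark-preprocessor").getOrCreate()
    # Assumed CSV input; the real preprocessing logic would go here
    df = spark.read.csv(f"s3://{bucket}/{file_path}", header=True)
    df.write.mode("overwrite").parquet(f"s3://{bucket}/output/")
    spark.stop()

if __name__ == "__main__":
    main()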

Now I have to automate the workflow by using Step Functions. For this purpose, I've written a Lambda function to do that, but I receive the following error:

{
  "errorMessage": "Unable to import module 'lambda_function': No module named 'sagemaker'",
  "errorType": "Runtime.ImportModuleError"
}

This is my lambda function:

import os
import sagemaker

def lambda_handler(event, context):
    # role_arn was undefined in the original snippet; reading it from an
    # environment variable (the variable name here is illustrative) is one
    # way to supply the processing job's execution role
    role_arn = os.environ["SAGEMAKER_ROLE_ARN"]

    spark_processor = sagemaker.spark.processing.PySparkProcessor(
        base_job_name="spark-preprocessor",
        framework_version="2.4",
        role=role_arn,
        instance_count=2,
        instance_type="ml.m5.xlarge",
        max_runtime_in_seconds=1800,
    )

    spark_processor.run(
        submit_app="processing.py",
        arguments=["s3_input_bucket", event["bucket_name"],
                   "s3_input_file_path", event["file_path"]],
    )

My question is: how can I create a step in my state machine that runs PySpark code using Sagemaker processing?

Thanks

Answer

The sagemaker SDK is not installed by default in the Lambda container environment: you should include it in the Lambda zip that you upload to S3.

There are various ways to do this; one of the easiest is to deploy your Lambda with the Serverless Application Model (SAM) CLI. In that case it may be enough to put sagemaker in a requirements.txt in the folder that contains your Lambda code, and SAM will make sure the dependency ends up in the zip, as sketched below.
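For illustration only (the folder and file names are made up, and template.yaml is assumed to point its CodeUri at the folder holding the handler), the project could be laid out like this; sam build installs whatever requirements.txt lists and sam deploy uploads the resulting package:

my-app/
├── template.yaml          # CodeUri points at lambda_src/
└── lambda_src/
    ├── lambda_function.py
    └── requirements.txt   # contains the line: sagemaker

# run from my-app/
sam build
sam deploy --guided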

Alternatively, you can create the zip manually with pip install sagemaker -t lambda_folder, but you should run that command on an Amazon Linux OS, for example on an EC2 instance with the appropriate image or inside a Docker container, so the installed packages match the Lambda runtime (see the sketch below). Search for "python dependencies in aws lambda" for more info.
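For example (the container image and Python version here are illustrative, not prescribed by the answer), the dependency could be installed inside an AWS-provided Amazon Linux build image and then zipped:

# run from the folder that contains lambda_function.py
docker run --rm -v "$PWD":/var/task public.ecr.aws/sam/build-python3.9 \
    /bin/sh -c "pip install sagemaker -t /var/task"

# then zip the folder contents and upload the archive to S3 / Lambda
zip -r lambda_package.zip .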
