Azure Batch Job Scheduling: Task doesn't run recurrently


Question

My objective is to schedule an Azure Batch task to run every 5 minutes from the moment it is added. I use the Python SDK to create and manage my Azure resources. I tried creating a JobSchedule, and it automatically created a new job under the specified pool.

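    import datetime

    import azure.batch as batch  # assumed alias; the question only shows `batch.models`

    # batch_client and pool_id are created elsewhere.
    job_spec = batch.models.JobSpecification(
        pool_info=batch.models.PoolInformation(pool_id=pool_id)
    )
    schedule = batch.models.Schedule(
        start_window=datetime.timedelta(hours=1),
        recurrence_interval=datetime.timedelta(minutes=5)
    )
    # Keyword arguments work whether or not the SDK version accepts positional ones.
    setup = batch.models.JobScheduleAddParameter(
        id='python_test_schedule',
        schedule=schedule,
        job_specification=job_spec
    )
    batch_client.job_schedule.add(setup)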

I then added a task to this new job, but the task seems to run only once, as soon as it is added (like a normal task). Is there something more I need to do to make the task run recurrently? There doesn't seem to be much documentation or many examples of JobSchedule either.

Thanks! Any help is appreciated.

Recommended answer

You are correct that a JobSchedule creates a new job at the specified time interval, so a task added by hand to one of those jobs belongs only to that job. In addition, a task cannot be "re-run" every 5 minutes once it has completed. You could do either of the following:


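  • Have a single task that runs a loop, performing the same action every 5 minutes.

  • Use a Job Manager task to add a new task (doing the same thing) every 5 minutes.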
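
The first option can be sketched as one long-running task whose script repeats the work itself. This is only an illustration and not part of the original answer: `run_every`, `do_work`, and the `max_runs` cap are hypothetical names, and the `sleep` and `clock` parameters exist only so the loop can be exercised without actually waiting.

```python
import time


def run_every(action, interval_seconds, max_runs,
              sleep=time.sleep, clock=time.monotonic):
    """Call `action` up to `max_runs` times, aiming for one call per interval.

    Only the time left in the current interval is slept, so a slow
    `action` does not push every later run further back.
    """
    results = []
    for run_number in range(max_runs):
        started = clock()
        results.append(action(run_number))
        if run_number < max_runs - 1:
            elapsed = clock() - started
            sleep(max(0.0, interval_seconds - elapsed))
    return results


def do_work(run_number):
    # Hypothetical placeholder for whatever the task should do each cycle.
    return "run %d done" % run_number
```

Inside the task's script this would be called as, for example, `run_every(do_work, interval_seconds=300, max_runs=12)`. The drawback is that Batch sees a single task, so per-run status, exit codes, and retries are not visible to the service.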

I would recommend the second option, since it gives you more flexibility to monitor the progress of the tasks and the job and to act accordingly. An example client that creates the job might look like this:

    from azure.batch import models

    # batch_client and AZ_BATCH_KEY (the account key) are defined elsewhere on the client.
    job_manager = models.JobManagerTask(
        id='job_manager',
        command_line="/bin/bash -c 'python ./job_manager.py'",
        # Authentication option 1: pass the account key through an environment variable.
        environment_settings=[
            models.EnvironmentSetting(name='AZ_BATCH_KEY', value=AZ_BATCH_KEY)],
        resource_files=[
            # Older azure-batch releases name the URL argument blob_source instead of http_url.
            models.ResourceFile(http_url="https://url/to/job_manager.py", file_path="job_manager.py")],
        # Authentication option 2: a token that can only access the current job.
        authentication_token_settings=models.AuthenticationTokenSettings(
            access=[models.AccessScope.job]),
        # Mark the job as complete once the Job Manager has finished.
        kill_job_on_completion=True,
        # Whether the Job Manager needs a dedicated VM; this depends on the
        # nature of the other tasks running on the VM.
        run_exclusive=False)

    new_job = models.JobAddParameter(
        id='my_job',
        job_manager_task=job_manager,
        pool_info=models.PoolInformation(pool_id='my_pool'))

    batch_client.job.add(new_job)

Now we need a script to run as the Job Manager on the compute node. In this case I will use Python, so you will need to add a StartTask to your pool (or a JobPrepTask to the job) to install the azure-batch Python package.
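
A StartTask for that install step might look roughly like the following. This is a sketch under assumptions that are not in the original answer: the node image is assumed to ship `python3` and `pip`, the install is assumed to need an elevated auto-user, and `INSTALL_COMMAND` and `build_start_task` are hypothetical names.

```python
# Assumption: the node image already has python3 and pip available.
INSTALL_COMMAND = "/bin/bash -c 'python3 -m pip install azure-batch'"


def build_start_task():
    """Return a StartTask that installs azure-batch when a node joins the pool."""
    # Imported here so this module still loads on machines without the SDK.
    from azure.batch import models

    return models.StartTask(
        command_line=INSTALL_COMMAND,
        # Run as an elevated auto-user so pip can install system-wide.
        user_identity=models.UserIdentity(
            auto_user=models.AutoUserSpecification(
                scope=models.AutoUserScope.pool,
                elevation_level=models.ElevationLevel.admin)),
        # Do not schedule tasks on a node until the install has succeeded.
        wait_for_success=True)
```

The returned object would be passed as the `start_task` argument when the pool is created. A JobPrepTask would serve the same purpose if the pool is shared and only this job needs the package.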

Additionally, the Job Manager task needs to be able to authenticate against the Batch API. There are two ways to do this, depending on the scope of the activities the Job Manager will perform. If it only needs to add tasks, you can use the authentication_token_settings attribute, which adds an AAD token environment variable to the Job Manager task with permission to access ONLY the current job. If it needs permission to do other things, such as altering the pool or starting new jobs, you can pass an account key via an environment variable. Both options are shown above.
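
One way the Job Manager script could choose between the two methods at run time is sketched below. The helper names `choose_auth` and `build_credentials` are hypothetical, and wrapping the token with msrest's `BasicTokenAuthentication` is an assumption about how the token is handed to the client; check it against the SDK version you use.

```python
import os


def choose_auth(env=os.environ):
    """Return ("token", value) or ("shared_key", (account, key)).

    The job-scoped token is preferred when Batch has provided one.
    """
    token = env.get("AZ_BATCH_AUTHENTICATION_TOKEN")
    if token:
        return "token", token
    return "shared_key", (env["AZ_BATCH_ACCOUNT_NAME"], env["AZ_BATCH_KEY"])


def build_credentials(env=os.environ):
    """Turn the choice above into a credentials object for BatchServiceClient."""
    kind, value = choose_auth(env)
    if kind == "token":
        # Assumption: the raw bearer token is wrapped with msrest.
        from msrest.authentication import BasicTokenAuthentication
        return BasicTokenAuthentication({"access_token": value})
    from azure.batch.batch_auth import SharedKeyCredentials
    return SharedKeyCredentials(*value)
```

Keeping the selection logic separate from the SDK imports makes it easy to check on a machine where the Batch packages are not installed.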

The script you run as the Job Manager task could look something like this:

    import os
    import time

    from azure.batch import BatchServiceClient
    from azure.batch.batch_auth import SharedKeyCredentials
    from azure.batch import models

    # Batch account credentials.
    # AZ_BATCH_ACCOUNT_NAME, AZ_BATCH_ACCOUNT_URL and AZ_BATCH_JOB_ID are set by Batch on the node;
    # AZ_BATCH_KEY is the variable passed in through environment_settings.
    AZ_BATCH_ACCOUNT = os.environ['AZ_BATCH_ACCOUNT_NAME']
    AZ_BATCH_KEY = os.environ['AZ_BATCH_KEY']
    AZ_BATCH_ENDPOINT = os.environ['AZ_BATCH_ACCOUNT_URL']
    AZ_JOB = os.environ['AZ_BATCH_JOB_ID']

    # If you're using the authentication_token_settings for authentication
    # you can use the AAD token in the environment variable AZ_BATCH_AUTHENTICATION_TOKEN.


    def main():
        # Batch client (newer azure-batch releases name this keyword batch_url)
        creds = SharedKeyCredentials(AZ_BATCH_ACCOUNT, AZ_BATCH_KEY)
        batch_client = BatchServiceClient(creds, base_url=AZ_BATCH_ENDPOINT)

        # You can set up the conditions under which your Job Manager will continue to add tasks here.
        # It could be a timeout, max number of tasks, or you could monitor tasks to act on task status
        condition = True
        task_id = 0
        task_params = {
            "command_line": "/bin/bash -c 'echo hello world'",
            # Any other task parameters go here.
        }

        while condition:
            # Task IDs must be strings.
            new_task = models.TaskAddParameter(id=str(task_id), **task_params)
            batch_client.task.add(AZ_JOB, new_task)
            task_id += 1
            # Perform any additional logic here - for example:
            # - Check the status of the tasks, e.g. stdout, exit code etc
            # - Process any output files for the tasks
            # - Delete any completed tasks
            # - Error handling for tasks that have failed
            # Update `condition` here once the Job Manager should stop.
            time.sleep(300)  # Wait for 5 minutes (300 seconds)

        # Job Manager task has completed - it will now exit and the job will be marked as complete.

    if __name__ == '__main__':
        main()
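
The `condition` flag in the script above is left as a placeholder. One possible way to fill it in, combining a task limit with a timeout, is sketched here; `StopCondition` and its parameters are hypothetical and not part of the original answer, and the `clock` argument exists only to make the class easy to test.

```python
import time


class StopCondition:
    """Decides when the Job Manager loop should stop adding tasks."""

    def __init__(self, max_tasks, max_seconds, clock=time.monotonic):
        self.max_tasks = max_tasks
        self.clock = clock
        # Absolute point in time after which no more tasks are added.
        self.deadline = clock() + max_seconds
        self.tasks_added = 0

    def record_task(self):
        """Call once after each task has been added to the job."""
        self.tasks_added += 1

    def should_continue(self):
        """True while both the task limit and the deadline are unmet."""
        return (self.tasks_added < self.max_tasks
                and self.clock() < self.deadline)
```

In the loop this would replace the boolean: create `stop = StopCondition(max_tasks=12, max_seconds=3600)`, loop on `while stop.should_continue():`, and call `stop.record_task()` after each `batch_client.task.add(...)`. When the loop ends, the script exits and `kill_job_on_completion=True` marks the job as complete.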
