Programmatically terminating PubSubIO.readMessages from Subscription after configured time?

Question

I am looking to schedule a Dataflow job that uses PubSubIO.readString to read from a PubSub topic's subscription. How can I have the job terminate after a configured interval? My use case does not require the job to run through the entire day, so I want to schedule it to start and then have it stop, from within the job, after a configured interval.

// "pipeline" is the Pipeline instance created for this job
pipeline
    .apply(PubsubIO.readMessages().fromSubscription("some-subscription"));

Recommended answer

From the documentation:

If you need to stop a running Cloud Dataflow job, you can do so by issuing a command using either the Cloud Dataflow Monitoring Interface or the Cloud Dataflow Command-line Interface.

I assume you are not interested in stopping jobs manually via the Console, which leaves the command-line solution. If you intend to schedule your Dataflow job to run, e.g., daily, then you also know when you want it to stop (launch time + "configured interval"). In that case, you could configure a cron job to run gcloud dataflow jobs cancel at that time every day. For instance, the following script would cancel all active jobs launched within the past day:

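#!/bin/bash
# List active jobs created within the last day, drop the header row,
# and cancel each job by its ID (first column of the listing).
gcloud dataflow jobs list --status=active --created-after=-1d \
| awk '{print $1;}' | tail -n +2 \
| while read -r JOB_ID; do gcloud dataflow jobs cancel "$JOB_ID"; done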

Another solution would be to invoke the gcloud command from within your Java code, using Runtime.getRuntime().exec(). You can schedule this to run after a specific interval using java.util.Timer's schedule() method. This way you can ensure your job stops after the provided time interval, regardless of when you launched it.
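A minimal sketch of that approach, using only the JDK. It swaps Runtime.exec() for ProcessBuilder so that gcloud's output is forwarded to the console. The class name, method names, job ID, and region are all hypothetical placeholders; the job ID would have to be obtained when the job is launched.

```java
import java.io.IOException;
import java.util.List;
import java.util.Timer;
import java.util.TimerTask;

final class ScheduledJobCancel {

    // Builds the gcloud invocation. jobId and region are supplied by the caller (placeholders here).
    static List<String> buildCancelCommand(String jobId, String region) {
        return List.of("gcloud", "dataflow", "jobs", "cancel", jobId, "--region=" + region);
    }

    // Runs the given action once, delayMillis from now, on a dedicated timer thread.
    static Timer scheduleOnce(Runnable action, long delayMillis) {
        Timer timer = new Timer("job-cancel-timer");
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                try {
                    action.run();
                } finally {
                    timer.cancel(); // one-shot: let the timer thread exit afterwards
                }
            }
        }, delayMillis);
        return timer;
    }

    // Action that launches gcloud and waits for it to exit.
    static Runnable gcloudCancel(String jobId, String region) {
        return () -> {
            try {
                Process process = new ProcessBuilder(buildCancelCommand(jobId, region))
                        .inheritIO() // forward gcloud's stdout/stderr to this process
                        .start();
                int exitCode = process.waitFor();
                if (exitCode != 0) {
                    System.err.println("gcloud exited with code " + exitCode);
                }
            } catch (IOException e) {
                System.err.println("Could not start gcloud: " + e.getMessage());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
        };
    }
}
```

After submitting the job, a launcher could call `ScheduledJobCancel.scheduleOnce(ScheduledJobCancel.gcloudCancel(jobId, region), intervalMillis)`. This only works if the launching JVM stays alive for the whole interval and gcloud is installed and authenticated on that machine.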

Update

@RoshanFernando correctly noted in comments that there's actually an SDK method to cancel a pipeline.
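In Apache Beam's Java SDK, Pipeline.run() returns a PipelineResult, which exposes waitUntilFinish(Duration) and cancel(). The sketch below illustrates the resulting wait-then-cancel pattern. To stay free of external dependencies, it uses a hypothetical RunningJob interface as a stand-in for PipelineResult; it is not a Beam type. Beam's own waitUntilFinish takes a Joda-Time Duration and returns the final state, or null on timeout, rather than a boolean.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.time.Duration;

// Hypothetical stand-in for Beam's PipelineResult, reduced to the two calls used here.
interface RunningJob {
    // Blocks for at most `timeout`; returns true if the job reached a terminal state in time.
    boolean waitUntilFinish(Duration timeout);

    // Requests cancellation of the job.
    void cancel() throws IOException;
}

final class TimeBoxedRun {

    // Waits up to maxRuntime, then cancels the job if it is still running.
    // Returns true if a cancellation was requested.
    static boolean runFor(RunningJob job, Duration maxRuntime) {
        boolean finished = job.waitUntilFinish(maxRuntime);
        if (finished) {
            return false; // job ended on its own; nothing to cancel
        }
        try {
            job.cancel();
        } catch (IOException e) {
            throw new UncheckedIOException("Cancel request failed", e);
        }
        return true;
    }
}
```

With the real SDK, the equivalent would presumably be to call waitUntilFinish with the configured interval on the PipelineResult and then cancel() if no final state was returned. A streaming read from a subscription never finishes on its own, so the cancel branch is the expected path.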
