Easiest way to schedule a Google Cloud Dataflow job


Question

I just need to run a Dataflow pipeline on a daily basis, but the suggested solutions, such as App Engine Cron Service, require building a whole web app, which seems like overkill. I was thinking about just running the pipeline from a cron job on a Compute Engine Linux VM, but maybe that's far too simple :). What's the problem with doing it that way, and why isn't anybody (besides me, I guess) suggesting it?

Recommended answer

There's absolutely nothing wrong with using a cron job to kick off your Dataflow pipelines. We do it all the time for our production systems, for both our Java and our Python pipelines.
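As a rough illustration of the cron approach (not taken from the answer itself), here is a minimal sketch of a Python launcher that a cron job could call. The script path, project ID, region, bucket, and the `daily-pipeline` job-name prefix are all hypothetical placeholders; the flags are the standard Apache Beam pipeline options for the Dataflow runner, and the pipeline script is assumed to parse them.

```python
import datetime
import subprocess
import sys

# Hypothetical crontab entry (runs every day at 03:00) for a wrapper script
# that calls launch() below:
#   0 3 * * * /usr/bin/python3 /opt/pipelines/launch_daily.py >> /var/log/launch_daily.log 2>&1


def build_launch_command(pipeline_script, project, region, temp_location, run_date):
    """Build the argv list that submits a Beam pipeline to Dataflow."""
    # A date-stamped job name makes each day's run easy to identify.
    job_name = f"daily-pipeline-{run_date:%Y%m%d}"
    return [
        sys.executable,
        pipeline_script,
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--job_name={job_name}",
    ]


def launch(pipeline_script, project, region, temp_location):
    """Run the pipeline script and return its exit code so cron logs can show failures."""
    cmd = build_launch_command(
        pipeline_script, project, region, temp_location, datetime.date.today()
    )
    return subprocess.run(cmd, check=False).returncode
```

A wrapper script would then call something like `sys.exit(launch("/opt/pipelines/my_pipeline.py", "my-project", "us-central1", "gs://my-bucket/tmp"))`, with all four arguments replaced by real values. The VM's service account is assumed to have permission to submit Dataflow jobs.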

That said, we are trying to wean ourselves off cron jobs and move toward using either AWS Lambdas (we run multi-cloud) or Cloud Functions. Unfortunately, Cloud Functions don't have scheduling yet. AWS Lambdas do.
