Does it make sense to use Google DataFlow/Apache Beam to parallelize image processing or crawling tasks?
Question
I am considering Google DataFlow as an option for running a pipeline that involves steps like:
- Downloading images from the web;
- Processing the images.
I like that Dataflow manages the lifetime of the VMs required to complete the job, so I don't need to start or stop them myself, but all the examples I've come across use it for data-mining-style tasks. I wonder whether it is a viable option for other batch tasks like image processing and crawling.
Answer
This use case is a possible application for Dataflow/Beam.
If you want to do this in a streaming fashion, you could have a crawler generate URLs and add them to a PubSub or Kafka queue, and code a Beam pipeline to do the following:
- Read from PubSub
- Download the website contents in a ParDo
- Parse the image URLs from the websites in another ParDo*
- Download each image and process it, again with a ParDo
- Store the results in GCS, BigQuery, or elsewhere, depending on what image information you need
You can do the same with a batch job, just changing the source you're reading the URLs from.
*After parsing those image URLs, you may also want to reshuffle your data to gain some parallelism.