What is Apache Beam?


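Question

I was going through the Apache posts and came across a new term, Beam. Can anybody explain what exactly Apache Beam is? I tried googling it but couldn't find a clear answer.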

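Answer

Apache Beam is an open source, unified model for defining and executing both batch and streaming data-parallel processing pipelines. It also includes a set of language-specific SDKs for constructing pipelines and runtime-specific Runners for executing them.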
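To make the split between "defining" and "executing" a pipeline more concrete, here is a toy sketch in plain Python. It is not the Apache Beam API: `ToyPipeline`, `flat_map`, `keep_if`, `run_batch`, and `run_streaming` are hypothetical names invented for illustration, and the sketch assumes only element-wise transforms (no grouping, windowing, or parallelism). It only shows the idea that one pipeline description can be handed to different runners.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical names throughout: a toy model of the idea, NOT the Apache Beam API.
Transform = Callable[[Iterable], Iterable]


class ToyPipeline:
    """A pipeline is only a description: an ordered list of transforms."""

    def __init__(self) -> None:
        self.transforms: List[Transform] = []

    def apply(self, transform: Transform) -> "ToyPipeline":
        self.transforms.append(transform)
        return self  # allow chaining


def flat_map(fn: Callable) -> Transform:
    """Element-wise transform: each input element yields zero or more outputs."""
    return lambda elements: (out for e in elements for out in fn(e))


def keep_if(pred: Callable) -> Transform:
    """Element-wise transform: keep only elements matching the predicate."""
    return lambda elements: (e for e in elements if pred(e))


def run_batch(pipeline: ToyPipeline, source: list) -> list:
    """Toy 'batch runner': takes a bounded input and returns the full result."""
    data: Iterable = source
    for transform in pipeline.transforms:
        data = transform(data)
    return list(data)


def run_streaming(pipeline: ToyPipeline, source: Iterator) -> Iterator:
    """Toy 'streaming runner': pushes elements through one at a time as they arrive."""
    for element in source:
        data: Iterable = [element]
        for transform in pipeline.transforms:
            data = transform(data)
        yield from data


# Define the pipeline once...
pipeline = (
    ToyPipeline()
    .apply(flat_map(str.split))
    .apply(keep_if(lambda word: len(word) > 3))
)

lines = ["apache beam unifies batch", "and streaming pipelines"]

# ...then execute the same definition with two different runners.
batch_result = run_batch(pipeline, lines)
stream_result = list(run_streaming(pipeline, iter(lines)))
```

In this sketch both runners produce the same words because every transform is element-wise. Real Beam pipelines also need concepts such as windowing and triggers to handle aggregation over unbounded data, which this toy leaves out.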

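History: The model behind Beam evolved from a number of internal Google data processing projects, including MapReduce, FlumeJava, and MillWheel. The model was originally known as the "Dataflow Model" and was first implemented as Google Cloud Dataflow, which included a Java SDK on GitHub for writing pipelines and a fully managed service for executing them on Google Cloud Platform. Others in the community began writing extensions, including a Spark Runner, a Flink Runner, and a Scala SDK. In January 2016, Google and a number of partners submitted the Dataflow Programming Model and SDKs portion as an Apache Incubator proposal under the name Apache Beam (unified Batch + strEAM processing). Apache Beam graduated from incubation in December 2016.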

Additional resources for learning the Beam Model:
