Apache Beam: FlatMap vs Map?
Question
I want to understand in which scenarios I should use FlatMap and in which I should use Map. The documentation did not seem clear to me.
Could someone give me an example so I can understand the difference?
I understand the difference between FlatMap and Map in Spark, but I am not sure whether the same distinction applies in Beam.
Recommended answer
These transforms in Beam behave the same way as their counterparts in Spark (and in Scala).
A Map transform maps a PCollection of N elements into another PCollection of N elements.
A FlatMap transform maps a PCollection of N elements into N collections of zero or more elements each, which are then flattened into a single PCollection.
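The element counts described here can be modelled in plain Python, with lists standing in for PCollections. This is only a sketch of the semantics, not Beam code: `map_elements` and `flat_map_elements` are hypothetical helper names, and a real PCollection is distributed and has no guaranteed ordering.

```python
from itertools import chain


def map_elements(fn, elements):
    """Model of Map: exactly one output per input element."""
    return [fn(x) for x in elements]


def flat_map_elements(fn, elements):
    """Model of FlatMap: fn returns an iterable per input element,
    and all of those iterables are concatenated into one result."""
    return list(chain.from_iterable(fn(x) for x in elements))


numbers = [1, 2, 3]

# One-to-one: three inputs, three outputs.
doubled = map_elements(lambda x: x * 2, numbers)

# One-to-many: each input x yields x copies of itself.
repeated = flat_map_elements(lambda x: [x] * x, numbers)

print(doubled)   # [2, 4, 6]
print(repeated)  # [1, 2, 2, 3, 3, 3]
```

In this model the Map-like output always has the same length as the input, while the FlatMap-like output can be shorter or longer.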
As a simple example, the following happens:
beam.Create([1, 2, 3]) | beam.Map(lambda x: [x, 'any'])
# The result is a collection of THREE lists: [[1, 'any'], [2, 'any'], [3, 'any']]
Whereas:
beam.Create([1, 2, 3]) | beam.FlatMap(lambda x: [x, 'any'])
# The lists output by the lambda are then flattened into a
# collection of SIX single elements: [1, 'any', 2, 'any', 3, 'any']
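As a rule of thumb that follows from this, Map suits one-to-one conversions, while FlatMap suits functions that may produce zero, one, or many outputs per input, such as splitting lines into words. The sketch below illustrates that scenario in plain Python rather than Beam, again with lists standing in for PCollections; `lines` and `split_words` are made-up names for illustration.

```python
from itertools import chain

lines = ["the quick fox", "", "jumps"]


def split_words(line):
    """Return zero or more words for one input line."""
    return line.split()


# Map-like: one output per line, so the result is a list of lists.
per_line = [split_words(line) for line in lines]

# FlatMap-like: the per-line lists are flattened into single words,
# and the empty line contributes nothing to the output.
words = list(chain.from_iterable(split_words(line) for line in lines))

print(per_line)  # [['the', 'quick', 'fox'], [], ['jumps']]
print(words)     # ['the', 'quick', 'fox', 'jumps']
```

If the goal is a collection of individual words, the FlatMap-style result is the one to use; the Map-style result keeps the per-line grouping, including the empty list for the blank line.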