Check if PCollection is empty - Apache Beam

Question

Is there any way to check whether a PCollection is empty?

I haven't found anything relevant in the Dataflow or Apache Beam documentation.

Recommended answer

There is no way to check the size of a PCollection without applying a PTransform to it (such as Count.globally() or a Combine with a CombineFn), because a PCollection is not like a typical Java Collection: it has no size() or isEmpty() method that you can call directly.

A PCollection is an abstraction over a bounded or unbounded collection of data, where the data is fed into the collection so that an operation (e.g. a PTransform) can be applied to it. It is also parallelized, as the P at the beginning of the class name suggests.

Therefore you need a mechanism that gets the element count from each worker/node and combines those counts into a single value. Whether that value is 0 or n cannot be known until the transform has finished.
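The count-then-combine idea described above can be sketched in plain Java, without any Beam dependency. This is only an illustration of the reasoning, not Beam's implementation: the class name PartialCounts, the method names, and the use of in-memory lists as stand-ins for the shards held by each worker are all assumptions made for this example. In a real pipeline, a transform such as Count.globally() would play this role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model: each inner list stands in for the elements on one worker.
public class PartialCounts {

    // Step 1: each worker counts only the elements it holds.
    public static long countShard(List<String> shard) {
        return shard.size();
    }

    // Step 2: the partial counts from all workers are merged into one total.
    public static long mergeCounts(List<Long> partials) {
        long total = 0L;
        for (long partial : partials) {
            total += partial;
        }
        return total;
    }

    // The collection is empty only if the merged total is zero, which is
    // known only after every shard has reported its count.
    public static boolean isEmpty(List<List<String>> shards) {
        List<Long> partials = new ArrayList<>();
        for (List<String> shard : shards) {
            partials.add(countShard(shard));
        }
        return mergeCounts(partials) == 0L;
    }

    public static void main(String[] args) {
        List<List<String>> shards = Arrays.asList(
                Collections.<String>emptyList(),
                Arrays.asList("a", "b"),
                Collections.<String>emptyList());
        System.out.println("empty? " + isEmpty(shards));
    }
}
```

Note that no single shard can answer the question on its own: two of the three shards here are empty, yet the collection as a whole is not. That is why the answer is only available as the output of the combining step.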
