Apache Beam - skip pipeline step

Views: 67 | Published: 2020/9/3 5:25:11 | Tags: java, google-cloud-platform, apache-beam

Question

I'm using Apache Beam to set up a pipeline consisting of 2 main steps:

1. Transform the data using a Beam transform
2. Load the transformed data into BigQuery

The pipeline is set up like this:

myPCollection = (org.apache.beam.sdk.values.PCollection<myCollectionObjectType>) myInputPCollection
        .apply("do a parallel transform",
               ParDo.of(new MyTransformClassName.MyTransformFn()));

myPCollection
    .apply("Load BigQuery data for PCollection",
           BigQueryIO.<myCollectionObjectType>write()
               .to(new MyDataLoadClass.MyFactTableDestination(myDestination))
               .withFormatFunction(new MyDataLoadClass.MySerializationFn()));

I've looked at this question:

> Apache Beam: Skipping steps in an already-constructed pipeline

which suggests that, after the parallel transform in step 1, I may be able to dynamically choose which output the data is passed to.

How do I do this? I don't know how to choose whether or not to pass myPCollection from step 1 to step 2. I need to skip step 2 if the object in myPCollection from step 1 is null.

Recommended answer

Simply don't emit the element from your MyTransformClassName.MyTransformFn when you don't want it in the next step. For example, something like this:

class MyTransformClassName.MyTransformFn extends ... {
  @ProcessElement
  public void processElement(ProcessContext c, ...) {
    ...
    result = ...
    if (result != null) {
      c.output(result);   // only emit results that are not null
    }
  }
}

This way nulls never reach the next step.
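The pattern above can be sketched as a small complete program. This is only an illustration under stated assumptions: the class names `SkipNullsExample`, `ParseFn`, and `PrintFn` and the helper `parseOrNull` are hypothetical and not part of the original question, the BigQuery write is replaced by a print step, and the Beam Java SDK plus a runner such as the direct runner are assumed to be on the classpath.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;

public class SkipNullsExample {

  // Hypothetical step-1 logic: returns null when the input cannot be parsed.
  static Integer parseOrNull(String raw) {
    try {
      return Integer.valueOf(raw.trim());
    } catch (NumberFormatException e) {
      return null;
    }
  }

  // Step 1: emits one output per parseable input and nothing otherwise.
  static class ParseFn extends DoFn<String, Integer> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      Integer result = parseOrNull(c.element());
      if (result != null) {
        c.output(result); // elements that are never output never reach step 2
      }
    }
  }

  // Stand-in for step 2 (the BigQuery write in the question).
  static class PrintFn extends DoFn<Integer, Void> {
    @ProcessElement
    public void processElement(ProcessContext c) {
      System.out.println("step 2 received: " + c.element());
    }
  }

  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create();

    PCollection<Integer> parsed = pipeline
        .apply("Create input", Create.of("1", "not a number", "3"))
        .apply("Parse, dropping failures", ParDo.of(new ParseFn()));

    parsed.apply("Stand-in for step 2", ParDo.of(new PrintFn()));

    pipeline.run().waitUntilFinish();
  }
}
```

With this input, only the two parseable values should reach the print step, in no guaranteed order. Step 2 is still part of the pipeline graph; it just receives fewer elements, which is usually what "skipping" a step means in Beam.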

See the ParDo section of the guide for more details: https://beam.apache.org/documentation/programming-guide/#pardo
