Ways of using a value provider parameter in Python Apache Beam

Question

Right now I can only read the runtime value inside a class, through a ParDo. Is there another way to use the runtime parameter, for example directly in my functions?

This is the code I have right now:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

class UserOptions(PipelineOptions):
    @classmethod
    def _add_argparse_args(cls, parser):
        # Declared as a value provider so it can be supplied at run time.
        parser.add_value_provider_argument('--firestore_document', default='')

def run(argv=None):
    pipeline_options = PipelineOptions(argv)
    user_options = pipeline_options.view_as(UserOptions)
    pipeline_options.view_as(SetupOptions).save_main_session = True

    with beam.Pipeline(options=pipeline_options) as p:
        # CallFirestore (a DoFn) and ReadDB2 (a function) are defined elsewhere.
        rows = (
            p
            | 'Create inputs' >> beam.Create([''])
            | 'Call Firestore' >> beam.ParDo(
                CallFirestore(user_options.firestore_document))
            | 'Read DB2' >> beam.Map(ReadDB2))

I want user_options.firestore_document to be usable in other functions without having to go through a ParDo.

Recommended answer

Value providers can only be used inside ParDos and Combines, because their value is not available until the pipeline runs. It is not possible to pass a value provider to a Create, but you can define a DoFn that emits the value of the value provider passed to its constructor:

class OutputValueProviderFn(beam.DoFn):
  def __init__(self, vp):
    self.vp = vp

  def process(self, unused_elm):
    yield self.vp.get()

In your pipeline, you would then do the following:

user_options = pipeline_options.view_as(UserOptions)

with beam.Pipeline(options=pipeline_options) as p:
  my_value_provided_pcoll = (
      p
      | beam.Create([None])
      | beam.ParDo(OutputValueProviderFn(user_options.firestore_document)))

That way you do not use it in a Create, which is not possible, but you still get its value in a PCollection.
