Beam / Dataflow Custom Python job - Cloud Storage to PubSub


Question

I need to perform a very simple transformation on some data (extract a string from JSON), then write it to PubSub - I'm attempting to use a custom python Dataflow job to do so.

I've written a job which successfully writes back to Cloud Storage, but my attempts at even the simplest possible write to PubSub (no transformation) result in an error: JOB_MESSAGE_ERROR: Workflow failed. Causes: Expected custom source to have non-zero number of splits.

Has anyone successfully written to PubSub from GCS via Dataflow?

Can anyone shed some light on what is going wrong here?


import argparse

import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions


def run(argv=None):

  parser = argparse.ArgumentParser()
  parser.add_argument('--input',
                      dest='input',
                      help='Input file to process.')
  parser.add_argument('--output',
                      dest='output',
                      help='Output file to write results to.')
  known_args, pipeline_args = parser.parse_known_args(argv)

  pipeline_options = PipelineOptions(pipeline_args)
  pipeline_options.view_as(SetupOptions).save_main_session = True
  with beam.Pipeline(options=pipeline_options) as p:

    lines = p | ReadFromText(known_args.input)

    output = lines  # Obviously not necessary but this is where my simple extract goes

    output | beam.io.WriteToPubSub(known_args.output)  # This doesn't work
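
For reference, the "simple extract" placeholder above would look something like the sketch below. The JSON field name 'message' is a hypothetical stand-in, not from the original question:

import json

# Hypothetical version of the extract step: parse each line as JSON and
# pull out a single string field (assumed here to be called 'message').
output = lines | beam.Map(lambda line: json.loads(line)['message'])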

Answer

Currently it isn't possible to achieve this scenario, because when you use streaming mode in Dataflow, the only source you can use is PubSub. And you can't switch to batch mode, because the Apache Beam PubSub sources and sinks are only available for streaming pipelines (for remote execution such as the Dataflow runner).

That is the reason why you can execute your pipeline without the WriteToPubSub step and the streaming flag.
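
For contrast, here is a minimal sketch (not from the original answer) of the shape that streaming mode does accept, with PubSub on both ends. The project, subscription, and topic names are placeholders:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

options = PipelineOptions()
options.view_as(StandardOptions).streaming = True  # streaming mode is required for the PubSub IO

with beam.Pipeline(options=options) as p:
  (p
   # PubSub is the only source streaming Dataflow accepts here;
   # the subscription path is a placeholder.
   | beam.io.ReadFromPubSub(subscription='projects/<project>/subscriptions/<sub>')
   # Messages pass through as bytes, which is what WriteToPubSub expects.
   | beam.io.WriteToPubSub(topic='projects/<project>/topics/<topic>'))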
