使用Apache Beam python创建Google云数据流模板时出现RuntimeValueProviderError [英] RuntimeValueProviderError when creating a google cloud dataflow template with Apache Beam python

查看:63
本文介绍了使用Apache Beam python创建Google云数据流模板时出现RuntimeValueProviderError的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我无法使用python 3.7登台云数据流模板.使用apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible

I can't stage a cloud dataflow template with python 3.7. It fails on the one parametrized argument with apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible

使用python 2.7登台模板可以正常工作.

Staging the template with python 2.7 works fine.

我尝试用3.7运行数据流作业,它们工作正常.仅模板暂存已损坏. 数据流模板中仍不支持python 3.7还是python 3中的暂存语法发生了变化?

I have tried running dataflow jobs with 3.7 and they work fine. Only the template staging is broken. Is python 3.7 still not supported in dataflow templates or did the syntax for staging in python 3 change?

这是流水线

class WordcountOptions(PipelineOptions):
  @classmethod
  def _add_argparse_args(cls, parser):
    parser.add_value_provider_argument(
      '--input',
      default='gs://dataflow-samples/shakespeare/kinglear.txt',
      help='Path of the file to read from',
      dest="input")

def main(argv=None):
  options = PipelineOptions(flags=argv)
  setup_options = options.view_as(SetupOptions)

  wordcount_options = options.view_as(WordcountOptions)

  with beam.Pipeline(options=setup_options) as p:
    lines = p | 'read' >> ReadFromText(wordcount_options.input)

if __name__ == '__main__':
  main()

这是带有暂存脚本的完整回购 https://github.com/firemuzzy/dataflow-templates-bug-python3

Here is the full repo with the staging scripts https://github.com/firemuzzy/dataflow-templates-bug-python3

以前也有类似的问题,但是不确定它的相关性,因为那是在python 2.7中完成的,但是我的模板在2.7中很好,但在3.7中失败了

There was a previous similar issues, but am not sure how related it is since that was done in python 2.7 but my template stages fine in 2.7 but fails in 3.7

如何创建Google Cloud Dataflow Wordcount python中的自定义模板?

****堆栈跟踪****

**** Stack Trace ****

Traceback (most recent call last):
  File "run_pipeline.py", line 44, in <module>
    main()
  File "run_pipeline.py", line 41, in main
    lines = p | 'read' >> ReadFromText(wordcount_options.input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 906, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 490, in apply
    return self.apply(transform, pvalueish)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/textio.py", line 542, in expand
    return pvalue.pipeline | Read(self._source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1020, in apply_Read
    return self.apply_PTransform(transform, pbegin, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 863, in expand
    return pbegin | _SDFBoundedSourceWrapper(self.source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pvalue.py", line 113, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1543, in expand
    | core.ParDo(self._create_sdf_bounded_source_dofn()))
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1517, in _create_sdf_bounded_source_dofn
    estimated_size = source.estimate_size()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/options/value_provider.py", line 136, in _f
    raise error.RuntimeValueProviderError('%s not accessible' % obj)
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible

推荐答案

不幸的是,看起来模板在Apache Beam的Python SDK 2.18.0上已损坏.

Unfortunately, it looks like templates are broken on Apache Beam's Python SDK 2.18.0.

目前,解决方案是避免使用Beam 2.18.0,因此在您的需求/依赖项中,定义apache-beam[gcp]<2.18.0apache-beam[gcp]>2.18.0

For now, the solution to this is to avoid Beam 2.18.0, so in your requirements / dependencies, define apache-beam[gcp]<2.18.0 or apache-beam[gcp]>2.18.0

这篇关于使用Apache Beam python创建Google云数据流模板时出现RuntimeValueProviderError的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆