RuntimeValueProviderError when creating a Google Cloud Dataflow template with the Apache Beam Python SDK


Question

I can't stage a Cloud Dataflow template with Python 3.7. Staging fails on the one parameterized argument with apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible

Staging the template with Python 2.7 works fine.

I have tried running Dataflow jobs with Python 3.7 and they work fine; only template staging is broken. Is Python 3.7 still not supported in Dataflow templates, or did the staging syntax change in Python 3?

Here is the pipeline part:

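import apache_beam as beam
from apache_beam.io import ReadFromText
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

class WordcountOptions(PipelineOptions):
  @classmethod
  def _add_argparse_args(cls, parser):
    # Registered as a value-provider argument so it can be set when the
    # template is run rather than when it is staged.
    parser.add_value_provider_argument(
      '--input',
      default='gs://dataflow-samples/shakespeare/kinglear.txt',
      help='Path of the file to read from',
      dest="input")

def main(argv=None):
  options = PipelineOptions(flags=argv)
  setup_options = options.view_as(SetupOptions)

  wordcount_options = options.view_as(WordcountOptions)

  with beam.Pipeline(options=setup_options) as p:
    # The ValueProvider is passed to ReadFromText unresolved.
    lines = p | 'read' >> ReadFromText(wordcount_options.input)

if __name__ == '__main__':
  main()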

Here is the full repo with the staging scripts: https://github.com/firemuzzy/dataflow-templates-bug-python3

There was a similar earlier question, but I'm not sure how related it is: it involved Python 2.7, whereas my template stages fine in 2.7 and fails only in 3.7.

How to create Google Cloud Dataflow Wordcount custom template in Python?

**** Stack Trace ****

Traceback (most recent call last):
  File "run_pipeline.py", line 44, in <module>
    main()
  File "run_pipeline.py", line 41, in main
    lines = p | 'read' >> ReadFromText(wordcount_options.input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 906, in __ror__
    return self.transform.__ror__(pvalueish, self.label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 490, in apply
    return self.apply(transform, pvalueish)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/textio.py", line 542, in expand
    return pvalue.pipeline | Read(self._source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/transforms/ptransform.py", line 515, in __ror__
    result = p.apply(self, pvalueish, label)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1020, in apply_Read
    return self.apply_PTransform(transform, pbegin, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 863, in expand
    return pbegin | _SDFBoundedSourceWrapper(self.source)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pvalue.py", line 113, in __or__
    return self.pipeline.apply(ptransform, self)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/pipeline.py", line 525, in apply
    pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 183, in apply
    return m(transform, input, options)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/runners/runner.py", line 189, in apply_PTransform
    return transform.expand(input)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1543, in expand
    | core.ParDo(self._create_sdf_bounded_source_dofn()))
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/iobase.py", line 1517, in _create_sdf_bounded_source_dofn
    estimated_size = source.estimate_size()
  File "/usr/local/lib/python3.7/site-packages/apache_beam/options/value_provider.py", line 136, in _f
    raise error.RuntimeValueProviderError('%s not accessible' % obj)
apache_beam.error.RuntimeValueProviderError: RuntimeValueProvider(option: input, type: str, default_value: 'gs://dataflow-samples/shakespeare/kinglear.txt') not accessible
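
Read from the bottom up, the trace shows the error being raised while the pipeline graph is still being built: _create_sdf_bounded_source_dofn calls source.estimate_size(), which needs the concrete value of input before any template parameter has been supplied. The snippet below is only a toy model of that situation. DeferredValue, ToySource, build_graph_eagerly and build_graph_lazily are hypothetical names made up for illustration and are not Beam classes or functions; the sketch just shows why asking for a runtime-only value at construction time fails, while deferring the same call until the value is known works.

```python
class DeferredValue:
    """Toy stand-in for a parameter that is only known at run time (not a Beam class)."""

    def __init__(self, name, default):
        self.name = name
        self.default = default
        self._value = None
        self._resolved = False

    def resolve(self, value=None):
        # Models the moment the job starts and parameters become available.
        self._value = self.default if value is None else value
        self._resolved = True

    def get(self):
        if not self._resolved:
            raise RuntimeError('%s not accessible' % self.name)
        return self._value


class ToySource:
    """Toy source whose size estimate depends on the concrete path."""

    def __init__(self, pattern):
        self.pattern = pattern

    def estimate_size(self):
        # Requires the resolved value, like the call at the bottom of the trace.
        return len(self.pattern.get())


def build_graph_eagerly(source):
    # Asks for the size while the graph is being constructed (staging time).
    return source.estimate_size()


def build_graph_lazily(source):
    # Defers the size estimate until the returned callable is invoked.
    return lambda: source.estimate_size()
```

In this model, build_graph_eagerly raises as soon as it is called with an unresolved value, while the callable returned by build_graph_lazily succeeds once resolve() has run.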

Recommended Answer

Unfortunately, it looks like templates are broken in version 2.18.0 of Apache Beam's Python SDK.

For now, the solution is to avoid Beam 2.18.0: in your requirements / dependencies, specify apache-beam[gcp]<2.18.0 or apache-beam[gcp]>2.18.0.
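
As a rough guard, a staging script could refuse to run when the installed SDK is the release named above. This is a minimal sketch under stated assumptions: release_tuple and is_affected are hypothetical helper names, only the first three dot-separated numeric components are compared, and any suffix such as "rc1" is ignored.

```python
import itertools

# The release the answer says to avoid.
BROKEN_RELEASE = (2, 18, 0)


def release_tuple(version):
    """Turn a version string such as '2.18.0' into (2, 18, 0).

    Only the leading digits of the first three dot-separated parts are
    used; missing parts are treated as 0.
    """
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''.join(itertools.takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version):
    """Return True when the version is the release to avoid."""
    return release_tuple(version) == BROKEN_RELEASE
```

A staging script might call is_affected(apache_beam.__version__) before building the pipeline and exit early with a clear message. In a requirements file, the two ranges can also be written as the single exclusion specifier apache-beam[gcp]!=2.18.0.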
