添加 ReadAllFromText 转换时管道失败 [英] Pipeline fails when addng ReadAllFromText transform

查看:29
本文介绍了添加 ReadAllFromText 转换时管道失败的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在 Apache Beam 中运行一个非常简单的程序来测试它的工作原理.

I am trying to run a very simple program in Apache Beam to try out how it works.

import apache_beam as beam


class Split(beam.DoFn):
    def process(self, element):
        return element


with beam.Pipeline() as p:
    rows = (p | beam.io.ReadAllFromText(
        "input.csv") | beam.ParDo(Split()))

运行时出现以下错误

.... some more stack....
 File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/util.py", line 565, in expand
    windowing_saved = pcoll.windowing
  File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/pvalue.py", line 137, in windowing
    self.producer.inputs)
  File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 464, in get_windowing
    return inputs[0].windowing
  File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/pvalue.py", line 137, in windowing
    self.producer.inputs)
  File "/home/raheel/code/beam-practice/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 464, in get_windowing
    return inputs[0].windowing
AttributeError: 'PBegin' object has no attribute 'windowing'

知道这里有什么问题吗?

Any Idea what is wrong here ?

谢谢

推荐答案

ReadAllFromText 期望从文件的 PCollection 中读取,而不是将其作为参数传递.因此,就您而言,它应该是:

ReadAllFromText expects to read from a PCollection of files instead of passing it as an argument. So, in your case, it should be:

p | beam.Create(["input.csv"])
  | beam.io.ReadAllFromText()

这篇关于添加 ReadAllFromText 转换时管道失败的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆