如何在Apache Beam中使用ParDo和DoFn创建读取转换 [英] How to create read transform using ParDo and DoFn in Apache Beam
问题描述
根据 Apache Beam文档推荐方式
要编写简单的源代码,请使用读取转换"和ParDo
.不幸的是,Apache Beam文档让我失望了.
According to the Apache Beam documentation the recommended way
to write simple sources is by using Read Transforms and ParDo
. Unfortunately the Apache Beam docs has let me down here.
我正在尝试编写一个简单的无限制数据源,该数据源使用ParDo
发出事件,但是编译器一直在抱怨DoFn
对象的输入类型:
I'm trying to write a simple unbounded data source which emits events using a ParDo
but the compiler keeps complaining about the input type of the DoFn
object:
message: 'The method apply(PTransform<? super PBegin,OutputT>) in the type PBegin is not applicable for the arguments (ParDo.SingleOutput<PBegin,Event>)'
我的尝试:
public class TestIO extends PTransform<PBegin, PCollection<Event>> {
@Override
public PCollection<Event> expand(PBegin input) {
return input.apply(ParDo.of(new ReadFn()));
}
private static class ReadFn extends DoFn<PBegin, Event> {
@ProcessElement
public void process(@TimerId("poll") Timer pollTimer) {
Event testEvent = new Event(...);
//custom logic, this can happen infinitely
for(...) {
context.output(testEvent);
}
}
}
}
推荐答案
DoFn
执行逐元素处理.按照书面规定,ParDo.of(new ReadFn())
将具有类型PTransform<PCollection<PBegin>, PCollection<Event>>
.具体来说,ReadFn
表示它接受类型为PBegin
的 元素,并返回0个或多个类型为Event
的元素.
A DoFn
performs element-wise processing. As written, ParDo.of(new ReadFn())
will have type PTransform<PCollection<PBegin>, PCollection<Event>>
. Specifically, the ReadFn
indicates it takes an element of type PBegin
and returns 0 or more elements of type Event
.
相反,您应该使用实际的Read
操作.有提供了各种.如果要使用一组特定的内存中集合,也可以使用Create
.
Instead, you should use an actual Read
operation. There are a variety provided. You can also use Create
if you have a specific set of in-memory collections to use.
If you need to create a custom source you should use the Read transform. Since you're using timers, you likely want to create an Unbounded Source (a stream of elements).
这篇关于如何在Apache Beam中使用ParDo和DoFn创建读取转换的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!