Flink: how to store state and use in another stream?

Views: 20 | Published: 2021/11/12 1:03:45 | Tags: apache-flink, flink-streaming

Problem description

I have a use-case for Flink where I need to read information from a file, store each line, and then use this state to filter another stream.

I have all of this working right now with the connect operator and a RichCoFlatMapFunction, but it feels overly complicated. Also, I'm concerned that flatMap2 could begin executing before all of the state is loaded from the file:

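import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

// fileStream carries part IDs (String) read from the file;
// partRecordStream carries the PartRecord events to be filtered.
fileStream
    .connect(partRecordStream)
    // Key both inputs by part ID so that each key has its own state.
    // Keying the connected streams here makes a separate keyBy on
    // partRecordStream before connect() unnecessary.
    .keyBy((KeySelector<String, String>) partId -> partId,
           (KeySelector<PartRecord, String>) partRecord -> partRecord.getPartId())
    .flatMap(new RichCoFlatMapFunction<String, PartRecord, PartRecord>() {
        // Keyed state: the part ID loaded from the file for the current key.
        private transient ValueState<String> storedPartId;

        @Override
        public void open(Configuration parameters) throws Exception {
            ValueStateDescriptor<String> descriptor =
                    new ValueStateDescriptor<>(
                            "partId", // the state name
                            TypeInformation.of(new TypeHint<String>() {}),
                            null);    // default value: nothing stored yet
            storedPartId = getRuntimeContext().getState(descriptor);
        }

        @Override
        public void flatMap1(String partId, Collector<PartRecord> out) throws Exception {
            // File input: store the part ID and emit nothing.
            storedPartId.update(partId);
        }

        @Override
        public void flatMap2(PartRecord record, Collector<PartRecord> out) throws Exception {
            // Record input: forward only records whose part ID is already stored.
            // Records that arrive before their part ID is stored are dropped.
            if (record.getPartId().equals(storedPartId.value())) {
                out.collect(record);
            }
        }
    });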
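
To make the ordering concern concrete, here is a minimal plain-Java sketch that mimics the flatMap1/flatMap2 logic above without using Flink at all. The class OrderingSketch, its methods onFileLine and onPartRecord, and the use of a HashMap in place of keyed ValueState are all hypothetical stand-ins for illustration, and records are reduced to their part ID. It only shows what the filter does when a record reaches it before the matching file line; it says nothing about how Flink actually schedules the two inputs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for the keyed two-input filter; this is not Flink API. */
public class OrderingSketch {
    // Stand-in for keyed ValueState<String>: one stored part ID per key.
    private final Map<String, String> storedPartId = new HashMap<>();
    // Stand-in for the Collector: part IDs of the records that were forwarded.
    private final List<String> emitted = new ArrayList<>();

    /** Mirrors flatMap1: remember a part ID read from the file. */
    public void onFileLine(String partId) {
        storedPartId.put(partId, partId);
    }

    /** Mirrors flatMap2: forward a record only if its part ID is already stored. */
    public void onPartRecord(String partId) {
        if (partId.equals(storedPartId.get(partId))) {
            emitted.add(partId);
        }
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        // Case 1: the file line is processed before the record.
        OrderingSketch inOrder = new OrderingSketch();
        inOrder.onFileLine("A");
        inOrder.onPartRecord("A");
        System.out.println("file line first: " + inOrder.emitted()); // [A]

        // Case 2: the record is processed before the file line and is dropped.
        OrderingSketch raced = new OrderingSketch();
        raced.onPartRecord("A");
        raced.onFileLine("A");
        System.out.println("record first: " + raced.emitted()); // []
    }
}
```

In the second case the record is lost silently, which is the behaviour the question is worried about if the second input starts delivering elements before the file has been fully read.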

Is there a better way (as of Flink 1.1.3) to accomplish this pattern of loading state, then using it in subsequent streams?
