FLINK: How to read from multiple Kafka clusters using the same StreamExecutionEnvironment


Problem description

I want to read data from multiple Kafka clusters in FLINK.

But the result is that the kafkaMessageStream reads only from the first Kafka cluster.

I am able to read from both Kafka clusters only if I have two separate streams, one for each cluster, which is not what I want.

Is it possible to have multiple sources attached to a single reader?

Sample code:

public class KafkaReader<T> implements Reader<T>{

private StreamExecutionEnvironment executionEnvironment ;

public StreamExecutionEnvironment getExecutionEnvironment(Properties properties){
    executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
    executionEnvironment.setRestartStrategy( RestartStrategies.fixedDelayRestart(3, 1500));

    executionEnvironment.enableCheckpointing(
            Integer.parseInt(properties.getProperty(Constants.SSE_CHECKPOINT_INTERVAL,"5000")), CheckpointingMode.EXACTLY_ONCE);
    executionEnvironment.getCheckpointConfig().setCheckpointTimeout(60000);
    //executionEnvironment.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
    //try {
    //  executionEnvironment.setStateBackend(new FsStateBackend(new Path(Constants.SSE_CHECKPOINT_PATH)));
        // The RocksDBStateBackend or The FsStateBackend
    //} catch (IOException e) {
        // LOGGER.error("Exception during initialization of stateBackend in execution environment"+e.getMessage());
    //}

    return executionEnvironment;
}
public DataStream<T> readFromMultiKafka(Properties properties_k1, Properties properties_k2 ,DeserializationSchema<T> deserializationSchema) {


    DataStream<T> kafkaMessageStream = executionEnvironment.addSource(new FlinkKafkaConsumer08<T>( 
            properties_k1.getProperty(Constants.TOPIC),deserializationSchema, 
            properties_k1));
    executionEnvironment.addSource(new FlinkKafkaConsumer08<T>( 
            properties_k2.getProperty(Constants.TOPIC),deserializationSchema, 
            properties_k2));

    return kafkaMessageStream;
}


public DataStream<T> readFromKafka(Properties properties,DeserializationSchema<T> deserializationSchema) {


    DataStream<T> kafkaMessageStream = executionEnvironment.addSource(new FlinkKafkaConsumer08<T>( 
            properties.getProperty(Constants.TOPIC),deserializationSchema, 
            properties));

    return kafkaMessageStream;
}

}

My call:

 public static void main( String[] args ) throws Exception
{
    Properties pk1 = new Properties();
    pk1.setProperty(Constants.TOPIC, "flink_test");
    pk1.setProperty("zookeeper.connect", "localhost:2181");
    pk1.setProperty("group.id", "1");
    pk1.setProperty("bootstrap.servers", "localhost:9092");
    Properties pk2 = new Properties();
    pk2.setProperty(Constants.TOPIC, "flink_test");
    pk2.setProperty("zookeeper.connect", "localhost:2182");
    pk2.setProperty("group.id", "1");
    pk2.setProperty("bootstrap.servers", "localhost:9093");


    Reader<String> reader = new KafkaReader<String>();
    // Does not work:

    StreamExecutionEnvironment environment = reader.getExecutionEnvironment(pk1);
    DataStream<String> dataStream1 = reader.readFromMultiKafka(pk1, pk2, new SimpleStringSchema());
    DataStream<ImpressionObject> transform = new TsvTransformer().transform(dataStream1);

    transform.print();      


  //Works:

    StreamExecutionEnvironment environment = reader.getExecutionEnvironment(pk1);
    DataStream<String> dataStream1 = reader.readFromKafka(pk1, new SimpleStringSchema());
    DataStream<String> dataStream2 = reader.readFromKafka(pk2, new SimpleStringSchema());

    DataStream<Tuple2<String, Integer>> transform1 = dataStream1.flatMap(new LineSplitter()).keyBy(0)
    .timeWindow(Time.seconds(5)).sum(1).setParallelism(5);
    DataStream<Tuple2<String, Integer>> transform2 = dataStream2.flatMap(new LineSplitter()).keyBy(0)
            .timeWindow(Time.seconds(5)).sum(1).setParallelism(5);


    transform1.print();     
    transform2.print();     

    environment.execute("Kafka Reader");
}

Recommended answer

To resolve the issue, I would recommend you to create separate instances of the FlinkKafkaConsumer for each cluster (that's what you are already doing), and then union the resulting streams:

StreamExecutionEnvironment environment = reader.getExecutionEnvironment(pk1);
DataStream<String> dataStream1 = reader.readFromKafka(pk1, new SimpleStringSchema());
DataStream<String> dataStream2 = reader.readFromKafka(pk2, new SimpleStringSchema());
DataStream<String> finalStream = dataStream1.union(dataStream2);
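
If you still want the single readFromMultiKafka entry point from the question, a minimal sketch under the same assumptions (the Constants.TOPIC key and the FlinkKafkaConsumer08 connector already used above) would be to union inside the reader instead of returning only the first stream:

public DataStream<T> readFromMultiKafka(Properties properties_k1, Properties properties_k2,
        DeserializationSchema<T> deserializationSchema) {

    // One consumer per cluster, each configured with its own bootstrap.servers / zookeeper.connect
    DataStream<T> stream1 = executionEnvironment.addSource(new FlinkKafkaConsumer08<T>(
            properties_k1.getProperty(Constants.TOPIC), deserializationSchema, properties_k1));
    DataStream<T> stream2 = executionEnvironment.addSource(new FlinkKafkaConsumer08<T>(
            properties_k2.getProperty(Constants.TOPIC), deserializationSchema, properties_k2));

    // union merges both sources into one DataStream of the same type, so every
    // downstream operator (the transformer, print(), etc.) sees records from both clusters
    return stream1.union(stream2);
}

The original readFromMultiKafka added the second source but never used the stream it returned, so no downstream operator was ever attached to the second consumer; union (or connect, if the two streams had different types) is what actually wires both sources into the same pipeline.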
