Spring-Kafka Concurrency Property


Problem Description


I am writing my first Kafka consumer using Spring-Kafka. I had a look at the different options provided by the framework and have a few doubts about them. Can someone please clarify the points below if you have already worked with it?

Question - 1 : As per the Spring-Kafka documentation, there are two ways to implement a Kafka consumer: "You can receive messages by configuring a MessageListenerContainer and providing a message listener or by using the @KafkaListener annotation". Can someone tell me when I should choose one option over the other?

Question - 2 : I have chosen the KafkaListener approach for writing my application. For this I need to initialize a container factory instance, and inside the container factory there is an option to control concurrency. I just want to double-check that my understanding of concurrency is correct.

Suppose I have a topic named MyTopic which has 4 partitions. To consume messages from MyTopic, I've started 2 instances of my application, and these instances are started with concurrency set to 2. So, ideally, as per the Kafka assignment strategy, 2 partitions should go to consumer1 and the other 2 partitions should go to consumer2. Since concurrency is set to 2, will each consumer start 2 threads and consume data from the topic in parallel? Also, is there anything we should consider when consuming in parallel?

Question - 3 : I have chosen manual ack mode and am not managing the offsets externally (not persisting them to any database/filesystem). So do I need to write custom code to handle a rebalance, or will the framework manage it automatically? I think not, as I am acknowledging only after processing all the records.

Question - 4 : Also, with manual ACK mode, which listener will give better performance: the BATCH message listener or the normal message listener? I guess that if I use the normal message listener, the offsets will be committed after processing each message.

Pasted the code below for your reference.

Batch Acknowledgement Consumer:

    public class BatchAckConsumer implements BatchAcknowledgingConsumerAwareMessageListener<String, String> {

        @Override
        public void onMessage(List<ConsumerRecord<String, String>> records, Acknowledgment acknowledgment,
                Consumer<?, ?> consumer) {
            for (ConsumerRecord<String, String> record : records) {
                System.out.println("Record : " + record.value());
                // Process the message here..
                listener.addOffset(record.topic(), record.partition(), record.offset());
            }
            acknowledgment.acknowledge();
        }
    }

Initialising container factory:

@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<String, String>(consumerConfigs());
}

@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> configs = new HashMap<String, Object>();
    configs.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootStrapServer);
    configs.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
    configs.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, enableAutoCommit);
    configs.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, maxPolInterval);
    configs.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetReset);
    configs.put(ConsumerConfig.CLIENT_ID_CONFIG, clientId);
    configs.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    configs.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return configs;
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
    // Not sure about the impact of this property
    factory.setConcurrency(2);
    factory.setBatchListener(true);
    factory.getContainerProperties().setAckMode(AckMode.MANUAL);
    factory.getContainerProperties().setConsumerRebalanceListener(RebalanceListener.getInstance());
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setMessageListener(new BatchAckConsumer());
    return factory;
}

Solution

  1. @KafkaListener is a message-driven "POJO"; it adds things like payload conversion, argument matching, etc. If you implement MessageListener, you can only get the raw ConsumerRecord from Kafka. See @KafkaListener Annotation.
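To make the contrast concrete, here is a minimal sketch of both styles. The class names are hypothetical, and it assumes a Spring application context with a configured listener container; the topic name MyTopic is taken from the question:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.listener.MessageListener;

// Style 1: a @KafkaListener method on a POJO; the framework converts the
// payload and matches the method arguments for you.
class AnnotatedConsumer {

    @KafkaListener(topics = "MyTopic")
    public void listen(String value) { // payload already converted to String
        System.out.println("Received: " + value);
    }
}

// Style 2: a raw MessageListener wired into a MessageListenerContainer;
// you receive the unconverted ConsumerRecord.
class RawConsumer implements MessageListener<String, String> {

    @Override
    public void onMessage(ConsumerRecord<String, String> record) {
        System.out.println("Received: " + record.value());
    }
}
```

The annotation style is the usual choice for application code; the raw listener is useful when you need full control over the container wiring, as in the question's factory configuration.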

  2. Yes, the concurrency represents the number of threads; each thread creates a Consumer; they run in parallel; in your example, each would get 2 partitions.

> Also should we consider anything if we are consuming in parallel.

Your listener must be thread-safe (no shared state, or any such state must be protected by locks).
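The shared-state point can be shown without Kafka at all. This sketch (plain Java, hypothetical names) simulates two container threads delivering records to one listener instance, as happens inside a concurrent container; a plain `long` counter incremented with `++` could lose updates, so the shared state is an AtomicLong:

```java
import java.util.concurrent.atomic.AtomicLong;

public class ThreadSafeListenerDemo {

    // One listener instance is shared by all of the container's consumer
    // threads, which is exactly the situation with concurrency > 1.
    static class CountingListener {
        private final AtomicLong processed = new AtomicLong(); // thread-safe shared state

        void onMessage(String value) {
            // a plain `processed++` on a long field could lose updates here
            processed.incrementAndGet();
        }
    }

    // Simulate `threads` container threads each delivering `perThread` records.
    static long run(int threads, int perThread) throws InterruptedException {
        CountingListener listener = new CountingListener();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    listener.onMessage("record-" + i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return listener.processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(2, 100_000)); // prints 200000: no updates lost
    }
}
```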

  3. It's not clear what you mean by "handle rebalance events". When a rebalance occurs, the framework will commit any pending offsets.

  4. It doesn't make a difference; message listener vs. batch listener is just a preference. Even with a message listener, with MANUAL ack mode, the offsets are committed when all the results from the poll have been processed. With MANUAL_IMMEDIATE mode, the offsets are committed one-by-one.
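For comparison with the batch listener in the question, the per-record equivalent is an AcknowledgingMessageListener. A sketch (class name hypothetical); with AckMode.MANUAL the acknowledge() calls are collected and committed once the whole poll has been processed, while MANUAL_IMMEDIATE commits each one right away:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.listener.AcknowledgingMessageListener;
import org.springframework.kafka.support.Acknowledgment;

// Per-record listener; pair it with factory.setBatchListener(false)
// (the default) and AckMode.MANUAL or MANUAL_IMMEDIATE.
public class RecordAckConsumer implements AcknowledgingMessageListener<String, String> {

    @Override
    public void onMessage(ConsumerRecord<String, String> record, Acknowledgment acknowledgment) {
        System.out.println("Record : " + record.value());
        // Process the message here..
        acknowledgment.acknowledge(); // MANUAL: queued until the poll is done; MANUAL_IMMEDIATE: committed now
    }
}
```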
