Kafka Streams error - Offset commit failed on partition, request timed out


Problem Description


We use Kafka Streams for consuming, processing, and producing messages, and on our PROD environment we ran into errors on multiple topics:

ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=app-xxx-StreamThread-3-consumer, groupId=app] 
Offset commit failed on partition xxx-1 at offset 13920: 
The request timed out.[]

These errors occur rarely for topics with a small load, but for topics with a high load (and spikes) they occur dozens of times a day per topic. The topics have multiple partitions (e.g. 10). The issue does not seem to affect data processing (aside from performance), because after the exception is thrown (there may even be multiple errors for the same offset), the consumer later re-reads the message and processes it successfully.

I see that this error message was introduced in kafka-clients version 1.0.0 by a PR; in previous kafka-clients versions the same case (Errors.REQUEST_TIMED_OUT on the consumer) produced a similar message (Offset commit for group {} failed: {}) logged at debug level. In my opinion, it would be more logical to log at warning level for this case.

How to fix this issue? What could be the root cause? Maybe changing consumer properties or partition setup could help to get rid of such issue.

We use the following implementation for creating Kafka Streams:

StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> stream = builder.<String, String>stream(topicName);
stream.foreach((key, value) -> processMessage(key, value));
Topology topology = builder.build();
StreamsConfig streamsConfig = new StreamsConfig(consumerSettings);
new KafkaStreams(topology, streamsConfig);

Our Kafka consumer settings:

bootstrap.servers: xxx1:9092,xxx2:9092,...,xxx5:9092
application.id: app
state.dir: /tmp/kafka-streams/xxx
commit.interval.ms: 5000       # also I tried default value 30000
key.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
value.serde: org.apache.kafka.common.serialization.Serdes$StringSerde
timestamp.extractor: org.apache.kafka.streams.processor.WallclockTimestampExtractor
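For reference, the settings above can be assembled into a java.util.Properties object to pass to StreamsConfig. A minimal sketch (keys as plain strings; note that since kafka-streams 1.0 the keys key.serde, value.serde, and timestamp.extractor are deprecated in favor of the default.* variants used below; broker addresses are placeholders from the listing above):

```java
import java.util.Properties;

public class StreamsSettings {
    // Builds the configuration from the listing above as a Properties object.
    public static Properties build() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "xxx1:9092,xxx2:9092");  // full broker list goes here
        props.put("application.id", "app");
        props.put("state.dir", "/tmp/kafka-streams/xxx");
        props.put("commit.interval.ms", "5000");
        // Renamed from key.serde / value.serde / timestamp.extractor in kafka-streams 1.0
        props.put("default.key.serde",
                  "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("default.value.serde",
                  "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("default.timestamp.extractor",
                  "org.apache.kafka.streams.processor.WallclockTimestampExtractor");
        return props;
    }
}
```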

Kafka broker version: kafka_2.11-0.11.0.2. The error occurs on both versions of Kafka Streams: 1.0.1 and 1.1.0.

Solution

It looks like there is an issue with your Kafka cluster, and the Kafka consumer times out while trying to commit offsets. You can try increasing the connection-related configs of the Kafka consumer:

  1. request.timeout.ms (by default 305000 ms)

This configuration controls the maximum amount of time the client will wait for the response of a request.

  2. connections.max.idle.ms (by default 540000 ms)

Close idle connections after the number of milliseconds specified by this config.
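The two timeouts above can be raised in the same Properties object passed to StreamsConfig. A minimal sketch (the values 600000 and 900000 are illustrative, not recommendations from the answer):

```java
import java.util.Properties;

public class TimeoutTuning {
    // Raises connection-related timeouts in an existing Streams config.
    // The values below are illustrative; tune them for your cluster.
    public static Properties withLongerTimeouts(Properties props) {
        // Max time the client waits for a response to a request (default 305000 ms)
        props.put("request.timeout.ms", "600000");
        // Close idle connections after this many milliseconds (default 540000 ms)
        props.put("connections.max.idle.ms", "900000");
        return props;
    }
}
```

Since kafka-streams 1.0 these can also be scoped to the embedded consumer only by prefixing the key with "consumer." (e.g. via StreamsConfig.consumerPrefix), so the producer keeps its own defaults.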
