Consumers and producers failing with error: "Connection to 0 was disconnected before the response was read"

Problem description

I have a cluster of 3 Kafka brokers, with a replication factor of 3 for all topics. For the last few days, consumers and producers have suddenly been getting stuck waiting for responses a few times a day, even though Kafka is running on all 3 servers. The problem persists until I check the broker logs ("Connection to 0 was disconnected before the response was read"), identify the culprit node (0, the first node, in this case), and restart ZooKeeper and the broker on that node.

According to the logs, this happens because of rebalancing.

I reduced min.insync.replicas to 2, but that did not help.

Server logs from broker 0 (the first node), which caused the problem in this case:

Member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Preparing to rebalance group banner in state PreparingRebalance with old generation 2570 (__consumer_offsets-5) (reason: removing member consumer-3-8e370c0e-4a21-4dec-8301-18ce6aaf71d9 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
  Member consumer-4-da57dad3-6825-4a6d-ac93-82a29f72a3dc in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-2-812b613b-3409-42e7-baf8-8b32df4e2fa4 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-2-d03f0417-4e0f-4ab0-90c6-12b17a6354d7 in group poster has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Preparing to rebalance group poster in state PreparingRebalance with old generation 191 (__consumer_offsets-9) (reason: removing member consumer-2-d03f0417-4e0f-4ab0-90c6-12b17a6354d7 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
  Group poster with generation 192 is now empty (__consumer_offsets-9) (kafka.coordinator.group.GroupCoordinator)
  Member rdkafka-fda5cec6-e121-4ab7-9650-83d391abc82d in group notification-test has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Stabilized group notification-test generation 436 (__consumer_offsets-20) (kafka.coordinator.group.GroupCoordinator)
  Member consumer-5-eeb1b721-b52b-4b65-af70-e48a345d150f in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-4-889659e2-3c2f-4059-bf0c-45796f824443 in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-5-95a38999-6156-4a53-ac1f-1d51703956fd in group banner has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Group banner with generation 2571 is now empty (__consumer_offsets-5) (kafka.coordinator.group.GroupCoordinator)
  Failed to write empty metadata for group poster: The group is rebalancing, so a rejoin is needed. (kafka.coordinator.group.GroupCoordinator)
  Failed to write empty metadata for group banner: The group is rebalancing, so a rejoin is needed. (kafka.coordinator.group.GroupCoordinator)
  Member consumer-5-e7a6af24-1f50-40cc-a593-cf8614e9d088 in group redemption has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Preparing to rebalance group redemption in state PreparingRebalance with old generation 1373 (__consumer_offsets-27) (reason: removing member consumer-5-e7a6af24-1f50-40cc-a593-cf8614e9d088 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
  Member consumer-1-d89defb1-6637-48bc-ba16-b646c32d3849 in group redemption has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-4-c942542b-7c54-4656-a485-410278b936ec in group redemption has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
  Member consumer-3-adfb6536-2fbc-4b39-9368-56b665db2c75 in group redemption has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2019-05-19 13:22
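
To see at a glance how many members each group lost, the GroupCoordinator messages above can be tallied with a short script. This is only a minimal sketch: the function name `failed_members_per_group` is made up for illustration, the regular expression is based solely on the message wording shown in this log, and the member IDs in the sample are shortened placeholders.

```python
import re
from collections import Counter

# Matches the GroupCoordinator message format seen in the log above.
# Assumption: other Kafka versions may word this message differently.
FAILED_MEMBER = re.compile(r"Member (\S+) in group (\S+) has failed")


def failed_members_per_group(lines):
    """Count how many members each consumer group lost (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FAILED_MEMBER.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# Shortened placeholder lines in the same format as the log above.
sample = [
    "Member consumer-3-aaaa in group banner has failed, removing it from the group",
    "Preparing to rebalance group banner in state PreparingRebalance",
    "Member consumer-4-bbbb in group banner has failed, removing it from the group",
    "Member consumer-2-cccc in group poster has failed, removing it from the group",
]

for group, lost in failed_members_per_group(sample).items():
    print(group, lost)
```

Feeding it the real server.log lines (for example via `open(path)`) should show which groups are being emptied by heartbeat expiration.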

Logs on the other brokers:

java.io.IOException: Connection to 0 was disconnected before the response was read
        at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)
        at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:97)
        at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:190)
        at kafka.server.AbstractFetcherThread.kafka$server$AbstractFetcherThread$$processFetchRequest(AbstractFetcherThread.scala:241)
        at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:130)
        at kafka.server.AbstractFetcherThread$$anonfun$maybeFetch$1.apply(AbstractFetcherThread.scala:129)
        at scala.Option.foreach(Option.scala:257)
        at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:129)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:111)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
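
Finding the culprit node by hand means reading each broker's log for this exception. A small script can do the counting instead. The following is a rough sketch, assuming the exception message always has the form shown above; `disconnect_counts` and `most_suspect_broker` are illustrative names, not part of Kafka.

```python
import re
from collections import Counter

# Matches the IOException message from the replica fetcher shown above.
DISCONNECTED = re.compile(
    r"Connection to (\d+) was disconnected before the response was read"
)


def disconnect_counts(lines):
    """Count disconnect errors per target broker ID (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = DISCONNECTED.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def most_suspect_broker(lines):
    """Return the broker ID named most often, or None if no line matches."""
    counts = disconnect_counts(lines)
    return counts.most_common(1)[0][0] if counts else None


sample = [
    "java.io.IOException: Connection to 0 was disconnected before the response was read",
    "        at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:97)",
    "java.io.IOException: Connection to 0 was disconnected before the response was read",
]

print(most_suspect_broker(sample))
```

Run over the logs of the healthy brokers, the ID that dominates the count is the node the others cannot reach.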

Consumer service logs:

|FAIL|rdkafka#consumer-2| [thrd:m-data-kaf006.c-14cba.internal:9092/2]: m-data-kaf006.c.internal:9092/2: 3 request(s) timed out: disconnect

I have not been able to find a solution.

Kafka version: 2.1.0

Recommended answer

This is caused by a bug in older Kafka versions, including the 2.1.0 you are running. Upgrade to Kafka 2.1.1 or 2.2.0, both of which include the fix. The underlying JIRA issue is KAFKA-7697: https://issues.apache.org/jira/browse/KAFKA-7697
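
If you manage several clusters, a quick check can tell you which ones still need the upgrade. This is a minimal sketch under the assumption, taken from the answer above, that 2.1.1 is the first release carrying the fix and that every later release also includes it. The helper names are hypothetical, and the check says nothing about whether releases older than 2.1.0 are affected.

```python
# Assumption from the answer above: the fix first shipped in 2.1.1 (and 2.2.0).
FIRST_FIXED = (2, 1, 1)


def parse_version(version):
    """Turn a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def includes_fix(version):
    """True if the version is at or after the first fixed release."""
    return parse_version(version) >= FIRST_FIXED


for candidate in ("2.1.0", "2.1.1", "2.2.0"):
    print(candidate, includes_fix(candidate))
```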
