Kafka consumer startup delay with confluent-dotnet

Problem description

When starting a confluent-dotnet consumer, after the call to Subscribe and the subsequent polling, it takes a very long time (about 10-15 seconds) to receive the "Partition assigned" event from the server, and therefore to receive any messages.

At first I thought the delay came from automatic topic creation, but the time is the same whether or not the consumer's topic and consumer group already exist.

I start my consumer with this config; the rest of the code is the same as in the Confluent advanced consumer example:

    var kafkaConfig = new Dictionary<string, object>
    {
        {"group.id", config.ConsumerGroup},
        {"statistics.interval.ms", 60000},
        {"fetch.wait.max.ms", 10},
        {"bootstrap.servers", config.BrokerList},
        {"enable.auto.commit", config.AutoCommit},
        {"socket.blocking.max.ms", 1},
        {"fetch.error.backoff.ms", 1},
        {"socket.nagle.disable", true},
        {"auto.commit.interval.ms", 5000},
        {
            "default.topic.config", new Dictionary<string, object>
            {
                {"auto.offset.reset", "smallest"}
            }
        }
    };

The Kafka cluster consists of 3 low-to-mid spec machines in a remote datacenter, running with default settings. Is there a broker or client setting that can be tweaked to lower this startup time?

Assigning partitions myself with Assign instead of Subscribe brings the startup time down to around 2 seconds.

Recommended answer

Kafka consumers work in groups by design. The delay you see is the group coordinator (which resides on the cluster, not on the client side) waiting for any existing or previous sessions to time out, and allowing any additional consumers in the same group to start, before allocating partitions to all the consumers with an active connection.

In fact, if you restart your test consumer quickly enough, you'll see that delay jump to almost 30 seconds, because session.timeout.ms has a default value of 30000 and the cluster doesn't "notice" that the previous consumer has gone until this timeout kicks in. Also, if you change group.id between restarts, you'll see the delay drop drastically, because the cluster won't wait on existing consumers that belong to a different group.

Finally, try exiting your consumer cleanly before firing it up again (call Unsubscribe() and make sure the Consumer is disposed).
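
A clean exit along those lines might look like the sketch below. It assumes the older dictionary-configured Confluent.Kafka API (roughly the 0.11.x releases that the config in the question implies); the class name, the topic name "my-topic", and the cancellation wiring are placeholders for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using Confluent.Kafka;
using Confluent.Kafka.Serialization;

public static class CleanShutdownSketch
{
    // kafkaConfig is assumed to be the dictionary shown in the question.
    public static void Run(Dictionary<string, object> kafkaConfig, CancellationToken stop)
    {
        // The using block guarantees Dispose() runs even if the loop throws.
        using (var consumer = new Consumer<Null, string>(
            kafkaConfig, null, new StringDeserializer(Encoding.UTF8)))
        {
            // Same assign/unassign handlers as in the advanced consumer example.
            consumer.OnPartitionsAssigned += (_, partitions) => consumer.Assign(partitions);
            consumer.OnPartitionsRevoked += (_, partitions) => consumer.Unassign();
            consumer.OnMessage += (_, msg) => Console.WriteLine(msg.Value);

            consumer.Subscribe(new[] { "my-topic" }); // placeholder topic name

            while (!stop.IsCancellationRequested)
            {
                consumer.Poll(TimeSpan.FromMilliseconds(100));
            }

            // Drop the subscription explicitly before the consumer is disposed,
            // rather than leaving the session to expire on the broker.
            consumer.Unsubscribe();
        }
    }
}
```

One way to drive this is to cancel the token from a Console.CancelKeyPress handler, so that Ctrl+C ends the poll loop instead of killing the process mid-session.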

It appears that session.timeout.ms can be lowered to 6000 to shorten the timeout of any existing consumer group connection, but no lower.
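
As a config fragment, that change is a single extra entry in the consumer's dictionary. The group and broker values below are placeholders. The 6000 floor is probably the broker's group.min.session.timeout.ms setting, whose default is 6000, though that link is an inference rather than something tested here.

```csharp
using System.Collections.Generic;

var kafkaConfig = new Dictionary<string, object>
{
    {"group.id", "my-consumer-group"},      // placeholder
    {"bootstrap.servers", "broker1:9092"},  // placeholder
    // Lowest value the answer found to be accepted; the default is 30000.
    {"session.timeout.ms", 6000},
};
```

The trade-off is that a short session timeout also makes the group quicker to evict a consumer that merely stalls for a few seconds, which triggers a rebalance.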

Even with everything starting "clean", it appears you'll still get a delay of up to 7 seconds (I'm guessing standard connection setup plus waiting for any other consumers in the same group to start). If you use Assign() instead of Subscribe(), you are choosing to assign the partitions to your consumer(s) yourself, and automatic group balancing doesn't apply.
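
For comparison, a manual-assignment version might look like the following sketch. It again assumes the older dictionary-configured Confluent.Kafka API. The class name, the topic "my-topic", partition 0, and the choice to start from the beginning of the partition are illustrative assumptions only.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using Confluent.Kafka;
using Confluent.Kafka.Serialization;

public static class ManualAssignSketch
{
    // kafkaConfig is assumed to be the dictionary shown in the question.
    public static void Run(Dictionary<string, object> kafkaConfig, CancellationToken stop)
    {
        using (var consumer = new Consumer<Null, string>(
            kafkaConfig, null, new StringDeserializer(Encoding.UTF8)))
        {
            consumer.OnMessage += (_, msg) =>
                Console.WriteLine($"{msg.TopicPartitionOffset}: {msg.Value}");

            // No Subscribe() call: the partition is picked explicitly, so the
            // consumer does not wait for the group coordinator to hand it out.
            consumer.Assign(new List<TopicPartitionOffset>
            {
                new TopicPartitionOffset("my-topic", 0, Offset.Beginning), // placeholders
            });

            while (!stop.IsCancellationRequested)
            {
                consumer.Poll(TimeSpan.FromMilliseconds(100));
            }
        }
    }
}
```

The cost of this approach is that nothing redistributes partitions when instances come and go, so it is up to the application to make sure two instances do not read the same partition.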
