How to scale max.incremental.fetch.session.cache.slots


Problem description

I'm running a somewhat large Kafka cluster, but currently I'm stuck on properly setting max.incremental.fetch.session.cache.slots and need some guidance. The documentation about this is not clear either: https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability

By scale I mean: 3 nodes, ~400 topics, 4,500 partitions, 300 consumer groups, 500 consumers.

For a while now I've been seeing FETCH_SESSION_ID_NOT_FOUND errors in the logs and wanted to address them.

So I tried increasing the value in the config and restarted all brokers, and the pool quickly filled up again to its maximum capacity. This reduced the occurrence of the errors, but they are not completely gone. At first I set the value to 2000 and it was instantly full. I then raised it in several steps up to 100,000, and the pool was filled within ~40 minutes.
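For reference, the change itself is just the line below in each broker's server.properties, applied with a restart (the value shown is simply the last one I tried):

    # broker config on every node; 100000 was the last value tried
    max.incremental.fetch.session.cache.slots=100000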

From the documentation I was expecting the pool to cap out after 2 minutes, when min.incremental.fetch.session.eviction.ms kicks in, but this does not seem to be the case.

What would be the metrics to gauge the appropriate size of the cache? Are the errors I'm still seeing something I can fix on the brokers, or do I need to hunt down misconfigured consumers? If so, what do I need to look out for?

Solution

Such a high usage of Fetch Sessions is most likely caused by a bad client.
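To see where that usage comes from on the broker side, KIP-227 also describes gauges under the FetchSessionCache MBean (NumIncrementalFetchSessions, NumIncrementalFetchPartitionsCached, plus an IncrementalFetchSessionEvictionsPerSec meter; the exact names are worth verifying against your broker version). A rough way to watch one of them, assuming JMX is exposed on port 9999, is something like:

    # sketch only: adjust the JMX port/URL to your deployment
    bin/kafka-run-class.sh kafka.tools.JmxTool \
      --object-name 'kafka.server:type=FetchSessionCache,name=NumIncrementalFetchSessions' \
      --jmx-url service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi

If the gauge sits at the configured slot limit while the eviction rate stays high, sessions are being churned faster than they can be reused, which is consistent with a misbehaving client as described below.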

Sarama, a Golang client, had an issue that caused a new Fetch Session to be allocated on every Fetch request in versions 1.26.0 through 1.26.2; see https://github.com/Shopify/sarama/pull/1644.

I'd recommend checking whether you have users running this client and ensuring they update to the latest release.
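If it helps with tracking that down, here is a rough sketch (assuming the consumer service uses Go modules; exact commands depend on your build setup) for checking which Sarama version it actually pulls in and moving it past the affected range:

    # print the Sarama version this module resolves to
    go list -m github.com/Shopify/sarama
    # bump to the latest release (per the range above, anything newer than 1.26.2)
    go get github.com/Shopify/sarama@latest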

