Connect consumer jobs are getting deleted when restarting the cluster

Problem description

I am facing the issue below after changing some properties related to Kafka and restarting the cluster.

In the Kafka consumer, there were 5 consumer jobs running.

If we make some important property change, then on restarting the cluster some or all of the existing consumer jobs are not able to start.

Ideally all the consumer jobs should start, since they will take the metadata info from the system topics below:

config.storage.topic
offset.storage.topic
status.storage.topic
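
For reference, these properties live in the Connect worker configuration (e.g. connect-distributed.properties). A minimal sketch, where the connect-* topic names are placeholders you would replace with your own:

    config.storage.topic=connect-configs
    offset.storage.topic=connect-offsets
    status.storage.topic=connect-status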

Recommended answer

First, a bit of background. Kafka stores all of its data in topics, but those topics (or rather the partitions that make up a topic) are append-only logs that would grow forever unless something is done. To prevent this, Kafka has the ability to clean up topics in two ways: retention and compaction. Topics configured to use retention will retain data for a configurable length of time: the broker is free to remove any log messages that are older than this. Topics configured to use compaction require every message have a key, and the broker will always retain the last known message for every distinct key. Compaction is extremely handy when each message (i.e., key/value pair) represents the last known state for the key; since consumers are reading the topic to get the last known state for each key, they will eventually get to that last state a bit faster if older states are removed.
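
To make the difference concrete, here is a sketch of the per-topic settings involved (the retention.ms value is only an illustration; 86400000 ms is 24 hours):

    A topic cleaned by retention:
    cleanup.policy=delete
    retention.ms=86400000

    A topic cleaned by compaction:
    cleanup.policy=compact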

Which cleanup policy a broker will use for a topic depends on several things. Every topic created implicitly or explicitly will use retention by default, though you can change this in a couple of ways (see the sketch after the list):

  • change the global log.cleanup.policy broker setting, affecting only topics created after that point; or
  • specify the cleanup.policy topic-specific setting when you create or modify a topic
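
For example, the broker-wide default goes in server.properties, while the per-topic setting can be passed when creating a topic. In this sketch the topic name, ZooKeeper address, and partition/replication counts are placeholders:

    # in server.properties (broker-wide default for topics created afterwards)
    log.cleanup.policy=compact

    # per-topic, at creation time
    bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic my-topic --partitions 1 --replication-factor 3 --config cleanup.policy=compact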

Now, Kafka Connect uses several internal topics to store connector configurations, offsets, and status information. These internal topics must be compacted topics so that (at least) the last configuration, offset, and status for each connector are always available. Since Kafka Connect never uses older configurations, offsets, and status, it's actually a good thing for the broker to remove them from the internal topics.

Before Kafka 0.11.0.0, the recommended process is to manually create these internal topics using the correct topic-specific settings. You could rely upon the broker to auto-create them, but that is problematic for several reasons, not the least of which is that the three internal topics should have different numbers of partitions.
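
As a sketch of that manual creation (topic names, partition counts, replication factor, and the ZooKeeper address are assumptions to adapt; the config topic in particular should have a single partition):

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic connect-configs --partitions 1 --replication-factor 3 --config cleanup.policy=compact
    bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic connect-offsets --partitions 25 --replication-factor 3 --config cleanup.policy=compact
    bin/kafka-topics.sh --create --zookeeper localhost:2181 --topic connect-status --partitions 5 --replication-factor 3 --config cleanup.policy=compact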

If these internal topics are not compacted, the configurations, offsets, and status info will be cleaned up and removed after the retention period has elapsed. By default this retention period is 24 hours! That means that if you restart Kafka Connect more than 24 hours after deploying / updating a connector configuration, that connector's configuration may have been purged and it will appear as if the connector configuration never existed.
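
To check whether one of these topics is actually compacted, you can describe it (topic name and ZooKeeper address are placeholders as above):

    bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic connect-configs

If cleanup.policy=compact does not show up under Configs in the output, the topic is being cleaned by retention.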

So, if you didn't create these internal topics correctly, simply use the topic admin tool to update the topic's settings as described in the documentation.
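
For example, with the kafka-configs tool (again, topic name and ZooKeeper address are placeholders):

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics --entity-name connect-configs --add-config cleanup.policy=compact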

BTW, not properly creating these internal topics is a very common problem, so much so that Kafka Connect 0.11.0.0 will be able to automatically create these internal topics using the correct settings without relying upon broker auto-creation of topics.

In 0.11.0 you will still have to rely upon manual creation or broker auto-creation for topics that source connectors write to. This is not ideal, and so there's a proposal to change Kafka Connect to automatically create the topics for the source connectors while giving the source connectors control over the settings. Hopefully that improvement makes it into 0.11.1.0 so that Kafka Connect is even easier to use.
