Kafka producer - How to change a topic without down-time and preserving message ordering?


Problem description

This question is about architecture and migrating Kafka topics.

Original problem: schema evolution without backward compatibility.

https://docs.confluent.io/current/schema-registry/avro.html

I am asking the community for advice, or for articles that could inspire a solution to my problem. Maybe there is an architecture or streaming pattern for this. A language-specific solution is not necessary; a direction I can go in is enough. My question is big, but it may be interesting for those who later want to:

  • a) change the message format and produce messages into a new topic.
  • b) stop producing messages into one topic and start producing messages into another topic "instantly"; in other words, once a message has been produced into v2, no new messages are appended to v1.

Problem

I am changing the message format, and the new format is not compatible with the previous version. In order not to break existing consumers, I decided to produce messages to a new topic.

Up-caster idea

I have read about an up-caster.

https://docs.axoniq.io/reference-guide/operations-guide/production-considerations/versioning-events

Formal task

Let v1 and v2 be the topics. Currently, I produce messages in the format format_v1 into the topic v1. I want to produce messages in the format format_v2 into the topic v2. The switch should happen at some moment of time which I can choose.

In other words, at some moment in time, all instances of the producer stop sending messages into v1 and start sending messages into v2; thus the last message m1 in v1 is produced before the first message m2 in v2.

Details

My idea is to keep producing messages to the topic v1 and have a Kafka Streams up-caster that is subscribed to v1 and pushes transformed messages to v2. Let us assume that the transformer (in my case, of course) is able to transform messages of format_v1 into format_v2 without errors.
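To make the up-caster idea concrete, here is a minimal in-memory sketch. It does not use a Kafka client: topics are plain Python lists, and the two record shapes (a single `name` field in format_v1, split into `first_name` and `last_name` in format_v2) are purely hypothetical. The point is only that the up-caster reads v1 in log order and emits converted records in the same order.

```python
from typing import Dict, Iterable, Iterator, List


def upcast(record_v1: Dict) -> Dict:
    """Convert one hypothetical format_v1 record into format_v2."""
    first, _, last = record_v1["name"].partition(" ")
    return {"id": record_v1["id"], "first_name": first, "last_name": last}


def run_upcaster(topic_v1: Iterable[Dict]) -> Iterator[Dict]:
    """Read v1 in log order and yield converted records in the same order."""
    for record in topic_v1:
        yield upcast(record)


# In-memory stand-ins for a single partition of each topic.
topic_v1: List[Dict] = [
    {"id": 1, "name": "Ada Lovelace"},
    {"id": 2, "name": "Alan Turing"},
]
topic_v2: List[Dict] = list(run_upcaster(topic_v1))
```

Because the conversion is a pure per-record function, the relative order within one partition is preserved as long as a single up-caster task processes that partition.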

As described in the link above about Avro schema evolution, by the time I have added the up-caster and am still producing messages into v1, all my consumers of v1 have been switched to v2.

Now, the tricky part. We have two requirements:

1. No production down-time.

2. Preserve message ordering.

It means:

1) We are not allowed to lose messages; a client may use our system at any time, so our system should produce a message at any time.

2) We are running multiple instances of the producer. At some moment of time there can (potentially) be producers that may produce messages of format format_v1 into the topic v1, and some instances that produce messages of format format_v2 into the topic v2.

As we know, Kafka does not guarantee message ordering across different partitions and topics.

I can solve the problem with partitions by writing messages into v2 with the same partition selector as for v1. Or, for now, I can imagine that we use just one partition for v1 and one partition for v2.
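A small sketch of the "same partition selector" idea, assuming both topics are created with the same number of partitions. The hash used here (`zlib.crc32`) is only a stable stand-in chosen for illustration; it is not the hash Kafka's default partitioner uses, and `NUM_PARTITIONS`, `select_partition` and `route` are hypothetical names.

```python
import zlib
from typing import Tuple

NUM_PARTITIONS = 6  # assumption: v1 and v2 have the same partition count


def select_partition(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable stand-in partitioner: the same key always maps to the same partition."""
    return zlib.crc32(key) % num_partitions


def route(topic: str, key: bytes) -> Tuple[str, int]:
    """Return the (topic, partition) a record with this key would be written to."""
    return topic, select_partition(key)


keys = [b"customer-1", b"customer-2", b"customer-42"]
placements = [(route("v1", k), route("v2", k)) for k in keys]
```

With this arrangement a key that lived in partition n of v1 also lives in partition n of v2, so per-key ordering reduces to the single-partition case discussed in the rest of the question.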


My simplifications and attempts

1) I assume that by the time I want to change the producer to produce messages into a new topic, I have an up-caster (a Kafka Streams component) that is capable of transforming messages from v1 into v2 without errors. This Kafka Streams component is scalable.

2) All my consumers have already been switched to the v2 topic. They constantly receive messages from v2. At this moment in time, my producer instances are producing messages into the topic v1 and the up-caster does its job well.

3) To simplify the problem, let's imagine that for now format_v1 and format_v2 do not matter, and they are the same.

4) Let's imagine we have one partition for v1 and one partition for v2.

Now my problem: how do I switch all producers instantly, so that from a given point in time all the instances produce messages into the topic v2?

My colleague, a Kafka expert, told me that it can be done with down-time:

If you rely on the order of the messages in the partitions, you cannot switch to the new version without down-time. To keep the down-time minimal, we can do the following.

The upcaster component must write the data to the same partitions and should try to produce the same offsets. However, this is not always possible, because offsets may have gaps, so a mapping between old offsets and new offsets must be kept: not for all the records, only for the last batch of each partition. If the upcaster crashes, just start it again; the producer is still not involved in v2.
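One way to picture the offset mapping is the sketch below. It is an in-memory model under stated assumptions, not a Kafka client: `OffsetMap` and `copy_batch` are hypothetical helpers, a v2 partition is a Python list whose index plays the role of the offset, and only the mapping for the most recent batch of each partition is retained.

```python
from typing import Dict, List, Optional, Tuple


class OffsetMap:
    """Keeps the v1 -> v2 offset mapping for the last batch of each partition."""

    def __init__(self) -> None:
        self._last_batch: Dict[int, Dict[int, int]] = {}

    def record_batch(self, partition: int, pairs: List[Tuple[int, int]]) -> None:
        # Overwrite the previous batch: older mappings are deliberately dropped.
        self._last_batch[partition] = dict(pairs)

    def to_v2(self, partition: int, v1_offset: int) -> Optional[int]:
        return self._last_batch.get(partition, {}).get(v1_offset)


def copy_batch(partition: int, batch: List[Tuple[int, str]],
               v2_log: List[str], offsets: OffsetMap) -> None:
    """Append a batch of (v1_offset, value) records to v2 and remember the mapping."""
    pairs = []
    for v1_offset, value in batch:
        v2_offset = len(v2_log)  # next offset in the in-memory v2 partition
        v2_log.append(value)
        pairs.append((v1_offset, v2_offset))
    offsets.record_batch(partition, pairs)


v2_log: List[str] = []
offsets = OffsetMap()
copy_batch(0, [(0, "a"), (1, "b")], v2_log, offsets)
copy_batch(0, [(4, "c"), (5, "d")], v2_log, offsets)  # v1 offsets 2 and 3 are a gap
```

Here v1 offset 4 ends up at v2 offset 2, which is exactly the kind of drift the mapping has to absorb when consumer offsets are translated.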

Start the v2 consumer. If it starts with the same consumer group as v1, nothing needs to be done; if it has a new consumer group, update its offsets in Kafka according to the new offsets.

Now the producers write to v1, the upcaster converts the data, and the consumers consume from v2.

Here comes the down-time. When the lag of the upcaster is close to 0, shut down the v1 producers, wait until the upcaster has converted the rest of the records, shut down the upcaster, and start the v2 producers, which write to the v2 topic.

I thought of manually changing a flag in the database (via some REST endpoint or similar); producers always check the flag before they produce a message. When the flag says v2 (or true), a producer starts writing messages into v2. However, what if one producer reads the flag while it is still false and starts producing a message into v1, then the flag changes, and another producer sends a message into v2 before the first producer has finished producing into v1?
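The race described above can be replayed deterministically. The sketch below is a single-threaded model with one hand-picked interleaving, not real concurrent producers: `log` is a hypothetical global record of append order across both topics, and `ordering_holds` checks the requirement that nothing is appended to v1 after the first append to v2.

```python
from typing import Dict, List, Tuple

Log = List[Tuple[str, str]]  # (topic, message) in global append order


def ordering_holds(log: Log) -> bool:
    """True if no message is appended to v1 after the first append to v2."""
    seen_v2 = False
    for topic, _ in log:
        if topic == "v2":
            seen_v2 = True
        elif seen_v2:
            return False
    return True


flag: Dict[str, str] = {"topic": "v1"}
log: Log = []

target_a = flag["topic"]             # producer A checks the flag: still v1
flag["topic"] = "v2"                 # the operator flips the flag
target_b = flag["topic"]             # producer B checks the flag: v2
log.append((target_b, "from B"))     # B's write lands first
log.append((target_a, "from A"))     # A finishes its stale write to v1

violated = not ordering_holds(log)
```

The check-then-send sequence is not atomic across producer instances, so a flag alone cannot enforce the ordering requirement while several producers are active.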

Solution

Is it acceptable for you to have only one producer being active?

In that case you can use your idea with a flag:

  1. Shut down all producers p2,p3,...,pn except p1
  2. p1 writes to v1 alone
  3. Switch the flag to v2, so p1 ends its last write to v1 and starts writing to v2
  4. Now nobody writes to v1
  5. Start your other producers p2,p3,...,pn
  6. Because the flag is now active, every producer writes to v2, and still nobody writes to v1
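The steps above can be sketched as a small simulation. It is an in-memory model under assumptions: `Producer` is a hypothetical class, `send` reads the flag and appends in one step (modelling p1 completing its last v1 write before it reads the flag again), and `log` records the global append order across both topics. With a real client, p1 would presumably also have to make sure its in-flight sends to v1 are acknowledged before its first send to v2.

```python
from typing import Dict, List, Tuple

Log = List[Tuple[str, str]]  # (topic, message) in global append order


class Producer:
    def __init__(self, name: str, flag: Dict[str, str], log: Log) -> None:
        self.name, self.flag, self.log = name, flag, log
        self.running = True

    def send(self, message: str) -> None:
        if not self.running:
            raise RuntimeError(f"{self.name} is shut down")
        # Read the flag and append as one step (single-threaded model).
        self.log.append((self.flag["topic"], f"{self.name}:{message}"))


def ordering_holds(log: Log) -> bool:
    """True if no message is appended to v1 after the first append to v2."""
    seen_v2 = False
    for topic, _ in log:
        if topic == "v2":
            seen_v2 = True
        elif seen_v2:
            return False
    return True


flag: Dict[str, str] = {"topic": "v1"}
log: Log = []
p1, p2, p3 = (Producer(n, flag, log) for n in ("p1", "p2", "p3"))

for p in (p1, p2, p3):
    p.send("before")                 # normal operation, everyone writes to v1
p2.running = p3.running = False      # step 1: shut down all producers except p1
p1.send("alone on v1")               # step 2: p1 writes to v1 alone
flag["topic"] = "v2"                 # step 3: switch the flag
p1.send("first on v2")               # steps 3-4: p1 now writes to v2, nobody to v1
p2.running = p3.running = True       # step 5: start the other producers
for p in (p1, p2, p3):
    p.send("after")                  # step 6: everyone writes to v2
```

The system as a whole keeps producing throughout, but only p1 is active during the switch, which is the trade-off the answer asks about.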
