What is the best way to publish and consume different types of messages?

Question

Kafka 0.8

I want to publish/consume byte[] objects, Java bean objects, serializable objects, and much more.

What is the best way to define a publisher and consumer for this type of scenario? When I consume a message from the consumer iterator, I do not know what type of message it is. Can anybody point me to a guide on how to design for such scenarios?

Recommended answer

I enforce a single schema or object type per Kafka Topic. That way when you receive messages you know exactly what you are getting.

At a minimum, you should decide whether a given topic is going to hold binary or string data, and depending on that, how it will be further encoded.

For example, you could have a topic named Schema that contains JSON-encoded objects stored as strings.
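One way to picture the one-format-per-topic rule is a lookup table from topic name to decoder. The following is a minimal sketch, not Kafka client code: the topic names `raw-blobs` and `user-events` and the `decode` helper are hypothetical, and the message is assumed to arrive as raw bytes.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical topic names; each topic is bound to exactly one decoder.
DECODERS: Dict[str, Callable[[bytes], Any]] = {
    # Binary topic: hand the bytes through untouched.
    "raw-blobs": lambda raw: raw,
    # String topic: UTF-8 text holding one JSON-encoded object.
    "user-events": lambda raw: json.loads(raw.decode("utf-8")),
}


def decode(topic: str, raw: bytes) -> Any:
    """Decode a message using the single format registered for its topic."""
    try:
        decoder = DECODERS[topic]
    except KeyError:
        raise ValueError(f"no format registered for topic {topic!r}")
    return decoder(raw)
```

Because the topic alone determines the format, the consumer never has to guess what a message contains.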

If you use JSON and a loosely-typed language like JavaScript, it could be tempting to store different objects with different schemas in the same topic. With JavaScript, you can just call JSON.parse(...), take a peek at the resulting object, and figure out what you want to do with it.

But you can't do that in a strictly-typed language like Scala. The Scala JSON parsers generally want you to parse the JSON into an already defined Scala type, usually a case class. They do not work with this model.
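The same constraint can be illustrated outside Scala. This is a rough Python analogue, assuming a hypothetical `Foo` type with made-up fields: once parsing targets a predefined type, a message with a different shape fails instead of being inspected after the fact.

```python
import json
from dataclasses import dataclass


@dataclass
class Foo:
    # Hypothetical fields, chosen only for illustration.
    id: int
    label: str


def parse_foo(text: str) -> Foo:
    """Parse JSON into Foo; raises TypeError if the field names do not match."""
    return Foo(**json.loads(text))
```

Note that this sketch only checks field names, not value types; a real typed parser would usually check both.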

One solution is to keep the one schema / one topic rule, but cheat a little: wrap an object in an object. A typical example would be an Action object where you have a header that describes the action, and a payload object with a schema dependent on the action type listed in the header. Imagine this pseudo-schema:

{name: "Action", fields: [
  {name: "actionType", type: "string"},
  {name: "actionObject", type: "string"}
]}
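On the publishing side, building such an envelope could look like the sketch below. It assumes JSON for both layers, and the `wrap` helper and the sample payload are hypothetical. The point to notice is that `actionObject` is a string, so the payload is serialized once on its own and then embedded.

```python
import json


def wrap(action_type: str, payload: dict) -> str:
    """Build an Action envelope; the payload is itself serialized to a string."""
    return json.dumps({
        "actionType": action_type,
        "actionObject": json.dumps(payload),  # nested JSON, kept as a string
    })


# Example envelope for a hypothetical "foo" action.
msg = wrap("foo", {"id": 1, "label": "first"})
```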

This way, even in a strongly-typed language, you can do something like the following (again, this is pseudo-code):

val action = JSONParser[Action].parse(msg)
action.actionType match {
  case "foo" => val foo = JSONParser[Foo].parse(action.actionObject)
  case "bar" => val bar = JSONParser[Bar].parse(action.actionObject)
  case _     => // ignore action types this consumer does not handle
}

One of the neat things about this approach is that if you have a consumer that's waiting for only a specific action.actionType, and is just going to ignore all the others, it's pretty lightweight for it to decode just the header and put off decoding action.actionObject until when and if it is needed.
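A runnable version of the pseudo-code, including the deferred payload decoding, might look like this. It is a sketch under assumptions: the fields of `Foo` and `Bar`, the `PAYLOAD_TYPES` table, and the `handle` function are all hypothetical, and both layers are assumed to be JSON.

```python
import json
from dataclasses import dataclass
from typing import Optional, Set, Union


@dataclass
class Foo:
    id: int      # hypothetical field


@dataclass
class Bar:
    name: str    # hypothetical field


# One payload type per actionType value.
PAYLOAD_TYPES = {"foo": Foo, "bar": Bar}


def handle(msg: str, wanted: Set[str]) -> Optional[Union[Foo, Bar]]:
    """Decode the header first; decode the payload only for wanted action types."""
    action = json.loads(msg)  # actionObject is still an undecoded string here
    action_type = action["actionType"]
    if action_type not in wanted:
        return None  # skipped without ever parsing the payload
    payload_cls = PAYLOAD_TYPES[action_type]
    return payload_cls(**json.loads(action["actionObject"]))


msg = json.dumps({"actionType": "foo", "actionObject": json.dumps({"id": 42})})
```

A consumer that only cares about `"foo"` would call `handle(msg, {"foo"})`; messages of any other action type cost one header parse and nothing more.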

So far this has all been about string-encoded data. If you want to work with binary data, of course you can wrap it in JSON as well, or any of a number of string-based encodings like XML. But there are a number of binary-encoding systems out there, too, like Thrift and Avro. In fact, the pseudo-schema above is based on Avro. You can even do cool things in Avro like schema evolution, which amongst other things provides a very slick way to handle the above Action use case -- instead of wrapping an object in an object, you can define a schema that is a subset of other schemas and decode just the fields you want, in this case just the action.actionType field. Here is a really excellent description of schema evolution.
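For the option of wrapping binary data in JSON, the bytes have to be turned into text first. A common choice is base64, shown in the sketch below; the `wrap_bytes` and `unwrap_bytes` helpers are hypothetical, and the envelope reuses the `actionType` / `actionObject` field names from the pseudo-schema.

```python
import base64
import json


def wrap_bytes(action_type: str, data: bytes) -> str:
    """Embed raw bytes in a JSON envelope by base64-encoding them."""
    return json.dumps({
        "actionType": action_type,
        "actionObject": base64.b64encode(data).decode("ascii"),
    })


def unwrap_bytes(msg: str) -> bytes:
    """Recover the original bytes from the envelope."""
    return base64.b64decode(json.loads(msg)["actionObject"])
```

Base64 inflates the payload by roughly a third, which is one reason a binary encoding such as Avro or Thrift can be the better fit for byte-heavy topics.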

In a nutshell, what I recommend is:

  1. Pick a schema-based encoding system (be it JSON, XML, Avro, whatever)
  2. Enforce one schema per topic
