What is the best way to publish and consume different types of messages?


Question

Kafka version: 0.8

I want to publish and consume byte[] objects, Java bean objects, serializable objects, and more.

What is the best way to define a publisher and a consumer for this type of scenario? When I consume a message from the consumer iterator, I do not know what type of message it is. Can anybody point me to a guide on how to design for such scenarios?

Recommended answer

I enforce a single schema or object type per Kafka topic. That way, when you receive a message, you know exactly what you are getting.

At a minimum, you should decide whether a given topic is going to hold binary or string data and, depending on that, how the data will be further encoded.

For example, you could have a topic named Schema that contains JSON-encoded objects stored as strings.
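As a rough illustration of the one-schema-per-topic rule, a consumer can keep a small table that binds each topic to exactly one decoder. This is only a sketch in Python: the topic names `user-events` and `thumbnails` and the `decode` helper are hypothetical, and no Kafka client is involved; the raw bytes are assumed to have already been read from the topic.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical topic names. Each topic is bound to exactly one decoder,
# so the topic alone tells the consumer how to interpret a message.
DECODERS: Dict[str, Callable[[bytes], Any]] = {
    # JSON-encoded objects stored as UTF-8 strings
    "user-events": lambda raw: json.loads(raw.decode("utf-8")),
    # opaque binary data, passed through untouched
    "thumbnails": lambda raw: raw,
}


def decode(topic: str, raw: bytes) -> Any:
    """Decode a raw message with the single decoder registered for its topic."""
    try:
        decoder = DECODERS[topic]
    except KeyError:
        raise ValueError(f"no decoder registered for topic {topic!r}") from None
    return decoder(raw)
```

With this in place, `decode("user-events", b'{"id": 7}')` yields a dictionary, while an unregistered topic fails loudly instead of being guessed at.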

If you use JSON and a loosely typed language like JavaScript, it can be tempting to store objects with different schemas in the same topic. In JavaScript, you can just call JSON.parse(...), take a peek at the resulting object, and figure out what you want to do with it.

But you can't do that in a strictly typed language like Scala. Scala JSON parsers generally want you to parse the JSON into an already defined Scala type, usually a case class, so they do not fit this peek-then-decide model.

One solution is to keep the one schema / one topic rule, but cheat a little: wrap an object in an object. A typical example would be an Action object with a header that describes the action and a payload object whose schema depends on the action type listed in the header. Imagine this pseudo-schema:

{name: "Action", fields: [
  {name: "actionType", type: "string"},
  {name: "actionObject", type: "string"}
]}

This way, even in a strongly typed language, you can do something like the following (again, this is pseudo-code):

action = JSONParser[Action].parse(msg)
switch(action.actionType) {
  case "foo" => var foo = JSONParser[Foo].parse(action.actionObject)
  case "bar" => var bar = JSONParser[Bar].parse(action.actionObject)
}
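The pseudo-code above can be made concrete along the following lines. This is a minimal sketch in Python rather than Scala, using only the standard library; the `Foo` and `Bar` payload types, their fields, and the helper names are assumptions made up for illustration, and the payload is assumed to be carried as a JSON string inside the envelope, as in the pseudo-schema.

```python
import json
from dataclasses import dataclass


@dataclass
class Action:
    actionType: str
    actionObject: str  # payload, kept as an undecoded JSON string


# Hypothetical payload types, one per action type.
@dataclass
class Foo:
    id: int


@dataclass
class Bar:
    label: str


PAYLOAD_TYPES = {"foo": Foo, "bar": Bar}


def make_message(action_type: str, payload: dict) -> str:
    """Wrap a payload in the Action envelope; the payload becomes a JSON string."""
    return json.dumps({"actionType": action_type,
                       "actionObject": json.dumps(payload)})


def parse_action(msg: str) -> Action:
    """Decode only the envelope; the payload stays a string."""
    return Action(**json.loads(msg))


def parse_payload(action: Action):
    """Decode the payload into the type named by the header."""
    payload_cls = PAYLOAD_TYPES[action.actionType]
    return payload_cls(**json.loads(action.actionObject))
```

Every message on the topic shares the single `Action` schema, so the consumer always knows how to perform the first decoding step, and the header tells it which type to use for the second.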

One of the neat things about this approach is that a consumer waiting for only a specific action.actionType, and ignoring all the others, can cheaply decode just the header and put off decoding action.actionObject until it is actually needed.
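A consumer of that kind might look roughly like this. It is a sketch under the assumption that each message is a JSON envelope whose `actionObject` field holds the payload as a JSON string; the function name `payloads_of_type` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def payloads_of_type(messages: Iterable[str], wanted: str) -> Iterator[dict]:
    """Yield decoded payloads only for messages whose actionType matches."""
    for msg in messages:
        envelope = json.loads(msg)  # decodes the header; payload is still a string
        if envelope["actionType"] != wanted:
            continue  # skipped without ever parsing the payload
        yield json.loads(envelope["actionObject"])
```

With JSON the outer parse still has to scan over the payload string, so the saving is that the nested payload is never parsed or mapped to a type. One side effect is that a malformed payload in a message the consumer does not care about cannot make it fail.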

So far this has all been about string-encoded data. If you want to work with binary data, you can of course wrap it in JSON as well, or in any of a number of string-based encodings like XML. But there are also a number of binary encoding systems out there, like Thrift and Avro. In fact, the pseudo-schema above is based on Avro. You can even do cool things in Avro like schema evolution, which among other things provides a very slick way to handle the Action use case above: instead of wrapping an object in an object, you can define a schema that is a subset of other schemas and decode just the fields you want, in this case just the action.actionType field. Here is a really excellent description of schema evolution.
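For the simpler option of wrapping binary data in JSON, one common approach is to Base64-encode the bytes so they fit in a JSON string. The answer does not name an encoding, so Base64 is an assumption here, as are the helper names `wrap_binary` and `unwrap_binary`. A minimal sketch:

```python
import base64
import json
from typing import Tuple


def wrap_binary(action_type: str, payload: bytes) -> str:
    """Put raw bytes into the JSON envelope as a Base64 string (assumed encoding)."""
    return json.dumps({
        "actionType": action_type,
        "actionObject": base64.b64encode(payload).decode("ascii"),
    })


def unwrap_binary(msg: str) -> Tuple[str, bytes]:
    """Return the action type and the original bytes."""
    envelope = json.loads(msg)
    return envelope["actionType"], base64.b64decode(envelope["actionObject"])
```

Base64 makes the payload about a third larger, which is one reason a binary encoding system can be the better fit for topics that mostly carry binary data.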

In a nutshell, what I recommend is:

  1. Settle on a schema-based encoding system (be it JSON, XML, Avro, or whatever)
  2. Enforce a one-schema-per-topic rule
