What is the difference between Kafka ProducerRecord and KeyedMessage?

Question

I'm measuring Kafka producer performance. So far I've come across two clients with slightly different configuration and usage:

Common:

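import java.util.Properties

// Builds the shared producer configuration (old Scala producer property names).
def buildKafkaConfig(hosts: String, port: Int): Properties = {
  val props = new Properties()
  // Broker list in host:port form
  props.put("metadata.broker.list", s"$hosts:$port")
  props.put("serializer.class", "kafka.serializer.StringEncoder")
  // Send asynchronously, without waiting for broker acknowledgement
  props.put("producer.type", "async")
  props.put("request.required.acks", "0")
  // Buffer for up to 5 s or 2000 messages, sending batches of 300
  props.put("queue.buffering.max.ms", "5000")
  props.put("queue.buffering.max.messages", "2000")
  props.put("batch.num.messages", "300")
  props
}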

First client:

"org.apache.kafka" % "kafka_2.11" % "0.8.2.2" 

Usage:

import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

val kafkaConfig = KafkaUtils.buildKafkaConfig("kafkahost", 9092)
val producer = new Producer[String, String](new ProducerConfig(kafkaConfig))

// ... somewhere in code
producer.send(new KeyedMessage[String, String]("my-topic", data))

Second client:

"org.apache.kafka" % "kafka-clients" % "0.8.2.2"

Usage:

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val kafkaConfig = KafkaUtils.buildKafkaConfig("kafkahost", 9092)
val producer = new KafkaProducer[String, String](kafkaConfig)
// ... somewhere in code
producer.send(new ProducerRecord[String, String]("my-topic", data))

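My questions are:

  • What is the difference between the 2 clients?
  • Which properties should I configure to achieve optimal performance for heavy writes in a high-scale application?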

Recommended answer

What is the difference between the 2 clients?

They are simply the old and new APIs. Starting with 0.8.2.x, Kafka exposes a new set of APIs: the older one is Producer, which works with KeyedMessage[K,V], while the new one is KafkaProducer, which works with ProducerRecord[K,V]:

"As of the 0.8.2 release we encourage all new development to use the new Java producer. This client is production tested and generally both faster and more fully featured than the previous Scala client."

You should preferably use the new, supported version.
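One practical consequence of the API change is that the new producer reads different property names, so the configuration from the question does not carry over unchanged. Below is a minimal sketch of how those settings might map onto the new client. The object name NewProducerConfig is hypothetical, a single broker in host:port form is assumed, and the values are carried over from the question or set to the client default for illustration; they are not tuning recommendations.

```scala
import java.util.Properties

// Hypothetical helper: the question's settings expressed with new-producer keys.
object NewProducerConfig {
  def build(host: String, port: Int): Properties = {
    val props = new Properties()
    // Replaces metadata.broker.list
    props.put("bootstrap.servers", s"$host:$port")
    // Replaces serializer.class; key and value serializers are set separately
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Replaces request.required.acks
    props.put("acks", "0")
    // Roughly analogous to queue.buffering.max.ms: how long to wait for a batch to fill
    props.put("linger.ms", "5000")
    // Batches are sized in bytes here, not in messages as with batch.num.messages
    props.put("batch.size", "16384")
    props
  }
}
```

The new producer always sends asynchronously, so producer.type has no counterpart. The resulting Properties object can be passed to new KafkaProducer[String, String](...) in place of kafkaConfig.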

Which properties should I configure to achieve optimal performance for heavy writes in a high-scale application?

This is a very broad question, and the answer depends a lot on the architecture of your software. It varies with scale, the number of producers, the number of consumers, and so on, and there are many things to take into account. I would suggest going through the documentation and reading the sections on Kafka's architecture and design to get a better picture of how it works internally.

Generally speaking, in my experience you'll need to balance the replication factor of your data against retention times and the number of partitions each topic is split into. If you have more specific questions down the road, you should definitely post a question.
