What is the difference between Kafka ProducerRecord and KeyedMessage?

Question

I'm measuring Kafka producer performance. So far I've come across two clients with slightly different configuration and usage:

Common configuration:

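import java.util.Properties

def buildKafkaConfig(hosts: String, port: Int): Properties = {
  val props = new Properties()
  // Broker list in host:port form, built from the function's parameters
  props.put("metadata.broker.list", s"$hosts:$port")
  props.put("serializer.class", "kafka.serializer.StringEncoder")
  // Send asynchronously and do not wait for broker acknowledgements
  props.put("producer.type", "async")
  props.put("request.required.acks", "0")
  // Batching limits for the async queue
  props.put("queue.buffering.max.ms", "5000")
  props.put("queue.buffering.max.messages", "2000")
  props.put("batch.num.messages", "300")
  props
}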

First client:

"org.apache.kafka" % "kafka_2.11" % "0.8.2.2" 

Usage:

val kafkaConfig = KafkaUtils.buildKafkaConfig("kafkahost", 9092)
val producer = new Producer[String, String](new ProducerConfig(kafkaConfig))

// ... somewhere in code 
producer.send(new KeyedMessage[String, String]("my-topic", data))

Second client:

"org.apache.kafka" % "kafka-clients" % "0.8.2.2"

Usage:

val kafkaConfig = KafkaUtils.buildKafkaConfig("kafkahost", 9092)
val producer = new KafkaProducer[String, String](kafkaConfig)
// ... somewhere in code 
producer.send(new ProducerRecord[String, String]("my-topic", data))

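My questions are:

  • What is the difference between the 2 clients?
  • Which properties should I configure to achieve optimal performance under heavy writes, for a high-scale application?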

Recommended answer

What is the difference between the 2 clients?

They are simply the old and the new API. Starting with 0.8.2.x, Kafka exposes a new set of APIs: the older one is Producer, which works with KeyedMessage[K,V], while the new one is KafkaProducer, which works with ProducerRecord[K,V]. The Kafka documentation says:

As of the 0.8.2 release we encourage all new development to use the new Java producer. This client is production tested and generally both faster and more fully featured than the previous Scala client.

You should preferably be using the new supported version.
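One practical consequence of switching clients: the new Java producer expects bootstrap.servers, key.serializer and value.serializer rather than the old metadata.broker.list and serializer.class keys used in the question's shared configuration. Below is a minimal sketch of an equivalent configuration for the new client, using only java.util.Properties. The object name NewProducerConfig and the method name build are hypothetical, chosen for illustration; the choice of acks = "0" simply mirrors request.required.acks = "0" from the question.

```scala
import java.util.Properties

// Hypothetical helper (not part of Kafka): builds a configuration
// using the property names the new Java producer reads.
object NewProducerConfig {
  def build(host: String, port: Int): Properties = {
    val props = new Properties()
    // Replaces metadata.broker.list from the old Scala producer
    props.setProperty("bootstrap.servers", s"$host:$port")
    // Replaces serializer.class; key and value serializers are set separately
    props.setProperty("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.setProperty("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    // Mirrors request.required.acks = "0" from the question (assumption:
    // fire-and-forget delivery is really what is wanted)
    props.setProperty("acks", "0")
    props
  }
}
```

The resulting Properties can be passed to new KafkaProducer[String, String](...) in place of the shared configuration from the question.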

Which properties should I configure to achieve optimal performance under heavy writes, for a high-scale application?

This is a very broad question, and the answer depends a lot on the architecture of your software. It varies with scale, the number of producers, the number of consumers, and so on, so there are many things to take into account. I would suggest going through the documentation and reading the sections on Kafka's architecture and design to get a better picture of how it works internally.

Generally speaking, in my experience you'll need to balance the replication factor of your data against retention times and the number of partitions each topic is split into. If you have more specific questions down the road, you should definitely post a question.
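On the producer side, the new client's settings that most directly affect write throughput are batch.size, linger.ms, compression.type and buffer.memory. The sketch below layers them onto an existing configuration. The object name ThroughputTuning, the method name withThroughputSettings, and every value shown are assumptions for illustration only, not recommendations from the answer; suitable numbers depend on message size and latency tolerance and should be measured.

```scala
import java.util.Properties

// Hypothetical helper (not part of Kafka): copies a base configuration
// and adds throughput-related settings of the new Java producer.
object ThroughputTuning {
  def withThroughputSettings(base: Properties): Properties = {
    val props = new Properties()
    props.putAll(base)
    // Maximum bytes batched per partition before a send (illustrative value)
    props.setProperty("batch.size", "65536")
    // Wait up to this many ms for a batch to fill (illustrative value)
    props.setProperty("linger.ms", "50")
    // Compress batches to trade CPU for network and disk usage
    props.setProperty("compression.type", "snappy")
    // Total memory for records not yet sent, in bytes (illustrative value)
    props.setProperty("buffer.memory", "67108864")
    props
  }
}
```

Larger batches and a non-zero linger time generally raise throughput at the cost of per-message latency, which is the kind of trade-off worth benchmarking with your own workload.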
