What do you use Apache Kafka for?

Question

I would like to ask if my understanding of Kafka is correct.

For really, really big data streams, a conventional database is not adequate, so people use things such as Hadoop or Storm. Kafka sits on top of said databases and provides ... directions for where the real-time data should go?

Solution

I don't think so.

Kafka is a messaging system, and it does not sit on top of a database.

You can compare Kafka with messaging systems such as ActiveMQ and RabbitMQ.

From the Apache documentation page:

Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.

Key takeaways:

  1. Kafka maintains feeds of messages in categories called topics.
  2. We'll call processes that publish messages to a Kafka topic producers.
  3. We'll call processes that subscribe to topics and process the feed of published messages consumers.
  4. Kafka is run as a cluster comprised of one or more servers each of which is called a broker.
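
The terms above can be illustrated with a toy, in-memory model. This is only a sketch: `ToyBroker`, `ToyConsumer`, and the `page-views` topic are hypothetical names invented here, not part of any Kafka client library, and a real cluster adds partitions, replication, and the network protocol on top.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a single broker; this is NOT the Kafka API."""

    def __init__(self):
        # Each topic is an append-only list of messages.
        self.topics = defaultdict(list)

    def publish(self, topic, message):
        """Append a message to a topic and return its offset."""
        log = self.topics[topic]
        log.append(message)
        return len(log) - 1

    def fetch(self, topic, offset):
        """Return every message in the topic from `offset` onward."""
        return self.topics[topic][offset:]


class ToyConsumer:
    """Tracks its own read position per topic, so consumers are independent."""

    def __init__(self, broker):
        self.broker = broker
        self.offsets = defaultdict(int)

    def poll(self, topic):
        messages = self.broker.fetch(topic, self.offsets[topic])
        self.offsets[topic] += len(messages)
        return messages


broker = ToyBroker()
broker.publish("page-views", "/home")      # producer side
broker.publish("page-views", "/pricing")

analytics = ToyConsumer(broker)
audit = ToyConsumer(broker)

first_batch = analytics.poll("page-views")   # both messages published so far
broker.publish("page-views", "/signup")
second_batch = analytics.poll("page-views")  # only the newly published message
audit_batch = audit.poll("page-views")       # all three; audit has its own offset
```

The point of the sketch is that messages stay in the topic after being read, and each consumer only advances its own position, which is what lets several independent subscribers read the same feed.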

Communication between the clients and the servers is done with a simple, high-performance, language agnostic TCP protocol.

Use Cases:

  1. Messaging: Kafka works well as a replacement for a more traditional message broker. In this domain Kafka is comparable to traditional messaging systems such as ActiveMQ or RabbitMQ
  2. Website Activity Tracking: The original use case for Kafka was to be able to rebuild a user activity tracking pipeline as a set of real-time publish-subscribe feeds
  3. Metrics: Kafka is often used for operational monitoring data, which involves aggregating statistics from distributed applications to produce centralized feeds of operational data
  4. Log Aggregation
  5. Stream Processing
  6. Event Sourcing: a style of application design where state changes are logged as a time-ordered sequence of records
  7. Commit Log: Kafka can serve as a kind of external commit-log for a distributed system. The log helps replicate data between nodes and acts as a re-syncing mechanism for failed nodes to restore their data
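
The event sourcing and commit log use cases share one idea: the log is the source of truth, and state is rebuilt by replaying it. A minimal sketch follows, assuming a plain Python list stands in for the log and using a made-up account-balance event format; none of the names below come from Kafka itself.

```python
def apply(state, event):
    """Apply one event to an account-balance state and return the new state."""
    kind, account, amount = event
    balance = state.get(account, 0)
    if kind == "deposit":
        balance += amount
    elif kind == "withdraw":
        balance -= amount
    else:
        raise ValueError(f"unknown event type: {kind}")
    new_state = dict(state)
    new_state[account] = balance
    return new_state


def replay(log, state=None, from_offset=0):
    """Rebuild state by applying log entries in order, starting at from_offset."""
    state = dict(state or {})
    for event in log[from_offset:]:
        state = apply(state, event)
    return state


# The log is an append-only, time-ordered list of state changes (hypothetical data).
log = [
    ("deposit", "alice", 100),
    ("withdraw", "alice", 30),
    ("deposit", "bob", 50),
]

# Event sourcing: the current state is derived entirely from the log.
full_state = replay(log)

# Commit-log style re-sync: a node that had applied the first 2 entries
# before failing catches up by replaying only the entries it missed.
stale_state = replay(log[:2])
resynced_state = replay(log, state=stale_state, from_offset=2)
```

Because a recovering node only needs to remember the last offset it applied, it can catch up without any other node having to track what it missed.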
