为什么我们需要带有NoSQL数据库的Apache Kafka? [英] Why we require Apache Kafka with NoSQL databases?

查看:92
本文介绍了为什么我们需要带有NoSQL数据库的Apache Kafka?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Apache Kafka是一种实时消息传递服务.它以分布式且容错的方式安全地存储数据流.在向生产者发送消息时,我们可以过滤流数据.我并不理解为什么我们需要像MongoDB这样的NoSQL数据库在Apache Kafka中存储相同的数据.真正的问题是,为什么我们将相同的数据存储在NoSQL数据库和Apache Kafka中?

Apache Kafka is an real-time messaging service. It stores streams of data safely in distributed and fault-tolerant. We can filter streaming data when comming producer. I don't understant that why we need NoSQL databases like as MongoDB to store same data in Apache Kafka. The true question is that why we store same data in a NoSQL database and Apache Kafka?

我认为,如果需要NoSQL数据库,我们可以首先从MongoDB中的客户端收集数据流,而无需使用Apache Kafka.但是,大多数大数据架构偏好在数据源和NoSQL数据库之间使用Apache Kafka.(

I think if we need a NoSQL database, we can collect streams of data from clients in MongoDB at first without the use of Apache Kafka. But, most of big data architecture preference using Apache Kafka between data source and NoSQL database.(see)

对于真实系统,它有什么优势?

What is the advantages of that for real systems?

推荐答案

此体系结构具有多个优点:

This architecture has several advantages:

  1. Kafka作为数据集成总线

它有助于轻松地在多个生产者和许多消费者之间分发数据.在此,Apache Kafka用作数据"数据库.集成消息总线.

It helps distribute data between several producers and many consumers easily. Here Apache Kafka serves as an "data" integration message bus.

Kafka作为数据缓冲区

将Kafka放在您的终端"前面像MongoDB或MySQL这样的数据存储就像一个自然数据缓冲区.因此,您能够独立部署/维护/重新部署您的消费者服务.在您的服务停止维护时,Kafka仍在存储所有传入数据,这非常有用.

Putting Kafka in front of your "end" data storages like MongoDB or MySQL acts like a natural data buffer. So you are able to deploy/maintain/redeploy your consumer services independently. At the time your service is down for maintanance Kafka is still storing all incoming data, that is quite useful.

Kafka作为短期数据存储

您不必将所有内容都存储在Kafka中:通常,您会保留使用Kafka主题.这意味着Kafka会自动删除所有早于某个值的数据.因此,例如,您可能拥有保留1周的Kafka主题(因此,您仅存储1周的数据),但同时您的数据仍位于长期存储服务中,例如经典SQL-DB或Cassandra等.

You don't have to store everything in Kafka: very often you use Kafka topics with retention. It means all data older than some value will be deleted by Kafka automatically. So, for example you may have Kafka topic with 1 week retention (so you store 1 week of data only) but at the same time your data lives in long time storage services like classic SQL-DBs or Cassandra etc.

Kafka作为长期数据存储

另一方面,您可以将Apache Kafka用作长期存储系统.使用紧凑主题可以使您仅存储每个键的最后一个值.因此,您的主题将成为应用程序的最后状态存储.

On the other hand you can use Apache Kafka as a long term storage system. Using compacted topics enables you to store only the last value for each key. So your topic becomes a last state storage of your app.

这篇关于为什么我们需要带有NoSQL数据库的Apache Kafka?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆