MongoDB vs. Redis vs. Cassandra for a fast-write, temporary row storage solution

Question

I'm building a system that tracks and verifies ad impressions and clicks. This means that there are a lot of insert commands (about 90/second average, peaking at 250) and some read operations, but the focus is on performance and making it blazing-fast.

The system is currently on MongoDB, but I've been introduced to Cassandra and Redis since then. Would it be a good idea to go to one of these two solutions, rather than stay on MongoDB? Why or why not?

Thanks.

Recommended answer

For a harvesting solution like this, I would recommend a multi-stage approach. Redis is good at real-time communication. It is designed as an in-memory key/value store and inherits some very nice benefits of being a memory database, such as O(1) pushes and pops at the ends of a list. For as long as there is RAM to use on a server, Redis will not slow down pushing to the end of your lists, which is good when you need to insert items at such an extreme rate. Unfortunately, Redis can't operate with data sets larger than the amount of RAM you have (it writes to disk only for persistence, and reads that data back only when the server restarts or after a system crash), and scaling has to be done by you and your application. (A common way is to spread keys across numerous servers, which is implemented by some Redis drivers, especially those for Ruby on Rails.) Redis also has support for simple publish/subscribe messaging, which can be useful at times as well.
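The key-spreading idea mentioned above can be sketched in a few lines. This is only an illustration of client-side sharding by hashing the key, not what any particular Redis driver does; the server names are hypothetical, and `pick_server` is a helper invented for this example.

```python
import zlib


def pick_server(key, servers):
    """Map a key to one server; the same key always lands on the same server."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


# Hypothetical server addresses, for illustration only.
SERVERS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]

if __name__ == "__main__":
    for key in ("events:page_viewed", "events:link_clicked"):
        print(key, "->", pick_server(key, SERVERS))
```

Plain modulo hashing like this remaps most keys whenever the server list changes, which is why some clients use consistent hashing instead.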

In this scenario, Redis is "stage one." For each specific type of event you create a list in Redis with a unique name; for example, we have "page viewed" and "link clicked." For simplicity, we want to make sure every item in a given list has the same structure; a link-clicked event may have a user token, link name and URL, while a page-viewed event may only have the user token and URL. Your first concern is simply recording that the event happened and pushing whatever absolutely necessary data goes with it.
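A minimal sketch of this stage, assuming events are JSON-encoded and appended with RPUSH. The list names, the field sets, the added timestamp, and the `record_event` helper are assumptions made for illustration. `InMemoryLists` is a stand-in so the sketch runs without a server; a real redis-py client exposes `rpush` and `llen` with the same names.

```python
import json
import time
from collections import defaultdict, deque


class InMemoryLists:
    """Stand-in for a Redis client; it only mimics RPUSH and LLEN."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        self._lists[name].extend(values)
        return len(self._lists[name])

    def llen(self, name):
        return len(self._lists[name])


# Hypothetical list names, each with one fixed structure per event type.
EVENT_FIELDS = {
    "events:page_viewed": ("user_token", "url"),
    "events:link_clicked": ("user_token", "link_name", "url"),
}


def record_event(client, list_name, **data):
    """Append one event to the end of its list, keeping only the expected fields."""
    fields = EVENT_FIELDS[list_name]
    missing = [f for f in fields if f not in data]
    if missing:
        raise ValueError(f"missing fields for {list_name}: {missing}")
    payload = {f: data[f] for f in fields}
    payload["ts"] = time.time()  # assumption: capture the arrival time up front
    return client.rpush(list_name, json.dumps(payload))


if __name__ == "__main__":
    client = InMemoryLists()
    record_event(client, "events:page_viewed", user_token="u1", url="/home")
    record_event(client, "events:link_clicked",
                 user_token="u1", link_name="buy", url="/buy")
    print(client.llen("events:page_viewed"), client.llen("events:link_clicked"))
```

Nothing here validates or enriches the event beyond checking that the fields exist; that work is deliberately left to a later stage so the insert path stays cheap.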

Next we have some simple processing workers that take this frantically inserted information off of Redis' hands, by asking it to take an item off the end of the list and hand it over. The worker can make any adjustments, deduplication or ID lookups needed to properly file the data, and then hand it off to a more permanent storage site. Fire up as many of these workers as you need to keep Redis' memory load bearable. You could write the workers in anything you wish (Node.js, C#, Java, ...) as long as it has a Redis driver (most web languages do now) and one for your desired storage (SQL, Mongo, etc.).
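One possible shape for such a worker, as a sketch only. It assumes events were stored as JSON strings and pops with LPOP so that items pushed with RPUSH come out in arrival order; the dedup key, the `drain` helper, and the `store` callback are hypothetical. `InMemoryList` stands in for a real client, whose `lpop` returns `None` once the list is empty.

```python
import json
from collections import deque


class InMemoryList:
    """Stand-in for a Redis client holding one list; it only mimics LPOP."""

    def __init__(self, items):
        self._items = deque(items)

    def lpop(self, name):
        return self._items.popleft() if self._items else None


def drain(client, list_name, store, seen):
    """Pop events until the list is empty, skip duplicates, pass the rest to store()."""
    handled = 0
    while True:
        raw = client.lpop(list_name)
        if raw is None:  # list is empty, nothing left to process
            break
        event = json.loads(raw)
        key = (event["user_token"], event["url"])  # hypothetical dedup key
        if key in seen:
            continue
        seen.add(key)
        store(event)  # e.g. an INSERT into the permanent database
        handled += 1
    return handled


if __name__ == "__main__":
    queued = [
        json.dumps({"user_token": "u1", "url": "/home"}),
        json.dumps({"user_token": "u1", "url": "/home"}),
        json.dumps({"user_token": "u2", "url": "/home"}),
    ]
    saved = []
    count = drain(InMemoryList(queued), "events:page_viewed", saved.append, set())
    print(count, saved)
```

A long-running worker would typically block on the list (for example with BLPOP) rather than exit when it is empty, and a `seen` set kept in one process does not deduplicate across several workers.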

MongoDB is good at document storage. Unlike Redis, it is able to deal with databases larger than RAM, and it supports sharding and replication on its own. An advantage of MongoDB over SQL-based options is that you don't have to have a predetermined schema; you're free to change the way data is stored however you want at any time.

I would, however, suggest Redis or Mongo for the "stage one" phase of holding data for processing, and a traditional SQL setup (Postgres or MSSQL, perhaps) to store post-processed data. Tracking client behavior sounds like relational data to me, since you may want to ask "Show me everyone who viewed this page," "How many pages did this person view on this given day?" or "What day had the most viewers in total?" You may come up with even more complex joins or queries for analytic purposes, and mature SQL solutions can do a lot of this filtering for you; NoSQL stores (Mongo or Redis specifically) are poorly suited to joins or complex queries across varied sets of data.
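The three example questions map directly onto SQL. The sketch below uses Python's built-in `sqlite3` only so it runs anywhere; the answer suggests Postgres or MSSQL, and the `page_views` table, its columns, and the sample rows are all made up for illustration.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with a few made-up page views."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_views (user_token TEXT, url TEXT, viewed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?, ?)",
        [
            ("u1", "/home", "2024-01-01"),
            ("u1", "/pricing", "2024-01-01"),
            ("u2", "/home", "2024-01-01"),
            ("u1", "/home", "2024-01-02"),
            ("u2", "/home", "2024-01-02"),
            ("u3", "/home", "2024-01-02"),
        ],
    )
    return conn


def viewers_of(conn, url):
    """Show me everyone who viewed this page."""
    rows = conn.execute(
        "SELECT DISTINCT user_token FROM page_views "
        "WHERE url = ? ORDER BY user_token",
        (url,),
    )
    return [row[0] for row in rows]


def views_by_user_on(conn, user_token, day):
    """How many pages did this person view on this given day?"""
    return conn.execute(
        "SELECT COUNT(*) FROM page_views WHERE user_token = ? AND viewed_on = ?",
        (user_token, day),
    ).fetchone()[0]


def busiest_day(conn):
    """What day had the most distinct viewers in total?"""
    return conn.execute(
        "SELECT viewed_on, COUNT(DISTINCT user_token) AS viewers "
        "FROM page_views GROUP BY viewed_on "
        "ORDER BY viewers DESC, viewed_on LIMIT 1"
    ).fetchone()


if __name__ == "__main__":
    db = make_db()
    print(viewers_of(db, "/home"))
    print(views_by_user_on(db, "u1", "2024-01-01"))
    print(busiest_day(db))
```

With a production schema the same queries would normally be backed by indexes on `url`, `user_token` and `viewed_on`.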
