MongoDB vs. Redis vs. Cassandra for a fast-write, temporary row storage solution


Question


I'm building a system that tracks and verifies ad impressions and clicks. This means that there are a lot of insert commands (about 90/second average, peaking at 250) and some read operations, but the focus is on performance and making it blazing-fast.


The system is currently on MongoDB, but I've been introduced to Cassandra and Redis since then. Would it be a good idea to go to one of these two solutions, rather than stay on MongoDB? Why or why not?

Thanks.

Recommended answer


For a harvesting solution like this, I would recommend a multi-stage approach. Redis is good at real-time communication. It is designed as an in-memory key/value store and inherits some very nice benefits of being a memory database, notably O(1) pushes and pops at either end of a list. For as long as there is RAM to use on a server, Redis will not slow down pushing to the end of your lists, which is good when you need to insert items at such an extreme rate. Unfortunately, Redis can't operate with data sets larger than the amount of RAM you have (it writes to disk only for persistence, and reads from disk only when restarting the server, for example after a system crash), and scaling has to be done by you and your application. (A common way is to spread keys across numerous servers, which is implemented by some Redis drivers, especially those for Ruby on Rails.) Redis also has support for simple publish/subscribe messaging, which can be useful at times as well.
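The idea of spreading keys across several servers can be sketched roughly as follows. This is only an illustration of client-side sharding by key hash, not what any particular Redis driver does; the node addresses and the `node_for_key` helper are hypothetical names made up for the example.

```python
import zlib

# Hypothetical node addresses; replace with your own Redis hosts.
NODES = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def node_for_key(key: str, nodes: list[str] = NODES) -> str:
    """Pick a node by hashing the key, so the same key always maps to the same node."""
    digest = zlib.crc32(key.encode("utf-8"))
    return nodes[digest % len(nodes)]


if __name__ == "__main__":
    for key in ("events:page_viewed", "events:link_clicked"):
        print(key, "->", node_for_key(key))
```

Plain modulo hashing like this remaps most keys whenever the number of nodes changes, which is why drivers that shard for you often use consistent hashing instead.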


In this scenario, Redis is "stage one." For each specific type of event you create a list in Redis with a unique name; for example, we have "page viewed" and "link clicked." For simplicity, we want to make sure the data in each list has the same structure; link clicked may have a user token, link name and URL, while page viewed may only have the user token and URL. Your first concern is just recording the fact that the event happened, along with whatever absolutely necessary data you need.
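A minimal sketch of this first stage might look like the code below. The key names (`events:page_viewed`, `events:link_clicked`), the `record_event` helper, and the added `ts` timestamp field are assumptions for illustration. `InMemoryLists` is a stand-in so the example runs without a server; with the redis-py client you would pass a `redis.Redis` instance instead, since it offers an `rpush` method of the same name.

```python
import json
import time
from collections import defaultdict, deque

# Fields each event type must carry (taken from the examples in the answer).
REQUIRED_FIELDS = {
    "page_viewed": {"user_token", "url"},
    "link_clicked": {"user_token", "link_name", "url"},
}


class InMemoryLists:
    """Stand-in for a Redis client; mimics only RPUSH and LLEN."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        self._lists[name].extend(values)
        return len(self._lists[name])

    def llen(self, name):
        return len(self._lists[name])


def record_event(client, event_type, **fields):
    """Append one event to the list for its type, keeping a uniform structure."""
    required = REQUIRED_FIELDS[event_type]
    missing = required.difference(fields)
    if missing:
        raise ValueError(f"{event_type} is missing fields: {sorted(missing)}")
    # Keep only the required fields so every item in a list has the same shape.
    payload = {name: fields[name] for name in sorted(required)}
    payload["ts"] = time.time()  # assumption: record when the event arrived
    return client.rpush(f"events:{event_type}", json.dumps(payload))


if __name__ == "__main__":
    client = InMemoryLists()
    record_event(client, "page_viewed", user_token="u1", url="/home")
    record_event(client, "link_clicked", user_token="u1", link_name="signup", url="/signup")
    print(client.llen("events:page_viewed"), client.llen("events:link_clicked"))
```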


Next we have some simple processing workers that take this frantically inserted information off Redis' hands, by asking it to take an item off the end of the list and hand it over. The worker can make any adjustments, deduplication, or ID lookups needed to properly file the data and hand it off to a more permanent storage site. Fire up as many of these workers as you need to keep Redis' memory load bearable. You could write the workers in anything you wish (Node.js, C#, Java, ...) as long as it has a Redis driver (most web languages do now) and one for your desired storage (SQL, Mongo, etc.).
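A worker along these lines could be sketched as below. The `drain` function, the deduplication key, and the `store` callback are hypothetical; `InMemoryLists` again stands in for a real client so the example runs on its own. The sketch pops from the head with LPOP so that events pushed with RPUSH come out in arrival order; popping from the tail with RPOP would match "off the end" more literally. A long-running worker would more likely block on BLPOP than return when the list is empty.

```python
import json
from collections import defaultdict, deque


class InMemoryLists:
    """Stand-in for a Redis client; mimics only RPUSH and LPOP."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def rpush(self, name, *values):
        self._lists[name].extend(values)
        return len(self._lists[name])

    def lpop(self, name):
        items = self._lists[name]
        return items.popleft() if items else None


def drain(client, list_name, store, seen):
    """Pop events until the list is empty, skip duplicates, pass the rest to `store`."""
    handled = 0
    while (raw := client.lpop(list_name)) is not None:
        event = json.loads(raw)
        # Assumption: two events with the same user, URL and timestamp are duplicates.
        key = (event["user_token"], event["url"], event.get("ts"))
        if key in seen:
            continue
        seen.add(key)
        store(event)  # e.g. an INSERT into the permanent database
        handled += 1
    return handled


if __name__ == "__main__":
    client = InMemoryLists()
    event = json.dumps({"user_token": "u1", "url": "/home", "ts": 1})
    client.rpush("events:page_viewed", event, event)  # same event pushed twice
    stored = []
    print(drain(client, "events:page_viewed", stored.append, set()), "event(s) stored")
```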


MongoDB is good at document storage. Unlike Redis, it is able to deal with databases larger than RAM, and it supports sharding and replication on its own. An advantage of MongoDB over SQL-based options is that you don't have to have a predetermined schema; you're free to change the way data is stored however you want at any time.


I would, however, suggest Redis or Mongo for the "step one" phase of holding data for processing, and a traditional SQL setup (Postgres or MSSQL, perhaps) to store post-processed data. Tracking client behavior sounds like relational data to me, since you may want to ask "Show me everyone who views this page," "How many pages did this person view on this given day?" or "What day had the most viewers in total?" There may be even more complex joins or queries for analytic purposes that you come up with, and mature SQL solutions can do a lot of this filtering for you; NoSQL stores (Mongo or Redis specifically) are not built for joins or complex queries across varied sets of data.
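The three example questions map onto short SQL queries. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for Postgres or MSSQL; the `page_views` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_views ("
    "user_token TEXT NOT NULL, url TEXT NOT NULL, viewed_on TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("u1", "/home", "2024-01-01"),
        ("u1", "/pricing", "2024-01-01"),
        ("u2", "/home", "2024-01-01"),
        ("u1", "/home", "2024-01-02"),
    ],
)

# "Show me everyone who views this page"
viewers = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT user_token FROM page_views WHERE url = ? ORDER BY user_token",
        ("/home",),
    )
]

# "How many pages did this person view on this given day?"
(views_by_u1,) = conn.execute(
    "SELECT COUNT(*) FROM page_views WHERE user_token = ? AND viewed_on = ?",
    ("u1", "2024-01-01"),
).fetchone()

# "What day had the most viewers in total?"
busiest_day, viewer_count = conn.execute(
    "SELECT viewed_on, COUNT(DISTINCT user_token) AS viewers "
    "FROM page_views GROUP BY viewed_on "
    "ORDER BY viewers DESC, viewed_on LIMIT 1"
).fetchone()

print(viewers, views_by_u1, busiest_day, viewer_count)
```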
