Which clustered NoSQL DB for a Message Storing purpose?

Question

Yet another question about which NoSQL database to choose. However, I haven't yet found anyone asking about this particular use case: message storage.

I have built an Erlang chat server, and I'm already using MySQL to store friend lists and other information that requires JOINs.

I would like to store messages (those a user has not received because they were offline) and retrieve them later.

I have made a pre-selection of NoSQL databases. I can't use something like MongoDB because of its RAM-oriented paradigm and because it doesn't cluster like the others. I have narrowed my list down to 3 choices, I guess:

  • HBase
  • Riak
  • Cassandra

I know that their models are quite different, one using key/value, the others using SuperColumns and so on.

Until now I have had a preference for Riak because of its stable client library for Erlang.

I know that I can use Cassandra with Thrift, but it doesn't seem very stable with Erlang (I haven't heard good feedback about it).

I don't really know anything about HBase right now; I just know that it exists and is based on Dynamo, like Cassandra and Riak.

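So here is what I need to do:

  • Store from 1 to X messages per registered user.
  • Get the number of stored messages per user.
  • Retrieve all messages from a user at once.
  • Delete all messages from a user at once.
  • Delete all messages that are older than X months.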

Right now I'm really new to these NoSQL databases; I have always been a MySQL aficionado. That is why I'm asking this question: as a newbie, could someone with more experience than me help me choose which one is better and would let me do everything I want without too much hassle?

Thanks!

Recommended answer

I can't speak for Cassandra or HBase, but let me address the Riak part.

Yes, Riak would be appropriate for your scenario (and I've seen several companies and social networks use it for a similar purpose).

To implement this, you would need the plain Riak Key/Value operations, plus some sort of indexing engine. Your options are (in rough order of preference):

  1. CRDT Sets. If your 1-N collection is reasonably small (say, fewer than 50 messages per user), you can store the keys of the child collection in a CRDT Set Data Type.

  2. Riak Search. If your collection is large, and especially if you need to search your objects on arbitrary fields, you can use Riak Search. It spins up Apache Solr in the background and indexes your objects according to a schema you define. It has pretty awesome searching, aggregation and statistics, geospatial capabilities, etc.

  3. Secondary Indexes. You can run Riak on top of an eLevelDB storage back end and enable the Secondary Index (2i) functionality.

Run a few performance tests to pick the fastest approach.
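To make the first option a little more concrete, here is a minimal Python sketch of an observed-remove set, one common way to build a set CRDT. It is a simplified illustration written for this answer, not Riak's actual implementation or client API; the class name `ORSet` and its methods are hypothetical. It shows why a per-user set of message keys can be updated on different replicas and still converge after merging.

```python
import uuid


class ORSet:
    """Simplified observed-remove set (illustrative only, not Riak's code)."""

    def __init__(self):
        self.adds = {}        # element -> set of unique tags, one per add
        self.removed = set()  # tags that have been tombstoned by a remove

    def add(self, element):
        # Every add gets a fresh tag, so a later re-add survives older removes.
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        # Only the tags this replica has already observed are tombstoned.
        self.removed |= self.adds.get(element, set())

    def merge(self, other):
        # Merging is a union of tags and tombstones, so order does not matter.
        for element, tags in other.adds.items():
            self.adds.setdefault(element, set()).update(tags)
        self.removed |= other.removed

    def value(self):
        # An element is present if at least one of its tags is not tombstoned.
        return {e for e, tags in self.adds.items() if tags - self.removed}


# Two replicas of one user's "pending message keys" set.
replica_a, replica_b = ORSet(), ORSet()
replica_a.add("msg-1")
replica_b.merge(replica_a)
replica_a.remove("msg-1")   # delivered and removed on replica A
replica_b.add("msg-1")      # concurrently re-added on replica B
replica_a.add("msg-2")
replica_a.merge(replica_b)
replica_b.merge(replica_a)
print(len(replica_a.value()), replica_a.value() == replica_b.value())
```

The message count for a user is then just the size of the set's value, which is why this option works well while the per-user collection stays small.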

As for the schema, I would recommend using two buckets (for the setup you describe): a User bucket and a Message bucket.
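As a rough picture of that layout, the Python sketch below models one object in each bucket. The `StoredObject` class, the bucket names `users` and `messages`, and the field names are assumptions made for illustration, not a Riak client API. The index names follow Riak's 2i convention of ending in `_bin` for string values and `_int` for integer values.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical in-memory picture of one object in a bucket."""
    bucket: str
    key: str
    value: str                                   # JSON-encoded payload
    indexes: dict = field(default_factory=dict)  # index name -> value


def make_user(user_key, nickname):
    # User bucket: one object per registered user, no index entries needed.
    return StoredObject("users", user_key, json.dumps({"nickname": nickname}))


def make_message(message_key, user_key, body, sent_at_epoch):
    # Message bucket: one object per undelivered message. The recipient and
    # the send time are attached as index entries so they can be queried.
    return StoredObject(
        bucket="messages",
        key=message_key,
        value=json.dumps({"body": body}),
        indexes={"user_key_bin": user_key, "sent_at_int": sent_at_epoch},
    )


user = make_user("user-42", "bob")
msg = make_message("msg-0001", "user-42", "hello", 1700000000)
print(msg.bucket, msg.key, msg.indexes)
```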

Index the Message bucket, either by associating a Search index with it or by storing a user_key via 2i. This lets you do all of the required operations (and the message log does not have to fit into memory):

  • Store from 1 to X messages per registered user - Once you create a User object and get a user key, storing an arbitrary number of messages per user is easy. They are plain writes to the Message bucket, with each message storing the appropriate user_key as a secondary index.
  • Get the number of stored messages per user - No problem. Get the list of message keys belonging to a user (via a Search query, by retrieving the Set object where you're keeping the keys, or via a 2i query on user_key). This lets you get the count on the client side.
  • Retrieve all messages from a user at once - See the previous item. Get the list of keys of all messages belonging to the user (via Search, Sets or 2i), and then fetch the actual messages by multi-fetching the values for those keys (all the official Riak clients have a client-side multiFetch capability).
  • Delete all messages from a user at once - Very similar. Get the list of message keys for the user, then issue Deletes for them on the client side.
  • Delete all messages that are older than X months - You can add an index on Date. Then retrieve all message keys older than X months (via Search or 2i), and issue client-side Deletes for them.
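The five operations above can be sketched in Python against an in-memory stand-in for the Message bucket. Everything here is hypothetical and only mimics the access pattern: `InMemoryMessageBucket` and its methods are not a Riak client API, `keys_for_user` stands in for a 2i, Search or Set lookup, and "X months" is approximated as a number of days.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class InMemoryMessageBucket:
    """Hypothetical stand-in for a Message bucket indexed by user and date."""

    def __init__(self):
        self._objects = {}                # message key -> message dict
        self._by_user = defaultdict(set)  # user_key -> message keys (index)
        self._next_id = 0

    def store(self, user_key, body, sent_at):
        # A plain write; the user_key is recorded in the index as well.
        self._next_id += 1
        key = f"msg-{self._next_id}"
        self._objects[key] = {"user_key": user_key, "body": body, "sent_at": sent_at}
        self._by_user[user_key].add(key)
        return key

    def keys_for_user(self, user_key):
        # Index lookup: returns keys only, not the messages themselves.
        return list(self._by_user.get(user_key, ()))

    def keys_older_than(self, cutoff):
        # Range lookup on the date field (a scan here, an index in practice).
        return [k for k, m in self._objects.items() if m["sent_at"] < cutoff]

    def multi_fetch(self, keys):
        return [self._objects[k] for k in keys if k in self._objects]

    def delete(self, key):
        message = self._objects.pop(key, None)
        if message is not None:
            self._by_user[message["user_key"]].discard(key)


def count_messages(bucket, user_key):
    return len(bucket.keys_for_user(user_key))         # counted client-side


def fetch_all(bucket, user_key):
    return bucket.multi_fetch(bucket.keys_for_user(user_key))


def delete_all(bucket, user_key):
    for key in bucket.keys_for_user(user_key):          # client-side deletes
        bucket.delete(key)


def purge_older_than(bucket, cutoff):
    for key in bucket.keys_older_than(cutoff):
        bucket.delete(key)


now = datetime(2024, 1, 1)
bucket = InMemoryMessageBucket()
bucket.store("alice", "old news", now - timedelta(days=200))
bucket.store("alice", "see you soon", now - timedelta(days=2))
bucket.store("bob", "hi", now - timedelta(days=1))
purge_older_than(bucket, now - timedelta(days=90))      # roughly 3 months
print(count_messages(bucket, "alice"), [m["body"] for m in fetch_all(bucket, "alice")])
```

Note that counting and deleting both start from a list of keys, so whichever indexing option is fastest at producing that list is the one worth picking.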
