MongoDB vs. Cassandra

Question

I am evaluating what might be the best migration option.

Currently, I am on a sharded MySQL setup (horizontal partitioning), with most of my data stored in JSON blobs. I do not have any complex SQL queries (I migrated away from them when I partitioned my database).

Right now, it seems like both MongoDB and Cassandra would be likely options. My situation:

  • Lots of reads in every query, fewer regular writes
  • Not worried about "massive" scalability
  • More concerned about simple setup, maintenance and code
  • Minimize hardware/server cost

Answer

Lots of reads in every query, fewer regular writes

Both databases perform well on reads where the hot data set fits in memory. Both also emphasize join-less data models (and encourage denormalization instead), and both provide indexes on documents or rows, although MongoDB's indexes are currently more flexible.

Cassandra's storage engine provides constant-time writes no matter how big your data set grows. Writes are more problematic in MongoDB, partly because of the b-tree based storage engine, but more because of the multi-granularity locking it does.
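To illustrate why a log-structured design keeps write cost independent of data set size, here is a minimal Python sketch. It is a toy, not Cassandra's actual implementation: the class name `ToyLSMStore`, the `memtable_limit` parameter, and the in-memory "runs" are all assumptions made for illustration, and real engines also keep a commit log and compact old runs.

```python
import bisect


class ToyLSMStore:
    """Toy log-structured store (illustrative only, not Cassandra's code)."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}   # most recent writes, held in memory
        self.sstables = []   # immutable sorted runs, oldest first

    def put(self, key, value):
        # A write only touches the memtable, so its cost does not depend
        # on how much data has already been flushed.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the memtable out as one sorted, immutable run.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Reads check the newest data first, then older runs.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


store = ToyLSMStore(memtable_limit=2)
store.put("a", 1)
store.put("b", 2)   # reaches the limit and triggers a flush
store.put("a", 10)  # newer value shadows the flushed one
```

The trade-off is visible in `get`: a read may have to consult several runs, which is why the read path in such designs relies on the hot data being in memory. A B-tree, by contrast, updates pages in place, so a write may need to locate and modify existing on-disk structures.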

For analytics, MongoDB provides a custom map/reduce implementation; Cassandra provides native Hadoop support, including for Hive (a SQL data warehouse built on Hadoop map/reduce) and Pig (a Hadoop-specific analysis language that many think is a better fit for map/reduce workloads than SQL). Cassandra also supports use of Spark.

Not worried about "massive" scalability

If you're looking at a single server, MongoDB is probably a better fit. For those more concerned about scaling, Cassandra's no-single-point-of-failure architecture will be easier to set up and more reliable. (MongoDB's global write lock tends to become more painful, too.) Cassandra also gives a lot more control over how your replication works, including support for multiple data centers.

More concerned about simple setup, maintenance and code

Both are trivial to set up, with reasonable out-of-the-box defaults for a single server. Cassandra is simpler to set up in a multi-server configuration since there are no special-role nodes to worry about.

If you're presently using JSON blobs, MongoDB is an insanely good match for your use case, given that it uses BSON to store the data. You'll be able to have richer and more queryable data than you would in your present database. This would be the most significant win for Mongo.
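As a rough illustration of what "more queryable" means, the sketch below decodes a few JSON blobs and filters them on a nested field using a dotted path, the same filter shape MongoDB accepts for embedded fields. This is plain Python that only imitates equality matching; it does not use a MongoDB driver, and the sample records and the helper names `get_path` and `find` are hypothetical.

```python
import json


def get_path(doc, dotted_path):
    """Follow a dotted path such as 'address.city' into nested dicts."""
    current = doc
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # treat a missing field as None in this toy
        current = current[part]
    return current


def find(docs, query):
    """Return documents whose fields equal every value in `query`."""
    return [
        d for d in docs
        if all(get_path(d, key) == value for key, value in query.items())
    ]


# Hypothetical rows as they might sit in a MySQL column: opaque JSON text.
blobs = [
    '{"user": "ann", "address": {"city": "Oslo"}, "plan": "pro"}',
    '{"user": "bob", "address": {"city": "Lima"}, "plan": "free"}',
    '{"user": "cy", "address": {"city": "Oslo"}, "plan": "free"}',
]
docs = [json.loads(b) for b in blobs]

# The same filter shape could be passed to a MongoDB find() call.
oslo_free = find(docs, {"address.city": "Oslo", "plan": "free"})
```

With blobs in MySQL, this kind of filtering has to happen in application code after fetching and decoding each row. With documents stored as BSON, the database evaluates the filter itself and can back it with an index on the nested field.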
