What scalability problems have you encountered using a NoSQL data store?


Question

NoSQL refers to non-relational data stores that break with the tradition of relational databases and their ACID guarantees. Popular open source NoSQL data stores include:

  • Cassandra (tabular, written in Java, used by Cisco, WebEx, Digg, Facebook, IBM, Mahalo, Rackspace, Reddit and Twitter)
  • CouchDB (document, written in Erlang, used by BBC and Engine Yard)
  • Dynomite (key-value, written in Erlang, used by Powerset)
  • HBase (key-value, written in Java, used by Bing)
  • Hypertable (tabular, written in C++, used by Baidu)
  • Kai (key-value, written in Erlang)
  • MemcacheDB (key-value, written in C, used by Reddit)
  • MongoDB (document, written in C++, used by Electronic Arts, GitHub, NY Times and SourceForge)
  • Neo4j (graph, written in Java, used by some Swedish universities)
  • Project Voldemort (key-value, written in Java, used by LinkedIn)
  • Redis (key-value, written in C, used by Craigslist, Engine Yard and GitHub)
  • Riak (key-value, written in Erlang, used by Comcast and Mochi Media)
  • Ringo (key-value, written in Erlang, used by Nokia)
  • Scalaris (key-value, written in Erlang, used by OnScale)
  • Terrastore (document, written in Java)
  • ThruDB (document, written in C++, used by JunkDepot.com)
  • Tokyo Cabinet/Tokyo Tyrant (key-value, written in C, used by Mixi.jp (Japanese social networking site))

I'd like to know about specific problems you, the SO reader, have solved using data stores, and which NoSQL data store you used.

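Questions:

  • What scalability problems have you used NoSQL data stores to solve?
  • What NoSQL data store did you use?
  • What database did you use prior to switching to a NoSQL data store?

I'm looking for first-hand experiences, so please do not answer unless you have that.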

Recommended answer

I switched a small subproject from MySQL to CouchDB so that it could handle the load. The result was amazing.

About two years ago, we released self-written software on http://www.ubuntuusers.de/ (probably the biggest German Linux community website). The site is written in Python, and we added a WSGI middleware that caught all exceptions and sent them to another small, MySQL-powered website. This small website used a hash to tell distinct bugs apart and stored the number of occurrences and the time of the last occurrence for each.
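The middleware and the hash-based grouping described above could look roughly like the following. This is only a minimal sketch, not the code that ran on the site: the names `traceback_fingerprint`, `BugStore` and `ExceptionLoggingMiddleware` are hypothetical, the in-memory dictionary stands in for the logger's database, and the choice of what goes into the hash (exception type plus file and function of each frame) is an assumption.

```python
import hashlib
import sys
import traceback
from datetime import datetime, timezone


def traceback_fingerprint(exc_type, tb):
    """Hash the exception type plus the file and function of each frame.

    Assumption: line numbers and the exception message are left out so
    that the same bug raised with different data maps to one fingerprint.
    """
    frames = traceback.extract_tb(tb)
    parts = [exc_type.__name__] + [f"{f.filename}:{f.name}" for f in frames]
    return hashlib.sha1("\n".join(parts).encode("utf-8")).hexdigest()


class BugStore:
    """Hypothetical in-memory stand-in for the logger's database."""

    def __init__(self):
        self.bugs = {}

    def record(self, fingerprint, text):
        # One entry per distinct bug: occurrence count and last occurrence.
        entry = self.bugs.setdefault(
            fingerprint, {"count": 0, "last_seen": None, "traceback": text}
        )
        entry["count"] += 1
        entry["last_seen"] = datetime.now(timezone.utc)
        return entry


class ExceptionLoggingMiddleware:
    """WSGI middleware that reports exceptions raised by the wrapped app.

    Only exceptions raised while the app callable runs are caught here;
    errors raised later, while the response body is iterated, are not.
    """

    def __init__(self, app, report):
        self.app = app
        self.report = report  # callable(fingerprint, traceback_text)

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Exception:
            exc_type, exc_value, tb = sys.exc_info()
            fingerprint = traceback_fingerprint(exc_type, tb)
            text = "".join(traceback.format_exception(exc_type, exc_value, tb))
            self.report(fingerprint, text)
            raise  # let the server's normal error handling continue
```

Wrapping an application is then a matter of `app = ExceptionLoggingMiddleware(app, store.record)`. In a setup like the one described here, `report` would instead send the data to the separate logger site.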

Unfortunately, shortly after the release, the traceback-logger website stopped responding. We had some locking issues with the production database of our main site, which threw exceptions on nearly every request, as well as several other bugs that we had not found during testing. The server cluster of our main site called the traceback-logger submit page several thousand times per second. That was far too much for the small server hosting the traceback logger (an old server that was only used for development purposes).

At that time CouchDB was rather popular, so I decided to try it out and write a small traceback logger with it. The new logger consisted of a single Python file, which provided a bug list with sorting and filter options and a submit page, with a CouchDB process running in the background. The new software responded extremely quickly to all requests, and we were able to view the massive number of automatic bug reports.

Interestingly, the previous solution ran on an old dedicated server, whereas the new CouchDB-based site ran only on a shared Xen instance with very limited resources. And I haven't even used the strength of key-value stores to scale horizontally. The ability of CouchDB / Erlang OTP to handle concurrent requests without locking anything was already enough to serve our needs.

The quickly written CouchDB traceback logger is still running today and is a helpful way to explore bugs on the main website. About once a month, though, the database becomes too big and the CouchDB process gets killed. CouchDB's compact-db command then reduces the size from several GB to a few KB, and the database is up and running again (maybe I should consider adding a cronjob there... 0o).
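For the cronjob idea, a small script along these lines might be enough. It is a sketch under assumptions: it uses CouchDB's HTTP compaction endpoint (`POST /{db}/_compact` with a JSON content type), while the server URL, the database name `tracebacks` and the function names are hypothetical. Authentication is left out, although the endpoint normally requires admin rights.

```python
import json
import urllib.parse
import urllib.request


def build_compact_request(base_url, db_name):
    """Build the POST request that asks CouchDB to compact one database."""
    url = f"{base_url.rstrip('/')}/{urllib.parse.quote(db_name, safe='')}/_compact"
    return urllib.request.Request(
        url,
        data=b"",
        # CouchDB rejects compaction requests without a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compact(base_url, db_name, timeout=30):
    """Send the compaction request and return CouchDB's JSON reply."""
    request = build_compact_request(base_url, db_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A cron entry could then call `compact("http://127.0.0.1:5984", "tracebacks")` once a night. Compaction runs in the background on the server, so the call returns before the file has actually shrunk.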

In summary, CouchDB was surely the best choice (or at least a better choice than MySQL) for this subproject, and it does its job well.
