What scalability problems have you encountered using a NoSQL data store?


Question

NoSQL refers to non-relational data stores that break with the history of relational databases and ACID guarantees. Popular open source NoSQL data stores include:


  • Cassandra (tabular, written in Java, used by Cisco, WebEx, Digg, Facebook, IBM, Mahalo, Rackspace, Reddit and Twitter)
  • CouchDB (document, written in Erlang, used by BBC and Engine Yard)
  • Dynomite (key-value, written in Erlang, used by Powerset)
  • HBase (key-value, written in Java, used by Bing)
  • Hypertable (tabular, written in C++, used by Baidu)
  • Kai (key-value, written in Erlang)
  • MemcacheDB (key-value, written in C, used by Reddit)
  • MongoDB (document, written in C++, used by Electronic Arts, Github, NY Times and Sourceforge)
  • Neo4j (graph, written in Java, used by some Swedish universities)
  • Project Voldemort (key-value, written in Java, used by LinkedIn)
  • Redis (key-value, written in C, used by Craigslist, Engine Yard and Github)
  • Riak (key-value, written in Erlang, used by Comcast and Mochi Media)
  • Ringo (key-value, written in Erlang, used by Nokia)
  • Scalaris (key-value, written in Erlang, used by OnScale)
  • Terrastore (document, written in Java)
  • ThruDB (document, written in C++, used by JunkDepot.com)
  • Tokyo Cabinet/Tokyo Tyrant (key-value, written in C, used by Mixi.jp (Japanese social networking site))

I'd like to know about specific problems you - the SO reader - have solved using data stores and what NoSQL data store you used.

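Questions:

  • What scalability problems have you used NoSQL data stores to solve?
  • What NoSQL data store did you use?
  • What database did you use before switching to a NoSQL data store?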

I'm looking for first-hand experiences, so please do not answer unless you have that.

Recommended answer

I switched a small subproject from MySQL to CouchDB so that it could handle the load. The result was amazing.

About 2 years ago, we released self-written software on http://www.ubuntuusers.de/ (which is probably the biggest German Linux community website). The site is written in Python, and we added a WSGI middleware that caught all exceptions and sent them to another small MySQL-powered website. This small website used a hash to tell different bugs apart and stored the number of occurrences and the time of the last occurrence for each.
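The middleware described above might look roughly like the following. This is only a minimal sketch, not the code that ran on the site: the names `fingerprint` and `TracebackLogger` are hypothetical, the choice of what goes into the hash is an assumption, and a plain dict stands in for the remote logging site.

```python
import hashlib
import sys
import traceback
from datetime import datetime, timezone


def fingerprint(exc_type, exc_tb):
    """Hash the exception type plus the file and function of each frame.

    Assumption: line numbers and the exception message are left out so
    that the same bug reported with different messages gets one hash.
    """
    frames = traceback.extract_tb(exc_tb)
    parts = [exc_type.__name__] + [f"{f.filename}:{f.name}" for f in frames]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


class TracebackLogger:
    """WSGI middleware that records unhandled exceptions, then re-raises.

    Only exceptions raised while the wrapped app is being called are seen;
    errors raised later, while the response body is iterated, are not.
    """

    def __init__(self, app, store):
        self.app = app
        self.store = store  # hypothetical stand-in: any dict-like mapping

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except Exception:
            exc_type, exc, tb = sys.exc_info()
            key = fingerprint(exc_type, tb)
            entry = self.store.setdefault(
                key, {"count": 0, "last_seen": None, "traceback": ""}
            )
            entry["count"] += 1
            entry["last_seen"] = datetime.now(timezone.utc).isoformat()
            entry["traceback"] = "".join(
                traceback.format_exception(exc_type, exc, tb)
            )
            raise
```

Wrapping an application is then `app = TracebackLogger(app, store)`. In a real setup the dict would be replaced by a request to the logging site, which is exactly the part that became the bottleneck in the story below.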

Unfortunately, shortly after the release, the traceback-logger website stopped responding. We had some locking issues with the production database of our main site, which threw exceptions on nearly every request, as well as several other bugs that we had not found during the testing stage. The server cluster of our main site called the traceback-logger submit page several thousand times per second, which was far too much for the small server hosting the traceback logger (it was an old server that was only used for development purposes).

At that time CouchDB was rather popular, so I decided to try it out and write a small traceback logger with it. The new logger consisted of a single Python file, which provided a bug list with sorting and filter options and a submit page, with a CouchDB process running in the background. The new software responded extremely quickly to all requests, and we were able to view the massive number of automatic bug reports.

One interesting thing is that the previous solution ran on an old dedicated server, whereas the new CouchDB-based site ran only on a shared Xen instance with very limited resources. And I haven't even used the strength of key-value stores to scale horizontally. The ability of CouchDB / Erlang OTP to handle concurrent requests without locking anything was already enough to serve our needs.

Now, the quickly written CouchDB traceback logger is still running and is a helpful way to explore bugs on the main website. However, about once a month the database becomes too big and the CouchDB process gets killed. Then CouchDB's compact-db command reduces the size from several GB to a few KB, and the database is up and running again (maybe I should consider adding a cronjob there... 0o).
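A cronjob for this could be as small as the sketch below. It assumes a CouchDB version that exposes compaction through the `POST /{db}/_compact` HTTP endpoint; the server address, the database name `tracebacks`, and the helper names are hypothetical, and the admin credentials that newer CouchDB releases require for this call are left out.

```python
import json
import urllib.request
from urllib.parse import quote


def build_compact_request(base_url, db_name):
    """Build the POST request that asks CouchDB to compact one database."""
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}/_compact"
    return urllib.request.Request(
        url,
        data=b"",
        # CouchDB expects a JSON content type on this endpoint.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compact(base_url, db_name, timeout=10):
    """Send the compaction request and return the decoded JSON reply."""
    request = build_compact_request(base_url, db_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A small script that calls `compact("http://127.0.0.1:5984", "tracebacks")` could then be scheduled weekly, so the database is compacted before it grows large enough for the process to be killed.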

In summary, CouchDB was surely the best choice (or at least a better choice than MySQL) for this subproject, and it does its job well.
