NoSQL (MongoDB) vs Lucene (or Solr) as your database


Question

With the NoSQL movement growing around document-based databases, I've looked at MongoDB lately. I noticed a striking similarity in how it treats items as "documents", just as Lucene does (and, by extension, users of Solr).

So, the question is: why would you want to use NoSQL (MongoDB, Cassandra, CouchDB, etc.) over Lucene (or Solr) as your "database"?

What I am (and I am sure others are) looking for in an answer is a deep-dive comparison of the two. Let's skip relational databases altogether, as they serve a different purpose.

Lucene offers some serious advantages, such as powerful searching and a weighting system, not to mention facets in Solr (and Solr is being integrated into Lucene soon, yay!). You can store an ID in each Lucene document and retrieve documents by that ID, much as you would in MongoDB. Mix in Solr, and you get a web-service-based, load-balanced solution.
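To make the comparison concrete, here is a minimal sketch of the idea that a search index can hand back documents by ID and also answer term and facet queries. It is a toy written in plain Python, not Lucene's or Solr's API; the class `TinyIndex`, its method names, and the sample documents are all hypothetical.

```python
from collections import Counter, defaultdict


class TinyIndex:
    """Toy in-memory index: stored documents plus an inverted index."""

    def __init__(self):
        self.docs = {}                    # doc_id -> stored fields
        self.postings = defaultdict(set)  # term -> IDs of docs containing it

    def add(self, doc_id, fields):
        # Store the document under its ID and index every whitespace token.
        # (A toy: re-adding an existing ID does not clean up old postings.)
        self.docs[doc_id] = fields
        for value in fields.values():
            for term in str(value).lower().split():
                self.postings[term].add(doc_id)

    def get(self, doc_id):
        # Lookup by key, the access pattern a document database offers.
        return self.docs.get(doc_id)

    def search(self, term):
        # Lookup by term, which a plain key-value store cannot answer.
        return sorted(self.postings.get(term.lower(), set()))

    def facet(self, term, field):
        # Count the values of `field` across the documents matching `term`.
        hits = self.search(term)
        return Counter(self.docs[i][field] for i in hits if field in self.docs[i])


index = TinyIndex()
index.add("a1", {"title": "Solr faceting guide", "category": "search"})
index.add("a2", {"title": "MongoDB schema design", "category": "database"})
index.add("a3", {"title": "Lucene scoring guide", "category": "search"})

print(index.get("a2"))                   # fetch by ID
print(index.search("guide"))             # fetch by term
print(index.facet("guide", "category"))  # facet counts for the matches
```

The same structure serves both access patterns, which is the similarity the question is pointing at. Relevance scoring, analyzers, and persistence are deliberately left out.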

You could even bring out-of-process cache providers such as Velocity or MemCached into the comparison when discussing data storage and scalability similar to MongoDB's.

The restrictions around MongoDB remind me of using MemCached, but I can use Microsoft's Velocity and get more grouping and list-collection power than MongoDB offers (I think). You can't get any faster or more scalable than caching data in memory. Even Lucene has an in-memory provider.

MongoDB (and the others) do have some advantages, such as the ease of use of their APIs. New up a document, create an ID, and store it. Done. Nice and easy.
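The "new up a document, create an ID, store it" workflow can be sketched in a few lines. This is an in-memory stand-in written for illustration, not MongoDB's driver API; `TinyDocumentStore` and its methods are hypothetical, and only the `_id` field name follows MongoDB's convention.

```python
import uuid


class TinyDocumentStore:
    """Toy document store: schemaless dicts keyed by a generated ID."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc):
        # Reuse a caller-supplied ID if present, otherwise generate one.
        doc_id = doc.get("_id") or uuid.uuid4().hex
        self._docs[doc_id] = {**doc, "_id": doc_id}
        return doc_id

    def find_by_id(self, doc_id):
        # Returns None when no document has that ID.
        return self._docs.get(doc_id)


store = TinyDocumentStore()
new_id = store.insert({"title": "Hello", "tags": ["nosql"]})
print(store.find_by_id(new_id))
```

Nothing here requires a schema or an index definition up front, which is the ease of use being described. It also shows the limitation: the only way in is by ID.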

Recommended answer

This is a great question, something I have pondered quite a bit. I will summarize my lessons learned:

  1. You can easily use Lucene/Solr in lieu of MongoDB in pretty much all situations, but not vice versa. Grant Ingersoll's post sums it up.

  2. MongoDB etc. seem to serve cases where there is no requirement for searching and/or faceting. They appear to be a simpler and arguably easier transition for programmers detoxing from the RDBMS world. Unless you are already used to them, Lucene and Solr have a steeper learning curve.

  3. There aren't many examples of using Lucene/Solr as a datastore, but the Guardian has made some headway and summarized it in an excellent slide deck. They too are non-committal about jumping fully onto the Solr bandwagon, and are "investigating" combining Solr with CouchDB.

  4. Finally, I will offer our own experience, though unfortunately I cannot reveal much about the business case. We work at the scale of several TB of data in a near-real-time application. After investigating various combinations, we decided to stick with Solr. No regrets thus far (six months and counting), and we see no reason to switch to something else.

Summary: if you do not have a search requirement, Mongo offers a simple and powerful approach. However, if search is key to your offering, you are likely better off sticking to one technology (Solr/Lucene) and optimizing the heck out of it, which also means fewer moving parts.

My 2 cents, hope that helped.
