How does Lucene work


Question

I would like to find out how Lucene search works so fast. I can't find any useful docs on the web. If you have anything to read (short of the Lucene source code), let me know.

In my case, a text search query using MySQL 5 text search with an index takes about 18 minutes. A Lucene search for the same query takes less than a second.

Recommended answer

Lucene is an inverted full-text index. This means that it takes all the documents, splits them into words, and then builds an index that maps each word to the documents containing it. Because a lookup in that index is an exact string match with no ordering requirement, it can be extremely fast. Hypothetically, an unordered SQL index on a varchar field could be just as fast, and in fact I think you'll find that the big databases can do a simple string-equality query very quickly in that case.
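To make the idea concrete, here is a minimal sketch of an inverted index in Python. It is not how Lucene is implemented. The names `tokenize`, `build_inverted_index`, and `search` are made up for illustration, and the whitespace tokenizer is a deliberate simplification.

```python
from collections import defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    # A real analyzer does much more (punctuation, stemming, stop words).
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Each term is one exact-match dictionary lookup; no document is scanned.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Lucene is an inverted full-text index",
    2: "MySQL supports full-text search",
    3: "An inverted index maps words to documents",
}
index = build_inverted_index(docs)
print(search(index, "inverted index"))
```

The point of the sketch is that query cost depends on the number of query terms and the size of their posting sets, not on the total amount of text that was indexed.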

Lucene does not have to optimize for transaction processing. When you add a document, it need not ensure that queries see it instantly. And it need not optimize for updates to existing documents.
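As a rough illustration of that trade-off, the toy class below buffers added documents and only makes them searchable after an explicit commit step. The class `BufferedIndex` and its methods are hypothetical and are not Lucene's API; the sketch only shows why dropping the "visible immediately" guarantee lets writes be cheap and batched.

```python
from collections import defaultdict


class BufferedIndex:
    """Toy index: added documents become searchable only after commit()."""

    def __init__(self):
        self._searchable = defaultdict(set)  # term -> ids visible to queries
        self._pending = []                   # documents added but not yet visible

    def add(self, doc_id, text):
        # Cheap append; no index structure is touched yet.
        self._pending.append((doc_id, text))

    def commit(self):
        # Fold all pending documents into the searchable index in one batch.
        for doc_id, text in self._pending:
            for term in text.lower().split():
                self._searchable[term].add(doc_id)
        self._pending.clear()

    def search(self, term):
        # Return a copy so callers cannot mutate the index.
        return set(self._searchable.get(term.lower(), set()))


idx = BufferedIndex()
idx.add(1, "hello world")
before_commit = idx.search("hello")  # nothing is visible yet
idx.commit()
after_commit = idx.search("hello")   # the document is visible now
```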

However, at the end of the day, if you really want to know, you need to read the source. Both things you reference are open source, after all.
