MongoDB full text search - matching words and exact phrases


Problem description



I'm currently having some issues with the full text search functionality in MongoDB, specifically when trying to match exact phrases.

I'm testing out the functionality in the mongo shell, but ultimately I'll be using Spring Data MongoDB with Java.

I first tried running this command to search for the words "delay" and "late" and the phrase "on time":

db.mycollection.find( { $text: { $search: "delay late \"on time\"" } }).explain(true);

The resulting explain output showed:

"parsedTextQuery" : {
    "terms" : [
        "delay",
        "late",
        "time"
    ],
    "negatedTerms" : [ ],
    "phrases" : [
        "on time"
    ],
    "negatedPhrases" : [ ]
},

The issue here is that I don't want to search for the word "time", but rather for the phrase "on time". I do want to search for "delay" and "late", and ideally I don't want to disable stemming.

I tried a few different permutations, for example:

db.mycollection.find( { $text: { $search: "delay late \"'on time'\"" } }).explain(true);

db.mycollection.find( { $text: { $search: "delay late \"on\" \"time\"" } }).explain(true);

But I couldn't seem to get the right results, and I can't see anything obvious in the documentation about this.

For my purposes, should I use full text search for the individual words and regex search for the phrases?

Currently working with MongoDB version 2.6.5. Thanks.

Solution

Did you actually run the text search to check whether it behaves incorrectly? It works as expected for me on MongoDB 2.6.7:

> db.test.drop()
> db.test.insert({ "t" : "I'm on time, not late or delayed" })
> db.test.insert({ "t" : "I'm either late or delayed" })
> db.test.insert({ "t" : "Time flies like a banana" })
> db.test.ensureIndex({ "t" : "text" })

> db.test.find({ "$text" : { "$search" : "time late delay" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
{ "t" : "Time flies like a banana" }
{ "t" : "I'm either late or delayed" }

> db.test.find({ "$text" : { "$search" : "late delay" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
{ "t" : "I'm either late or delayed" }

> db.test.find({ "$text" : { "$search" : "late delay \"on time\"" } }, { "_id" : 0 })
{ "t" : "I'm on time, not late or delayed" }
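The search string in the last query mixes plain terms with a phrase wrapped in escaped double quotes. As a rough sketch of how such a string could be assembled programmatically, the helper below builds the filter document from separate lists of terms and phrases. The name `buildTextFilter` is hypothetical and not part of the mongo shell or any driver, and dropping embedded double quotes from phrases is an assumption made here to keep the example simple.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text filter document
// from plain terms and exact phrases.
function buildTextFilter(terms, phrases) {
  // Wrap each phrase in double quotes so $text treats it as an exact phrase.
  // Assumption: embedded double quotes are simply dropped, because they
  // would otherwise end the phrase early.
  var quoted = phrases.map(function (phrase) {
    return '"' + phrase.replace(/"/g, "") + '"';
  });
  return { $text: { $search: terms.concat(quoted).join(" ") } };
}

var filter = buildTextFilter(["delay", "late"], ["on time"]);
console.log(JSON.stringify(filter));
```

In the shell session above, the resulting object could be passed as the first argument, for example `db.test.find(filter, { _id: 0 })`.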

Why is "time" in the terms array in the explain output? Because if the phrase "on time" occurs in a document, the term "time" must occur in it as well. MongoDB uses the text index as far as it can to locate candidate documents, and then checks those candidates to see which ones actually match the full phrase and not just the terms inside it.
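The two-step behaviour described above can be imitated with a small toy model. This is only an illustration of the idea and not MongoDB's actual implementation: it does no stemming, it works on an in-memory array instead of an index, and the one-entry stop-word list (`on`) is an assumption chosen so that the derived terms line up with the explain output in the question. The names `toySearch`, `tokenize` and `STOP_WORDS` are made up for this sketch.

```javascript
// Toy model only: "look up terms first, then verify phrases".
// Assumption: "on" is treated as a stop word, so it never becomes a term.
var STOP_WORDS = ["on"];

// Split text into lowercase alphabetic words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

function toySearch(docs, terms, phrases) {
  // Words inside phrases (minus stop words) are added to the plain terms,
  // which is why "time" appears in the terms list.
  var allTerms = terms.slice();
  phrases.forEach(function (phrase) {
    tokenize(phrase).forEach(function (word) {
      if (STOP_WORDS.indexOf(word) === -1 && allTerms.indexOf(word) === -1) {
        allTerms.push(word);
      }
    });
  });

  // Step 1: candidates are documents containing at least one term.
  var candidates = docs.filter(function (doc) {
    var words = tokenize(doc);
    return allTerms.some(function (term) {
      return words.indexOf(term) !== -1;
    });
  });

  // Step 2: keep only candidates that contain every phrase literally.
  var matches = candidates.filter(function (doc) {
    var lower = doc.toLowerCase();
    return phrases.every(function (phrase) {
      return lower.indexOf(phrase.toLowerCase()) !== -1;
    });
  });

  return { terms: allTerms, candidates: candidates, matches: matches };
}

var docs = [
  "I'm on time, not late or delayed",
  "I'm either late or delayed",
  "Time flies like a banana"
];
var result = toySearch(docs, ["delay", "late"], ["on time"]);
console.log(result.terms);
console.log(result.matches);
```

With the three sample documents, all of them pass the first step because each contains "late" or "time", but only the first one survives the phrase check, which is consistent with the single result of the last query in the shell session.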
