Mongoose text-search with partial string


Problem description

Hi, I'm using Mongoose to search for persons in my collection.

/*Person model*/
{
    name: {
       first: String,
       last: String
    }
}

Now I want to search for persons with a query:

let regex = new RegExp(QUERY,'i');

Person.find({
   $or: [
      {'name.first': regex},
      {'name.last': regex}
   ]
}).exec(function(err,persons){
  console.log(persons);
});

If I search for John, I get results (even if I search for just Jo). But if I search for John Doe, I obviously get no results, because neither field on its own contains the whole string.

If I change QUERY to John|Doe, I get results, but it returns all persons who have either John or Doe in their first or last name.
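The behaviour described above can be reproduced without a database. The following is only a sketch: the sample data and the `searchPerField` helper are hypothetical, and the function mimics the `$or` query on `name.first` and `name.last` in memory rather than calling Mongoose.

```javascript
// Hypothetical sample data standing in for the Person collection.
const people = [
  { name: { first: 'John', last: 'Doe' } },
  { name: { first: 'John', last: 'Smith' } },
  { name: { first: 'Jane', last: 'Doe' } },
];

// Mimics { $or: [{ 'name.first': regex }, { 'name.last': regex }] } in memory.
function searchPerField(query) {
  const regex = new RegExp(query, 'i');
  return people.filter(
    (p) => regex.test(p.name.first) || regex.test(p.name.last)
  );
}

console.log(searchPerField('Jo').length);       // 2: both Johns match on the first name
console.log(searchPerField('John Doe').length); // 0: no single field contains "John Doe"
console.log(searchPerField('John|Doe').length); // 3: anyone with John OR Doe in either field
```

Each field is tested separately, which is why a query spanning both fields never matches and an alternation matches too much.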

The next thing I tried was Mongoose text search:

First, add the fields to a text index:

PersonSchema.index({
    name: {
        first: 'text',
        last: 'text'
    }
}, {
    name: 'Personsearch index',
    weights: {
        name: {
            first: 10,
            last: 10
        }
    }
});

Then modify the Person query:

Person.find({ 
    $text : { 
        $search : QUERY
    } 
},
{ score:{$meta:'textScore'} })
.sort({ score : { $meta : 'textScore' } })
.exec(function(err,persons){
    console.log(persons);
});

This works just fine! But now it only returns persons whose first or last name matches a whole word of the query:

-> John returns value

-> Jo returns no value

Is there a way to solve this?

Answers without external plugins are preferred, but others are welcome too.

Solution

You can do this with an aggregate pipeline that concatenates the first and last names together using $concat and then searches against that:

let regex = new RegExp(QUERY,'i');

Person.aggregate([
    // Project the concatenated full name along with the original doc
    {$project: {fullname: {$concat: ['$name.first', ' ', '$name.last']}, doc: '$$ROOT'}},
    {$match: {fullname: regex}}
], function(err, persons) {
    // Extract the original doc from each item
    persons = persons.map(function(item) { return item.doc; });
    console.log(persons);
});

Performance is a concern, however, as this can't use an index so it will require a full collection scan.

You can mitigate that by preceding the $project stage with a $match query that can use an index to reduce the set of docs the rest of the pipeline needs to look at.

So if you separately index name.first and name.last and then take the first word of your search string as an anchored query (e.g. /^John/i), you could prepend the following to the beginning of your pipeline:

{$match: {$or: [
  {'name.first': /^John/i},
  {'name.last': /^John/i}
]}}

Obviously you'd need to programmatically generate that "first word" regex, but hopefully it gives you the idea.
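One way to generate that regex and assemble the whole pipeline might look like the sketch below. The helper names `escapeRegExp` and `buildNameSearchPipeline` are hypothetical and not part of Mongoose. Unlike the snippets above, this version escapes the user input so that regex metacharacters in the query are matched literally; that is an added assumption about the desired behaviour.

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds the aggregate pipeline for a full-name search.
function buildNameSearchPipeline(query) {
  const trimmed = query.trim();
  const firstWord = trimmed.split(/\s+/)[0];

  // Anchored regex on the first word, used by the index-friendly pre-filter.
  const prefix = new RegExp('^' + escapeRegExp(firstWord), 'i');
  // Unanchored regex on the whole query, applied to the concatenated name.
  const full = new RegExp(escapeRegExp(trimmed), 'i');

  return [
    // Narrow the candidate set before the unindexed stages run.
    { $match: { $or: [{ 'name.first': prefix }, { 'name.last': prefix }] } },
    // Project the concatenated full name along with the original doc.
    { $project: { fullname: { $concat: ['$name.first', ' ', '$name.last'] }, doc: '$$ROOT' } },
    { $match: { fullname: full } },
  ];
}

const pipeline = buildNameSearchPipeline('John D');
console.log(pipeline[0].$match.$or[0]['name.first']); // /^John/i
```

The resulting array can be passed to Person.aggregate(...) in place of the literal pipeline shown earlier. Note that the pre-filter narrows the results: the first word of the query must now start a first or last name, whereas the unanchored regex alone would match anywhere in the full name. Also, MongoDB's documentation cautions that case-insensitive regex queries generally cannot use indexes as effectively as case-sensitive prefix expressions, so the gain is worth measuring on real data.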
