Wildcard search or partial matching in Elasticsearch

Question

I am trying to give end users search-as-you-type behavior, which is more like what SQL Server offers. I was able to implement an ES query for the following SQL scenario:

 select * from table where name like '%pete%' and type != 'xyz' and type != 'abc'

But the ES query doesn't work for this SQL query:

 select * from table where name like '%peter tom%' and type != 'xyz' and type != 'abc'

In my Elasticsearch query, along with the wildcard query, I also need to apply a boolean filter:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "query": {
                "wildcard": {
                  "name": { "value": "*pete*" }
                }
              }
            }
          ],
          "must_not": [
            { "match": { "type": "xyz" } },
            { "match": { "type": "abc" } }
          ]
        }
      }
    }
  }
}

The above query with the wildcard search works fine and returns all documents that match pete and are not of type xyz or abc. But when I run the wildcard with two separate words separated by a space, the same query returns nothing. For example:

{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "should": [
            {
              "query": {
                "wildcard": {
                  "name": { "value": "*peter tom*" }
                }
              }
            }
          ],
          "must_not": [
            { "match": { "type": "xyz" } },
            { "match": { "type": "abc" } }
          ]
        }
      }
    }
  }
}

My mapping is as follows:

{
  "properties": {
    "name": {
      "type": "string"
    },
    "type": {
      "type": "string"
    }
  }
}

What query should I use to make wildcard search work for words separated by spaces?

Recommended answer

The most efficient solution involves leveraging an ngram tokenizer to tokenize portions of your name field. For instance, if you have a name like peter tomson, the ngram tokenizer will index tokens such as these (among others):


  • pe
  • pet
  • pete
  • peter
  • peter t
  • peter to
  • peter tom
  • peter toms
  • peter tomso
  • eter tomson
  • ter tomson
  • er tomson
  • r tomson
  • " tomson" (with a leading space)
  • tomson
  • omson
  • mson
  • son
  • on

So, once this has been indexed, searching for any of those tokens will retrieve your document with peter tomson in it.
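To make the token list above concrete, here is a minimal Python sketch of what an ngram tokenizer with min_gram 2 and max_gram 15 produces. It is only an approximation of the idea, not Elasticsearch's implementation, and the function name `ngrams` is made up for this illustration.

```python
def ngrams(text, min_gram=2, max_gram=15):
    """Return every substring of `text` with a length from min_gram to max_gram.

    Hypothetical helper: it mimics the idea behind an ngram tokenizer that
    keeps all characters (spaces included), not Elasticsearch's actual code.
    """
    tokens = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # no longer substring fits from this start position
            tokens.append(text[start:end])
    return tokens


tokens = ngrams("peter tomson")
print("peter tom" in tokens)  # True: the phrase, space included, is one token
print(len(tokens))            # 66 substrings of length 2 to 12
```

Because spaces are kept inside the tokens, a phrase such as "peter tom" is itself an indexed token, which is what makes the exact lookup possible.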

Let's create the index:

PUT likequery
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "2",
          "max_gram": "15"
        }
      }
    }
  },
  "mappings": {
    "typename": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "search": {
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            }
          }
        },
        "type": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
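The mapping above keeps name with the default analysis and adds name.search with the ngram analyzer. As a rough illustration of why the original wildcard query returned nothing on the default field, the sketch below assumes the default analyzer lowercases and splits on non-word characters, and that a wildcard pattern is tested against each indexed term on its own. The helper names are hypothetical.

```python
import re
from fnmatch import fnmatchcase


def standard_tokens(text):
    # Rough stand-in for the default analyzer (an assumption for this sketch):
    # lowercase, then split on anything that is not a word character.
    return [t for t in re.split(r"\W+", text.lower()) if t]


def wildcard_matches(pattern, terms):
    # The pattern is checked against each indexed term individually,
    # never against the original, unsplit string.
    return any(fnmatchcase(term, pattern) for term in terms)


name_terms = standard_tokens("peter tomson")
print(name_terms)                                   # ['peter', 'tomson']
print(wildcard_matches("*pete*", name_terms))       # True
print(wildcard_matches("*peter tom*", name_terms))  # False: no term has a space
```

No single term contains a space, so a pattern that spans two words can never match on the default field, whereas the ngram sub-field stores tokens that do contain the space.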

You'll then be able to search like this with a simple and very efficient term query:

POST likequery/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name.search": "peter tom"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "type": "xyz"
          }
        },
        {
          "match": {
            "type": "abc"
          }
        }
      ]
    }
  }
}
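The logic of the query above can be sketched in plain Python as an exact token lookup plus a type exclusion. The sample documents and the names `ngrams` and `search` are invented for illustration; the sketch treats the single should clause as required, which is how a bool query with no must clause behaves.

```python
def ngrams(text, min_gram=2, max_gram=15):
    # Hypothetical helper: all substrings with a length from min_gram to max_gram.
    return {
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    }


# Made-up sample documents, not data from the original question.
docs = [
    {"name": "peter tomson", "type": "def"},
    {"name": "peter tomson", "type": "xyz"},
    {"name": "tom peterson", "type": "def"},
]


def search(docs, term, excluded_types):
    hits = []
    for doc in docs:
        if doc["type"] in excluded_types:
            continue  # plays the role of the must_not clauses
        if term in ngrams(doc["name"]):
            hits.append(doc)  # exact token lookup, like the term query
    return hits


results = search(docs, "peter tom", {"xyz", "abc"})
print(results)  # [{'name': 'peter tomson', 'type': 'def'}]
```

Two consequences of this design follow directly from the settings above: a search term shorter than min_gram or longer than max_gram has no matching token, and since the analyzer shown has no lowercase filter and a term query is not analyzed, the lookup is case-sensitive.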
