Elasticsearch multi-field query


Problem description

I'm having trouble framing an address search query in Elasticsearch.

The address is stored in ES with the following structure:
Address { street, city, zipcode }

Here is a sample query, followed by the hits it returns:

GET /adr-address/_search
{   
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "most_fields",
      "fields":      [ "street", "city", "zipcode"]
    }
  }
}

"hits": [
 {
      "_source": {
       "id": "S6v4xyO8UE5NRcWtmMATPQ==",
       "street": "Houston 2nd Avenue",
       "zipcode": "8032",
       "city": "Houston"
    }
 },
 {
    "_source": {
       "id": "aLgQFrO8zCT8m88lAnYZPQ==",
       "street": "Houston 1st Avenue",
       "zipcode": "8044",
       "city": "Houston"
    }
 },
 {
    "_source": {
       "id": "aLgQFrO8zCT8m88lAnYZPQ==",
       "street": "mainstreet",
       "zipcode": "8044",
       "city": "Houston"
    }
 },

The multi_match query works fine most of the time, except when the street also contains the city name. Elasticsearch ranks those results higher, which is understandable but not acceptable.

Here is the _validate/query?explain result:

GET /adr-address/_validate/query?explain
{
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "most_fields",
      "fields":      [ "street", "city", "zipcode" ]
    }
  }
}

{
   "valid": true,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "explanations": [
      {
         "index": "adr-address",
         "valid": true,
         "explanation": "(zipcode:mainstreet zipcode:houston) (street:mainstreet street:houston) (city:mainstreet city:houston)"
      }
   ]
}
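One way to read the explanation string above is as one group of term clauses per field: every query term is tried against every field. The following is a minimal sketch of that expansion, not Elasticsearch's own code. The tokenizer (lowercase, split on anything that is not a letter or digit) and the function name expand_most_fields are assumptions made for illustration.

```python
import re


def expand_most_fields(query, fields):
    """Build a field-centric clause string: one parenthesised group per field."""
    # Assumed tokenizer: lowercase, split on any non-alphanumeric character.
    terms = re.findall(r"[a-z0-9]+", query.lower())
    groups = []
    for field in fields:
        clauses = " ".join(f"{field}:{term}" for term in terms)
        groups.append(f"({clauses})")
    return " ".join(groups)


# Field order copied from the explanation output in the question.
explanation = expand_most_fields("mainstreet, houston", ["zipcode", "street", "city"])
print(explanation)
```

Because each field gets its own group, a document with "houston" in both street and city satisfies two groups even though it never matches "mainstreet".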

Note that the Google Maps API returns accurate results for the same query.

Assumptions/conditions so far:

  1. Tokenizers are: whitespace, comma, digits, etc.
  2. The input terms can contain a multi-word street name, zipcode, or city, in any order

Any suggestions on how I could improve the search results?

Recommended answer

Try using cross_fields instead of most_fields as the type for the multi_match.

From the documentation:

The cross_fields type is particularly useful with structured documents where multiple fields should match. For instance, when querying the first_name and last_name fields for "Will Smith", the best match is likely to have "Will" in one field and "Smith" in the other.

The most_fields type that you are using, by contrast, seems to be meant for searching the same text analyzed in different ways.
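To make the difference concrete, here is a toy comparison using two of the hits from the question. It is only a sketch: it counts matches instead of computing real relevance scores, and the tokenizer and both function names are assumptions made for illustration, not Elasticsearch internals.

```python
import re


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def field_centric_matches(query, doc, fields):
    """Count matching (field, term) pairs, roughly how most_fields sees a document."""
    terms = set(tokenize(query))
    return sum(1 for f in fields for t in terms if t in tokenize(doc[f]))


def term_centric_matches(query, doc, fields):
    """Count query terms found in at least one field, roughly the cross_fields view."""
    all_tokens = {tok for f in fields for tok in tokenize(doc[f])}
    return sum(1 for t in set(tokenize(query)) if t in all_tokens)


FIELDS = ["street", "city", "zipcode"]
QUERY = "mainstreet, houston"

# Two documents copied from the hits shown in the question.
houston_avenue = {"street": "Houston 2nd Avenue", "city": "Houston", "zipcode": "8032"}
mainstreet = {"street": "mainstreet", "city": "Houston", "zipcode": "8044"}

for name, doc in [("Houston 2nd Avenue", houston_avenue), ("mainstreet", mainstreet)]:
    print(name,
          field_centric_matches(QUERY, doc, FIELDS),
          term_centric_matches(QUERY, doc, FIELDS))
```

In the field-centric count both documents tie, so their order is left to per-field scoring details. In the term-centric count only the mainstreet document covers both query terms.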

Example query:

GET /adr-address/_search
{   
  "query": {
    "multi_match": {
      "query":       "mainstreet, houston",
      "type":        "cross_fields",
      "fields":      [ "street", "city", "zipcode"]
    }
  }
}
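If the query is assembled in application code, the same request body can be built as a plain dictionary. This is a minimal sketch: the helper name build_address_query is hypothetical, and the optional operator argument is an addition not mentioned in the answer. multi_match does accept an operator parameter, and with cross_fields a value of "and" requires every term to appear in at least one of the fields, but whether that is desirable here is an assumption to verify against your data.

```python
import json


def build_address_query(text, fields, operator=None):
    """Return a search body using multi_match with type cross_fields."""
    multi_match = {
        "query": text,
        "type": "cross_fields",
        "fields": list(fields),
    }
    # Optional and not part of the original answer: "and" makes every term mandatory.
    if operator is not None:
        multi_match["operator"] = operator
    return {"query": {"multi_match": multi_match}}


body = build_address_query("mainstreet, houston", ["street", "city", "zipcode"])
strict_body = build_address_query("mainstreet, houston", ["street", "city", "zipcode"], operator="and")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET /adr-address/_search, either from the console or through whichever HTTP or Elasticsearch client you already use.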

Link to the documentation
