Elasticsearch: match an array of strings

Question

My Elasticsearch (v5.4.1) documents have a _patents field like this:

{
    // (Other fields : title, text, date, etc.)
    ,
    "_patents": [
        {"cc": "US"},
        {"cc": "MX"},
        {"cc": "KR"},
        {"cc": "JP"},
        {"cc": "CN"},
        {"cc": "CA"},
        {"cc": "AU"},
        {"cc": "AR"}
    ]
}

I'm trying to build a query that returns only documents whose patents match every code in an array of country codes. For instance, if my filter is ["US","AU"], I need all documents that have patents in both US and AU. Documents that have US but not AU should be excluded.

So far I have tried adding a "terms" clause to my current working query:

{
    "query": {
        "bool": {
            "must": [
                // (Other conditions here : title match, text match, date range, etc.) These work
                 ,
                {
                    "terms": {
                        "_patents.cc": [ // I tried just "_patents"
                            "US",
                            "AU"
                        ]
                    }
                }
            ]
        }
    }
}

Or as a filter:

{
    "query": {
        "bool": {
            "must": [...],
            "filter": {
                "terms": {
                    "_patents": [
                        "US",
                        "AU"
                    ]
                }
            }
        }
    }
}

These queries, and the variants I've tried, don't produce an error, but they return 0 results.

I can't change my ES document model to something easier to match, like "_patents": ["US", "CA", "AU", "CN", "JP"], because this is a populated field. At indexing time, I populate and reference Patent documents that have many fields, including cc.

Answer

I found the solution: the country codes in the filter have to be lowercase.

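"US" returns no results, but "us" works, even though the indexed value is "US" ... -_-'

The likely cause: when a string field is mapped as analyzed text, the standard analyzer lowercases tokens at index time, whereas term and terms queries look up the exact value without analysis, so "US" never matches the indexed token "us".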

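I also rewrote the query like this, with one term clause per country code under must, so that every code is required (a single terms clause matches documents containing any of the listed values):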

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_patents.cc": "us"
          }
        },
        {
          "term": {
            "_patents.cc": "ca"
          }
        }
      ]
    }
  }
}  
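
For a filter of arbitrary length, the same query can be generated rather than written by hand. The following is a minimal Python sketch, not part of the original answer: the function name build_patents_query and its field parameter are hypothetical, and it assumes the field is analyzed with a lowercasing analyzer, as the behaviour described above suggests.

```python
import json


def build_patents_query(country_codes, field="_patents.cc"):
    """Build a bool query that requires every given country code.

    Hypothetical helper for illustration; `field` defaults to the
    field name used in the question.
    """
    # One term clause per code: clauses under "must" are ANDed together,
    # so a document has to contain all of the codes.
    # Codes are lowercased on the assumption that the indexed tokens
    # were lowercased by the analyzer, while term queries are not analyzed.
    must = [{"term": {field: code.lower()}} for code in country_codes]
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    # Print the request body for the ["US", "AU"] filter from the question.
    print(json.dumps(build_patents_query(["US", "AU"]), indent=2))
```

The resulting dictionary can be serialized and sent as the body of a search request. If the index relies on the 5.x default dynamic mapping for strings, the original uppercase values should also be matchable on the _patents.cc.keyword sub-field without lowercasing, but that depends on the actual mapping, which the post does not show.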
