Elasticsearch - Script Filter over a list of nested objects


Problem description

I am trying to figure out how to solve two problems that I have with my ES 5.6 index. The mapping is:

"mappings": {
    "my_test": {
        "properties": {
            "Employee": {
                "type": "nested",
                "properties": {
                    "Name": {
                        "type": "keyword",
                        "normalizer": "lowercase_normalizer"
                    },
                    "Surname": {
                        "type": "keyword",
                        "normalizer": "lowercase_normalizer"
                    }
                }
            }
        }
    }
}

I need to create two separate scripted filters:

1 - Filter documents where the size of the Employee array is == 3

2 - Filter documents where the first element of the array has "Name" == "John"

I tried to take some first steps, but I am unable to iterate over the list. I always get a null pointer exception.

{
  "bool": {
    "must": {
      "nested": {
        "path": "Employee",
        "query": {
          "bool": {
            "filter": [
              {
                "script": {
                  "script" :     """

                   int array_length = 0; 
                   for(int i = 0; i < params._source['Employee'].length; i++) 
                   {                              
                    array_length +=1; 
                   } 
                   if(array_length == 3)
                   {
                     return true
                   } else 
                   {
                     return false
                   }

                     """
                }
              }
            ]
          }
        }
      }
    }
  }
}

Recommended answer

As Val noticed, you can't access the _source of documents in script queries in recent versions of Elasticsearch. However, Elasticsearch does allow you to access _source in the "score context".

So a possible workaround (but you need to be careful about performance) is to use a scripted score combined with min_score in your query.

You can find an example of this behavior in this Stack Overflow post: Query documents by sum of nested field values in elasticsearch.

In your case, a query like this can do the job:

POST <your_index>/_search
{
  "min_score": 0.1,
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": """
              if (params["_source"]["Employee"].length == params.nbEmployee) {
                def firstEmployee = params._source["Employee"].get(0);
                if (firstEmployee.Name == params.name) {
                  return 1;
                } else {
                  return 0;
                }
              } else {
                return 0;
              }
""",
              "params": {
                "nbEmployee": 3,
                "name": "John"
              }
            }
          }
        }
      ]
    }
  }
}

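The number of employees and the first name should be passed in params, so that the script does not have to be recompiled for each use case.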
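
The question asked for two separate filters, so here is a minimal sketch of the first one on its own (array size only), using the same score-context approach. It assumes Employee is stored as an array in _source. The instanceof guard is an addition that is not part of the query above; it is meant to make documents without such an array score 0 instead of throwing. The triple-quoted script string is Kibana console syntax.

```json
POST <your_index>/_search
{
  "min_score": 0.1,
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "script_score": {
        "script": {
          "source": """
            // Assumption: Employee is an array in _source.
            // instanceof is false for null, so a missing field yields 0.
            def employees = params._source.Employee;
            return (employees instanceof List && employees.size() == params.nbEmployee) ? 1 : 0;
          """,
          "params": {
            "nbEmployee": 3
          }
        }
      }
    }
  }
}
```

Changing only nbEmployee in params reuses the same script source for a different array size.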

But remember that this can be very heavy on your cluster, as Val already mentioned. You should narrow the set of documents to which the script is applied by adding filters to the function_score query (match_all in my example). In any case, this is not the way Elasticsearch is meant to be used, and you can't expect great performance from such a hacked query.
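
One way to narrow the document set could look like the sketch below, which covers the second filter (first element has Name == "John"). It replaces match_all with a nested filter, so the script only runs on documents that contain some employee named John. Two assumptions are mine: that lowercase_normalizer lets a match query on "John" find the indexed value, and that "boost_mode": "replace" is needed because a filter-only bool query scores 0, which under the default multiply mode would leave every document below min_score.

```json
POST <your_index>/_search
{
  "min_score": 0.1,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "filter": {
            "nested": {
              "path": "Employee",
              "query": {
                "match": { "Employee.Name": "John" }
              }
            }
          }
        }
      },
      "boost_mode": "replace",
      "script_score": {
        "script": {
          "source": """
            // Runs only on documents that passed the nested filter above.
            // _source keeps the original casing, so this comparison is case-sensitive.
            def employees = params._source.Employee;
            if (employees instanceof List && employees.size() > 0) {
              return employees.get(0).Name == params.name ? 1 : 0;
            }
            return 0;
          """,
          "params": {
            "name": "John"
          }
        }
      }
    }
  }
}
```

The pre-filter only reduces how many documents the script touches. The position check still happens in the script, because a nested query cannot tell which array element matched.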
