ElasticSearch Filter by sum of nested documents

Question

I am trying to filter products where the sum of a property across the filtered nested objects falls within a given range.

I have the following mapping:

{
  "product": {
    "properties": {
      "warehouses": {
        "type": "nested",
        "properties": {
          "stock_level": {
            "type": "integer"
          }
        }
      }
    }
  }
}

Example data:

{
  "id": 1,
  "warehouses": [
    {
      "id": 2001,
      "stock_level": 5
    },
    {
      "id": 2002,
      "stock_level": 0
    },
    {
      "id": 2003,
      "stock_level": 2
    }
  ]
}

In ElasticSearch 5.6 I used to do this:

GET products/_search
{
  "query": {
    "bool": {
      "filter": [
        [
          {
            "script": {
              "script": {
                "source": """
int total = 0;
for (def warehouse: params['_source']['warehouses']) {
  if (params.warehouse_ids == null || params.warehouse_ids.contains(warehouse.id)) {
    total += warehouse.stock_level;
  }
}
boolean gte = true;
boolean lte = true;
if (params.gte != null) {
  gte = (total >= params.gte);
}
if (params.lte != null) {
  lte = (total <= params.lte);
}
return (gte && lte);

""",
                "lang": "painless",
                "params": {
                  "gte": 4
                }
              }
            }
          }
        ]
      ]
    }
  }
}
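
For readers less familiar with Painless, the check this script performs can be sketched outside Elasticsearch in plain Python. This is only an illustration of the logic, not something Elasticsearch runs; the function name stock_in_range is made up here, and the data is the example document above.

```python
def stock_in_range(product, warehouse_ids=None, gte=None, lte=None):
    """Mirror of the Painless filter: sum stock over the selected warehouses
    and test the total against optional lower/upper bounds."""
    total = sum(
        w["stock_level"]
        for w in product["warehouses"]
        # No warehouse_ids means every warehouse counts, as in the script.
        if warehouse_ids is None or w["id"] in warehouse_ids
    )
    if gte is not None and total < gte:
        return False
    if lte is not None and total > lte:
        return False
    return True


product = {
    "id": 1,
    "warehouses": [
        {"id": 2001, "stock_level": 5},
        {"id": 2002, "stock_level": 0},
        {"id": 2003, "stock_level": 2},
    ],
}

print(stock_in_range(product, gte=4))                               # True: 5 + 0 + 2 = 7
print(stock_in_range(product, warehouse_ids=[2002, 2003], gte=4))   # False: 0 + 2 = 2
```

With "gte": 4 and no warehouse_ids, the example product passes the filter because its total stock is 7.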

The problem is that params['_source']['warehouses'] no longer works in ES 6.8, and I am unable to find a way to access nested documents in the script.

I have tried:

  • doc['warehouses'] - returns an error ("No field found for [warehouses] in mapping with types []")
  • ctx._source.warehouses - "Variable [ctx] is not defined."

I have also tried to use scripted_field, but it seems that scripted fields are calculated at the very last stage and are not available during the query.

I also sort by the same logic (sorting products by the sum of stock in the given warehouses), and that works like a charm:

  "sort": {
    "warehouses.stock_level": {
      "order": "desc",
      "mode": "sum",
      "nested": {
        "path": "warehouses",
        "filter": {
           "terms": {
             "warehouses.id": [2001, 2003]
           }
        }
      }
    }
  }
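
To make explicit what this sort is expected to compute, here is a small Python sketch, again run outside Elasticsearch. The helper name sort_key and the second product are hypothetical and exist only for illustration; the first product is the example document from the question.

```python
def sort_key(product, warehouse_ids):
    """Sum of stock_level over the nested warehouses matching the filter,
    which is what "mode": "sum" with a nested filter is meant to yield."""
    return sum(
        w["stock_level"]
        for w in product["warehouses"]
        if w["id"] in warehouse_ids
    )


products = [
    {"id": 1, "warehouses": [
        {"id": 2001, "stock_level": 5},
        {"id": 2002, "stock_level": 0},
        {"id": 2003, "stock_level": 2},
    ]},
    # Hypothetical second product, not part of the original question.
    {"id": 2, "warehouses": [
        {"id": 2001, "stock_level": 1},
        {"id": 2003, "stock_level": 9},
    ]},
]

selected = {2001, 2003}
ranked = sorted(products, key=lambda p: sort_key(p, selected), reverse=True)
print([(p["id"], sort_key(p, selected)) for p in ranked])  # [(2, 10), (1, 7)]
```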

But I can't find a way to access this sort value either :(

Any ideas on how I can achieve this? Thanks.

Recommended answer

I recently had the same issue. It turns out the change occurred somewhere around 6.4 during a refactoring. While accessing _source is strongly discouraged, it looks like people are still using it or want to use it.

Here's a workaround that takes advantage of the include_in_root parameter.

  1. Adjust the mapping (add include_in_root to the nested field)

PUT product
{
  "mappings": {
    "properties": {
      "warehouses": {
        "type": "nested",
        "include_in_root": true,
        "properties": {
          "stock_level": {
            "type": "integer"
          }
        }
      }
    }
  }
}

  2. Drop the index and reindex
  3. Reconstruct the individual warehouse items in a for loop while accessing the flattened values:

GET product/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "script": {
            "script": {
              "source": """
                  int total = 0;
                  
                  def ids = doc['warehouses.id'];
                  def levels = doc['warehouses.stock_level'];
                  
                  for (def i = 0; i <  ids.length; i++) {
                    def warehouse = ['id':ids[i], 'stock_level':levels[i]];
                    
                    if (params.warehouse_ids == null || params.warehouse_ids.contains(warehouse.id)) {
                      total += warehouse.stock_level;
                    }
                  }
                  
                  boolean gte = true;
                  boolean lte = true;
                  if (params.gte != null) {
                    gte = (total >= params.gte);
                  }
                  if (params.lte != null) {
                    lte = (total <= params.lte);
                  }
                  return (gte && lte);
              """,
              "lang": "painless",
              "params": {
                  "gte": 4
              }
            }
          }
        }
      ]
    }
  }
}

Be aware that this approach assumes that all warehouses include a non-null id and stock level.
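
One more thing worth checking before relying on the index-based pairing: to the best of my knowledge, multi-valued numeric doc values are exposed to scripts in sorted order per field, not in _source order. If that holds for your version, ids[i] and levels[i] need not belong to the same warehouse. The Python sketch below simulates that assumption outside Elasticsearch; flattened_doc_values and total_by_index are made-up names, and the sorting behaviour is an assumption to verify against your cluster.

```python
def flattened_doc_values(product):
    # ASSUMPTION: each flattened field is returned sorted ascending,
    # independently of the other field.
    ids = sorted(w["id"] for w in product["warehouses"])
    levels = sorted(w["stock_level"] for w in product["warehouses"])
    return ids, levels


def total_by_index(ids, levels, warehouse_ids=None):
    """Same loop as the Painless script: pair ids[i] with levels[i]."""
    total = 0
    for i in range(len(ids)):
        if warehouse_ids is None or ids[i] in warehouse_ids:
            total += levels[i]
    return total


product = {
    "id": 1,
    "warehouses": [
        {"id": 2001, "stock_level": 5},
        {"id": 2002, "stock_level": 0},
        {"id": 2003, "stock_level": 2},
    ],
}

ids, levels = flattened_doc_values(product)
print(ids, levels)                          # [2001, 2002, 2003] [0, 2, 5]
print(total_by_index(ids, levels))          # 7, the overall sum is unaffected
print(total_by_index(ids, levels, {2001}))  # 0, although warehouse 2001 holds 5
```

Under this assumption the script is safe when no warehouse_ids are passed, since only the overall sum matters, but a per-warehouse filter can pick up the wrong stock levels.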
