Elasticsearch nested sorting


Question

I'm trying to do nested sorting in Elasticsearch, but so far I haven't succeeded.

My data structure:

{ "_id" : 1,
"authorList" : [
  {"lastName":"hawking", "firstName":"stephan"},
  {"lastName":"frey", "firstName":"richard"}
]
}

{ "_id" : 2,
"authorList" : [
  {"lastName":"roger", "firstName":"christina"},
  {"lastName":"freud", "firstName":"damian"}
]
}

I want to sort the documents by the last name of the first author in each document.

Mapping used:

"authorList" : { "type" : "nested", "properties" : {"lastName":{"type":"keyword"}}}

Sort using SearchRequestBuilder (Java):

    searchRequestBuilder.addSort(
        SortBuilders.fieldSort("authorList.lastName")
            .order(SortOrder.ASC)
            .sortMode(SortMode.MIN)
            .setNestedPath("authorList")
    );

This runs, but it doesn't give the result I want (e.g. "hawking" first, then "roger").

Did I miss something? Is there a way to tell Elasticsearch to access index 0 of the authorList array? Is there a mapping or normalizer that indexes the first entry of the array separately?

Recommended answer

Nested documents are not saved as a simple array or list. They are managed internally by Elasticsearch:


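Elasticsearch is still fundamentally flat, but it manages the nested relation internally to give the appearance of a nested hierarchy. When you create a nested document, Elasticsearch actually indexes two separate documents (the root object and the nested object) and then relates the two internally. (more here)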

I think you need to give Elasticsearch some additional information that indicates which author is the "primary/first" one. It is enough to put this additional field on only one author in each nested list (your mapping can stay as before), something like this:

{
    "authorList" : [
      {"lastName":"roger", "firstName":"christina", "authorOrder": 1},
      {"lastName":"freud", "firstName":"damian"}
    ]
},
{
    "authorList" : [
      {"lastName":"hawking", "firstName":"stephan", "authorOrder": 1},
      {"lastName":"adams", "firstName":"mark"},
      {"lastName":"frey", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"adams", "firstName":"monica", "authorOrder": 1},
      {"lastName":"adams", "firstName":"richard"}
    ]
}
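
One way to produce documents shaped like the ones above is to tag the first author just before indexing. The following is a minimal sketch in plain Java with no Elasticsearch client involved; the class name AuthorOrderTagger and the method tagFirstAuthor are hypothetical names introduced here for illustration, and the sketch assumes each author is held as a Map of field names to values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any Elasticsearch API.
final class AuthorOrderTagger {

    // Returns a copy of the author list in which only the first entry carries
    // "authorOrder": 1. The input list and its maps are left unmodified.
    static List<Map<String, Object>> tagFirstAuthor(List<Map<String, Object>> authors) {
        List<Map<String, Object>> tagged = new ArrayList<>();
        for (int i = 0; i < authors.size(); i++) {
            Map<String, Object> copy = new LinkedHashMap<>(authors.get(i));
            if (i == 0) {
                copy.put("authorOrder", 1);
            }
            tagged.add(copy);
        }
        return tagged;
    }

    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("lastName", "hawking");
        first.put("firstName", "stephan");

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("lastName", "frey");
        second.put("firstName", "richard");

        // Only the first map in the printed list gains the authorOrder field.
        System.out.println(tagFirstAuthor(Arrays.asList(first, second)));
    }
}
```

The tagged list would then be placed under "authorList" in the document source that is sent to the index.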

Then the query could be:

{
  "query" : {
    "nested" : {
      "query" : {
        "bool" : {
          "must" : [
            {
              "match" : {
                "authorList.authorOrder" : 1
              }
            }
          ]
        }
      },
      "path" : "authorList"
    }
  },
  "sort" : [
    {
      "authorList.lastName" : {
        "order" : "asc",
        "nested_filter" : {
          "bool" : {
            "must" : [
              {
                "match" : {
                  "authorList.authorOrder" : 1
                }
              }
            ]
          }
        },
        "nested_path" : "authorList"
      }
    }
  ]
}

And with the Java API:

QueryBuilder matchFirst = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("authorList.authorOrder", 1));
QueryBuilder mainQuery = QueryBuilders.nestedQuery("authorList", matchFirst, ScoreMode.None);

SortBuilder sb = SortBuilders.fieldSort("authorList.lastName")
    .order(SortOrder.ASC)
    .setNestedPath("authorList")
    .setNestedFilter(matchFirst);

SearchRequestBuilder builder = client.prepareSearch("test")
        .setSize(50)
        .setQuery(mainQuery)
        .addSort(sb);

Note that the SortBuilder has .setNestedFilter(matchFirst), which means that sorting is based on the authorList.lastName field, but only for your "primary/first" nested elements. Without it, Elasticsearch would consider all nested documents of each parent, pick the smallest lastName among them (for ascending order), and sort the parent documents by that value. The document whose first author is "hawking" could then come first, because it also contains an author with the last name "adams".
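
To make the effect of the nested filter concrete, here is a small plain-Java simulation of the rule described above. It is only a sketch and does not call Elasticsearch: the classes Author and Doc, the labels, and the methods sortKey, sortedLabels and sampleDocs are all hypothetical names made up for this illustration. Ties are kept in input order here, which is an assumption of the sketch and not a statement about how Elasticsearch breaks ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java simulation of an ascending nested sort; hypothetical names throughout.
final class NestedSortSketch {

    // Stand-in for one nested author object.
    static final class Author {
        final String lastName;
        final Integer authorOrder; // null when the field is absent

        Author(String lastName, Integer authorOrder) {
            this.lastName = lastName;
            this.authorOrder = authorOrder;
        }
    }

    // Stand-in for a parent document; "label" exists only for printing.
    static final class Doc {
        final String label;
        final List<Author> authorList;

        Doc(String label, Author... authors) {
            this.label = label;
            this.authorList = Arrays.asList(authors);
        }
    }

    // Sort key of one parent: the smallest lastName among the nested authors
    // that pass the filter, or null if none pass.
    static String sortKey(Doc doc, Predicate<Author> nestedFilter) {
        return doc.authorList.stream()
                .filter(nestedFilter)
                .map(a -> a.lastName)
                .min(Comparator.<String>naturalOrder())
                .orElse(null);
    }

    // Labels of the parents in ascending sort-key order. Parents without a
    // matching nested author go last; ties keep their input order.
    static List<String> sortedLabels(List<Doc> docs, Predicate<Author> nestedFilter) {
        List<Doc> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparing(
                (Doc d) -> sortKey(d, nestedFilter),
                Comparator.nullsLast(Comparator.<String>naturalOrder())));
        List<String> labels = new ArrayList<>();
        for (Doc d : sorted) {
            labels.add(d.label);
        }
        return labels;
    }

    // The three sample documents from the answer, labelled by their first author.
    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc("roger", new Author("roger", 1), new Author("freud", null)),
                new Doc("hawking", new Author("hawking", 1), new Author("adams", null),
                        new Author("frey", null)),
                new Doc("adams", new Author("adams", 1), new Author("adams", null)));
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        // Without a filter every nested author competes for the sort key.
        System.out.println("no filter:    " + sortedLabels(docs, a -> true));
        // With the filter only the author tagged authorOrder == 1 is considered.
        System.out.println("first author: "
                + sortedLabels(docs, a -> Integer.valueOf(1).equals(a.authorOrder)));
    }
}
```

In this simulation the unfiltered sort places the "hawking" document ahead of "roger" only because of its co-author "adams", while the filtered sort orders the parents strictly by the tagged first author.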

The final result is:

{
    "authorList" : [
      {"lastName":"adams", "firstName":"monica", "authorOrder": 1},
      {"lastName":"adams", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"hawking", "firstName":"stephan", "authorOrder": 1},
      {"lastName":"adams", "firstName":"mark"},
      {"lastName":"frey", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"roger", "firstName":"christina", "authorOrder": 1},
      {"lastName":"freud", "firstName":"damian"}
    ]
}
