Elasticsearch nested sorting


Question

I'm trying to do nested sorting in Elasticsearch, but so far I haven't succeeded.

My data structure:

{ "_id" : 1,
"authorList" : [
  {"lastName":"hawking", "firstName":"stephan"},
  {"lastName":"frey", "firstName":"richard"}
]
}

{ "_id" : 2,
"authorList" : [
  {"lastName":"roger", "firstName":"christina"},
  {"lastName":"freud", "firstName":"damian"}
]
}

I want to sort the documents by the last name of the first author in each document.

Mapping used:

"authorList" : { "type" : "nested", "properties" : {"lastName":{"type":"keyword"}}}

Sort using SearchRequestBuilder (Java):

searchRequestBuilder.addSort(
    SortBuilders.fieldSort("authorList.lastName")
        .order(SortOrder.ASC)
        .sortMode(SortMode.MIN)
        .setNestedPath("authorList")
);

This runs, but it doesn't give the wanted result (e.g. first "hawking", then "roger").

Did I miss something? Is there a way to tell Elasticsearch to access index 0 of the authorList array? Is there a mapping or normalizer that indexes the first entry of the array separately?

Recommended answer

Nested documents are not saved as a simple array or list. They are managed internally by Elasticsearch:

Elasticsearch is still fundamentally flat, but it manages the nested relation internally to give the appearance of a nested hierarchy. When you create a nested document, Elasticsearch actually indexes two separate documents (root object and nested object), then relates the two internally. (more here)

I think you need to give Elasticsearch some additional information that indicates which author is the "primary/first" one. It is enough to add this extra field to just one author in each nested list (your mapping can stay as before), something like this:

{
    "authorList" : [
      {"lastName":"roger", "firstName":"christina", "authorOrder": 1},
      {"lastName":"freud", "firstName":"damian"}
    ]
},
{
    "authorList" : [
      {"lastName":"hawking", "firstName":"stephan", "authorOrder": 1},
      {"lastName":"adams", "firstName":"mark"},
      {"lastName":"frey", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"adams", "firstName":"monica", "authorOrder": 1},
      {"lastName":"adams", "firstName":"richard"}
    ]
}
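
Since the original array position is not available to the sort, the marker has to be set before indexing. The following is a minimal sketch of that step in plain Java, assuming the documents are built as maps on the client side. The class AuthorOrderTagger and its helper methods are hypothetical names used only for illustration, not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: marks the first author of a list before the document is indexed.
class AuthorOrderTagger {

    // Returns a copy of the author list in which only the first entry
    // carries "authorOrder": 1. The input list and its maps are not modified.
    static List<Map<String, Object>> tagFirstAuthor(List<Map<String, Object>> authors) {
        List<Map<String, Object>> tagged = new ArrayList<>();
        for (int i = 0; i < authors.size(); i++) {
            Map<String, Object> copy = new LinkedHashMap<>(authors.get(i));
            if (i == 0) {
                copy.put("authorOrder", 1);
            }
            tagged.add(copy);
        }
        return tagged;
    }

    // Builds one author entry with the two fields used in the question.
    static Map<String, Object> author(String lastName, String firstName) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("lastName", lastName);
        a.put("firstName", firstName);
        return a;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> authors = new ArrayList<>();
        authors.add(author("hawking", "stephan"));
        authors.add(author("frey", "richard"));
        // Only the first entry of the printed list contains authorOrder.
        System.out.println(tagFirstAuthor(authors));
    }
}
```

The tagged list would then be used as the authorList value of the document sent to Elasticsearch.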

The query could then be:

{
  "query" : {
    "nested" : {
      "query" : {
        "bool" : {
          "must" : [
            {
              "match" : {
                "authorList.authorOrder" : 1
              }
            }
          ]
        }
      },
      "path" : "authorList"
    }
  },
  "sort" : [
    {
      "authorList.lastName" : {
        "order" : "asc",
        "nested_filter" : {
          "bool" : {
            "must" : [
              {
                "match" : {
                  "authorList.authorOrder" : 1
                }
              }
            ]
          }
        },
        "nested_path" : "authorList"
      }
    }
  ]
}

And with the Java API:

QueryBuilder matchFirst = QueryBuilders.boolQuery()
        .must(QueryBuilders.matchQuery("authorList.authorOrder", 1));
QueryBuilder mainQuery = QueryBuilders.nestedQuery("authorList", matchFirst, ScoreMode.None);

SortBuilder sb = SortBuilders.fieldSort("authorList.lastName")
    .order(SortOrder.ASC)
    .setNestedPath("authorList")
    .setNestedFilter(matchFirst);

SearchRequestBuilder builder = client.prepareSearch("test")
        .setSize(50)
        .setQuery(mainQuery)
        .addSort(sb);

Note that the SortBuilder has .setNestedFilter(matchFirst), which means that sorting is based on the authorList.lastName field, but only of your "primary/first" nested elements. Without it, Elasticsearch would consider all nested documents of each parent, pick the first value from the ascending sorted list, and sort the parent documents by that value. So the document with "hawking" could come first, because it also contains the last name "adams".
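
To make the difference concrete, here is a small plain-Java simulation of the two ways a sort key can be chosen per parent document. It is only a sketch of the behavior described above, not how Elasticsearch is implemented. The class NestedSortSketch, the Author type, and the primary flag (standing in for "authorOrder": 1) are assumptions made for illustration, and every sample document is assumed to have exactly one primary author.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Hypothetical simulation of nested sorting with and without a nested filter.
class NestedSortSketch {

    // One nested author entry; primary == true stands in for "authorOrder": 1.
    static final class Author {
        final String lastName;
        final boolean primary;

        Author(String lastName, boolean primary) {
            this.lastName = lastName;
            this.primary = primary;
        }
    }

    // Sort key without a filter: the smallest last name among all authors.
    static String minKey(List<Author> authors) {
        return authors.stream()
                .map(a -> a.lastName)
                .min(Comparator.naturalOrder())
                .orElse("");
    }

    // Sort key with the filter: only primary authors are considered.
    static String filteredKey(List<Author> authors) {
        return authors.stream()
                .filter(a -> a.primary)
                .map(a -> a.lastName)
                .min(Comparator.naturalOrder())
                .orElse("");
    }

    // Sorts the documents by the given key and labels each one
    // with the last name of its first listed author.
    static List<String> order(List<List<Author>> docs, Function<List<Author>, String> key) {
        List<List<Author>> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparing(key));
        List<String> labels = new ArrayList<>();
        for (List<Author> doc : sorted) {
            labels.add(doc.get(0).lastName);
        }
        return labels;
    }

    // The three sample documents from the answer.
    static List<List<Author>> sampleDocs() {
        return Arrays.asList(
                Arrays.asList(new Author("roger", true), new Author("freud", false)),
                Arrays.asList(new Author("hawking", true), new Author("adams", false),
                        new Author("frey", false)),
                Arrays.asList(new Author("adams", true), new Author("adams", false)));
    }

    public static void main(String[] args) {
        List<List<Author>> docs = sampleDocs();
        System.out.println("without filter: " + order(docs, NestedSortSketch::minKey));
        System.out.println("with filter:    " + order(docs, NestedSortSketch::filteredKey));
    }
}
```

With the filter, the documents come out in the order adams, hawking, roger. Without it, the "hawking" document sorts under the key "adams" and moves ahead of "roger".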

The final result is:

{
    "authorList" : [
      {"lastName":"adams", "firstName":"monica", "authorOrder": 1},
      {"lastName":"adams", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"hawking", "firstName":"stephan", "authorOrder": 1},
      {"lastName":"adams", "firstName":"mark"},
      {"lastName":"frey", "firstName":"richard"}
    ]
},
{
    "authorList" : [
      {"lastName":"roger", "firstName":"christina", "authorOrder": 1},
      {"lastName":"freud", "firstName":"damian"}
    ]
}
