How to group results in Elasticsearch?


Problem description


I am storing Book Titles in elasticsearch and they all belong to many shops. Like this:

{
    "books": [
        {
            "id": 1,
            "title": "Title 1",
            "store": "store1" 
        },
        {             
            "id": 2,
            "title": "Title 1",
            "store": "store2" 
        },
        {             
            "id": 3,
            "title": "Title 1",
            "store": "store3" 
        },
        {             
            "id": 4,
            "title": "Title 2",
            "store": "store2" 
        },
        {             
            "id": 5,
            "title": "Title 2",
            "store": "store3" 
        }
    ]
}

How can I get all the books grouped by title, with one result per group (one row per title, so that I can get all the ids and stores for that title)?

Based on the data above, I want to get two results, each containing all of its ids and stores.

Expected results:

{
"hits":{
    "total" : 2,
    "hits" : [
        {                
            "0" : {
                "title" : "Title 1",
                "group": [
                     {
                         "id": 1,
                         "store": "store1"
                     },
                     {
                         "id": 2,
                         "store": "store2"
                     },
                     {
                         "id": 3,
                         "store": "store3"
                     }
                ]
            }
        },
        {                
            "1" : {
                "title" : "Title 2",
                "group": [
                     {
                         "id": 4,
                         "store": "store2"
                     },
                     {
                         "id": 5,
                         "store": "store3"
                     }
                ]
            }
        }
    ]
}
}

Solution

What you are looking for is not possible in Elasticsearch, at least not with the current version (1.1).

There is a long-standing open issue for this feature with a lot of +1's and demand behind it.

As for official statements: Simon says that it requires a lot of refactoring and, although it is planned, there is no way of saying when it will be implemented or shipped.

Clinton Gormley made a similar statement in his webinar: field grouping takes a lot of effort to get right, especially since Elasticsearch is sharded and distributed by nature. It would not be a big deal if you ignored sharding, but Elasticsearch only wants to ship features that scale with the whole system and work as well on hundreds of machines as they do on a single box.

If you're not tied to Elasticsearch, Solr offers such a feature.

Otherwise, probably the best solution at the moment is to do this on the client side. That is, query for some documents, do the grouping on your client and, if needed, fetch more results to satisfy your desired group size (as far as I know, this is what Solr does under the hood).
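
The client-side grouping described above could look roughly like this in Python. It is only a minimal sketch: it assumes the hits have already been fetched and that each hit carries the book under `_source`, as in a standard search response. `group_by_title` and `sample_hits` are hypothetical names used for illustration, not part of any Elasticsearch client.

```python
from collections import OrderedDict


def group_by_title(hits):
    """Group search hits by title, keeping the id and store of each book.

    `hits` is assumed to be the list found under hits.hits in a search
    response, with each book stored in the hit's `_source`.
    """
    groups = OrderedDict()
    for hit in hits:
        book = hit["_source"]
        groups.setdefault(book["title"], []).append(
            {"id": book["id"], "store": book["store"]}
        )
    # One entry per title, in order of first appearance.
    return [{"title": title, "group": members} for title, members in groups.items()]


# Sample hits built from the question's data (not a real server response).
sample_hits = [
    {"_source": {"id": 1, "title": "Title 1", "store": "store1"}},
    {"_source": {"id": 2, "title": "Title 1", "store": "store2"}},
    {"_source": {"id": 3, "title": "Title 1", "store": "store3"}},
    {"_source": {"id": 4, "title": "Title 2", "store": "store2"}},
    {"_source": {"id": 5, "title": "Title 2", "store": "store3"}},
]

grouped = group_by_title(sample_hits)
```

If a page of hits does not yield enough groups, the client would fetch the next page and run the grouping again over the combined hits.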

Not exactly what you wanted, but you could also go for aggregations: create one bucket per title and run a sub-aggregation on the id field. You won't get the store values this way, but you could retrieve them from your datastore once you have the ids.

{
    "aggs" : {
        "titles" : {
            "terms" : { "field" : "title" },
            "aggs": {
                "ids": {
                    "terms": { "field" : "id" }
                }
            }
        }
    }
}
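
To turn the response of this aggregation into a title-to-ids mapping, something like the following Python sketch may help. It assumes the usual terms-aggregation response layout (`aggregations.<name>.buckets`, each bucket with a `key`) and that `title` is indexed as a single unanalyzed term; with an analyzed field, the buckets would be individual words rather than whole titles. `ids_by_title` and `sample_response` are hypothetical, and the sample is hand-written for illustration, not real server output.

```python
def ids_by_title(response):
    """Map each title bucket to the ids found in its `ids` sub-aggregation."""
    result = {}
    for title_bucket in response["aggregations"]["titles"]["buckets"]:
        result[title_bucket["key"]] = [
            id_bucket["key"] for id_bucket in title_bucket["ids"]["buckets"]
        ]
    return result


# Hand-written response in the shape a terms aggregation returns.
sample_response = {
    "aggregations": {
        "titles": {
            "buckets": [
                {
                    "key": "Title 1",
                    "doc_count": 3,
                    "ids": {
                        "buckets": [
                            {"key": 1, "doc_count": 1},
                            {"key": 2, "doc_count": 1},
                            {"key": 3, "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "Title 2",
                    "doc_count": 2,
                    "ids": {
                        "buckets": [
                            {"key": 4, "doc_count": 1},
                            {"key": 5, "doc_count": 1},
                        ]
                    },
                },
            ]
        }
    }
}

title_ids = ids_by_title(sample_response)
```

Note that a terms aggregation returns only the top buckets by default (10), so both levels may need a larger `size` if there are many titles or many ids per title.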

Edit: It seems that with the top_hits aggregation, result grouping could be implemented soon.
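
For readers on a version that has it, a `top_hits` sub-aggregation under the title buckets gets close to the expected result. To my knowledge it arrived in Elasticsearch 1.3, so it is not available on the 1.1 release discussed above. The Python sketch below builds such a request body and reshapes a response. It assumes `title` is indexed as a single unanalyzed term; the aggregation name `group`, the helper names, and `sample_response` are hypothetical, and the sample is hand-written rather than real server output.

```python
def build_grouping_query(max_per_group=10):
    """Request body: one bucket per title, each carrying its top documents."""
    return {
        "size": 0,  # skip the normal hit list; only the aggregation is needed
        "aggs": {
            "titles": {
                "terms": {"field": "title"},
                "aggs": {
                    # top_hits returns only 3 documents per bucket by default
                    "group": {"top_hits": {"size": max_per_group}}
                },
            }
        },
    }


def parse_groups(response):
    """Reshape the aggregation response into title/group entries."""
    groups = []
    for bucket in response["aggregations"]["titles"]["buckets"]:
        members = [
            {"id": hit["_source"]["id"], "store": hit["_source"]["store"]}
            for hit in bucket["group"]["hits"]["hits"]
        ]
        groups.append({"title": bucket["key"], "group": members})
    return groups


# Hand-written response in the shape a terms + top_hits aggregation returns.
sample_response = {
    "aggregations": {
        "titles": {
            "buckets": [
                {
                    "key": "Title 2",
                    "doc_count": 2,
                    "group": {
                        "hits": {
                            "total": 2,
                            "hits": [
                                {"_source": {"id": 4, "title": "Title 2", "store": "store2"}},
                                {"_source": {"id": 5, "title": "Title 2", "store": "store3"}},
                            ],
                        }
                    },
                }
            ]
        }
    }
}

query = build_grouping_query(max_per_group=5)
groups = parse_groups(sample_response)
```

Unlike the ids sub-aggregation, this returns the store values directly, so no second lookup in the datastore is needed.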
