How to group results in Elasticsearch?

Question

I am storing book titles in Elasticsearch, and the same title can belong to many shops. Like this:

{
    "books": [
        {
            "id": 1,
            "title": "Title 1",
            "store": "store1" 
        },
        {             
            "id": 2,
            "title": "Title 1",
            "store": "store2" 
        },
        {             
            "id": 3,
            "title": "Title 1",
            "store": "store3" 
        },
        {             
            "id": 4,
            "title": "Title 2",
            "store": "store2" 
        },
        {             
            "id": 5,
            "title": "Title 2",
            "store": "store3" 
        }
    ]
}

How can I get all the books grouped by title, with one result per group (one row per title, so that I can get all the ids and stores for that title)?

Based on the data above, I want to get two results, each containing all of its ids and stores.

Expected result:

{
"hits":{
    "total" : 2,
    "hits" : [
        {                
            "0" : {
                "title" : "Title 1",
                "group": [
                     {
                         "id": 1,
                         "store": "store1"
                     },
                     {
                         "id": 2,
                         "store": "store2"
                     },
                     {
                         "id": 3,
                         "store": "store3"
                     }
                ]
            }
        },
        {                
            "1" : {
                "title" : "Title 2",
                "group": [
                     {
                         "id": 4,
                         "store": "store2"
                     },
                     {
                         "id": 5,
                         "store": "store3"
                     }
                ]
            }
        }
    ]
}
}

Recommended answer

What you are looking for is not possible in Elasticsearch, at least not with the current version (1.1).

There is a long-outstanding issue for this feature, with a lot of +1s and demand behind it.

As for statements: Simon says it requires a lot of refactoring, and although it is planned, there is no way of saying when it will be implemented or even shipped.

Clinton Gormley made a similar statement in his webinar: field grouping takes a lot of effort to do right, especially since Elasticsearch is a sharded and distributed environment by nature. It would not be that big of a deal if you ignored sharding, but Elasticsearch wants to ship only features that scale with the complete system and work as well on hundreds of machines as they would on a single box.

If you're not tied to Elasticsearch, Solr offers such a feature.

Otherwise, probably the best solution at the moment is to do this client side. That is, query for some documents, do the grouping on your client and, if needed, fetch some more results to satisfy your desired group size (as far as I know, this is what Solr is doing under the hood).
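A minimal Python sketch of the grouping step only, assuming the documents have already been fetched and look like the `_source` of the books in the question. The function name `group_by_title` and the output shape (`title` plus `group`) are my own choices, modelled on the expected result in the question; fetching further pages until each group is large enough is left out.

```python
def group_by_title(docs):
    """Group flat book documents by title, keeping first-seen title order."""
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for doc in docs:
        # Create the list for a new title on first sight, then append to it.
        groups.setdefault(doc["title"], []).append(
            {"id": doc["id"], "store": doc["store"]}
        )
    return [{"title": title, "group": members} for title, members in groups.items()]


# The five books from the question, as they might come back from a search.
books = [
    {"id": 1, "title": "Title 1", "store": "store1"},
    {"id": 2, "title": "Title 1", "store": "store2"},
    {"id": 3, "title": "Title 1", "store": "store3"},
    {"id": 4, "title": "Title 2", "store": "store2"},
    {"id": 5, "title": "Title 2", "store": "store3"},
]

grouped = group_by_title(books)
for entry in grouped:
    print(entry["title"], entry["group"])
```

Keep in mind that grouping on the client only sees the documents you fetched, so totals per group are only complete once every matching document has been retrieved.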

Not exactly what you wanted, but you could also go for aggregations: create one bucket per title and run a sub-aggregation on the id field. You won't get the store values this way, but you can retrieve them from your datastore once you have the ids.

{
    "aggs" : {
        "titles" : {
            "terms" : { "field" : "title" },
            "aggs": {
                "ids": {
                    "terms": { "field" : "id" }
                }
            }
        }
    }
}
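To read the ids back out of such a response, something like the following Python sketch could work. It assumes the response has already been parsed into a dict and follows the usual terms aggregation layout (`aggregations` → `titles` → `buckets`, each bucket with a `key` and a nested `ids` aggregation). The helper name `ids_by_title` is hypothetical, and `sample_response` is hand-written for illustration, not real output. Two caveats as far as I know: a terms aggregation buckets on indexed terms, so `title` would need to be indexed unanalyzed to get whole titles as keys, and it returns only the top 10 buckets unless you set `size`.

```python
def ids_by_title(response):
    """Map each title bucket key to the list of id bucket keys inside it."""
    result = {}
    for title_bucket in response["aggregations"]["titles"]["buckets"]:
        result[title_bucket["key"]] = [
            id_bucket["key"] for id_bucket in title_bucket["ids"]["buckets"]
        ]
    return result


# Hand-written sample shaped like a nested terms aggregation response.
sample_response = {
    "aggregations": {
        "titles": {
            "buckets": [
                {
                    "key": "Title 1",
                    "doc_count": 3,
                    "ids": {"buckets": [
                        {"key": 1, "doc_count": 1},
                        {"key": 2, "doc_count": 1},
                        {"key": 3, "doc_count": 1},
                    ]},
                },
                {
                    "key": "Title 2",
                    "doc_count": 2,
                    "ids": {"buckets": [
                        {"key": 4, "doc_count": 1},
                        {"key": 5, "doc_count": 1},
                    ]},
                },
            ]
        }
    }
}

print(ids_by_title(sample_response))
```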

Edit: It seems that with the top_hits aggregation, result grouping could be implemented soon.
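As far as I know, top_hits later shipped in Elasticsearch 1.3. Here is a hedged Python sketch of how it could produce the grouped result from the question: a terms aggregation on `title` with a top_hits sub-aggregation that returns the matching documents per bucket. The aggregation names `titles` and `books`, both helper functions and their parameters are my own. It also assumes `title` is indexed so that each whole title is a single term.

```python
def top_hits_query(field="title", max_groups=10, hits_per_group=10):
    """Build a request body: one bucket per title, with the documents inside it."""
    return {
        "size": 0,  # skip the flat hit list, only the aggregation is needed
        "aggs": {
            "titles": {
                "terms": {"field": field, "size": max_groups},
                "aggs": {
                    # top_hits returns actual documents for each bucket
                    "books": {"top_hits": {"size": hits_per_group}}
                },
            }
        },
    }


def groups_from_response(response):
    """Reshape the aggregation response into title/group entries."""
    groups = []
    for bucket in response["aggregations"]["titles"]["buckets"]:
        members = [
            {"id": hit["_source"]["id"], "store": hit["_source"]["store"]}
            for hit in bucket["books"]["hits"]["hits"]
        ]
        groups.append({"title": bucket["key"], "group": members})
    return groups
```

The dict returned by `top_hits_query()` would be sent as the search request body, and the parsed JSON response passed to `groups_from_response`. Note that `hits_per_group` caps how many documents come back per title, so large groups may be truncated.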
