Elasticsearch - combine fields from multiple documents

Question

Let's say I have a bunch of documents like this:

{
    "foo" : [1, 2, 3]
}

{
    "foo" : [3, 4, 5]
}

For a query run against these documents, I'm looking for a way to return an array of all values for foo (ideally the unique values, but duplicates are OK):

{
    "foo" : [1, 2, 3, 3, 4, 5]
}

I've looked into the aggregations API, but I can't see how to achieve this, if it's possible at all. I could of course compile the results manually in code, but I could have thousands of documents, and it would be far cleaner to get the combined result directly from Elasticsearch.

Recommended answer

You can use Scripted Metric Aggregation with a reduce_script.

Set up some test data:

curl -XPUT http://localhost:9200/testing/foo/1 -d '{ "foo" : [1, 2, 3] }'
curl -XPUT http://localhost:9200/testing/foo/2 -d '{ "foo" : [4, 5, 6] }'

Now try the following aggregation:

curl -XGET "http://localhost:9200/testing/foo/_search" -d'
{
  "size": 0,
  "aggs": {
    "fooreduced": {
      "scripted_metric": {
        "init_script": "_agg[\"result\"] = []",
        "map_script":  "_agg.result.add(doc[\"foo\"].values)",
        "reduce_script": "reduced = []; for (a in _aggs) { for (entry in a) { reduced += entry.value } }; return reduced.flatten().sort()"
      }
    }
  }
}'
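To make the three script phases easier to follow, here is a minimal pure-Python model of what the request does. It is a sketch, not Elasticsearch code: the function names mirror the script parameters, the `run` helper is hypothetical, and the placement of the two test documents on two shards is an assumption made for illustration. The Groovy scripts in the request apply to older Elasticsearch releases that supported Groovy scripting.

```python
from itertools import chain


def init_script():
    # Mirrors init_script: every shard starts with its own state
    # that holds an empty "result" list.
    return {"result": []}


def map_script(agg, doc):
    # Mirrors map_script: called once per matching document on a shard;
    # appends that document's foo values as one nested list.
    agg["result"].append(doc["foo"])


def reduce_script(aggs):
    # Mirrors reduce_script: gather the nested lists from every shard
    # state, then flatten and sort them.
    reduced = []
    for agg in aggs:
        for _key, value in agg.items():
            reduced += value
    return sorted(chain.from_iterable(reduced))


def run(shards):
    # Hypothetical driver: runs init and map per shard, then one reduce.
    states = []
    for docs in shards:
        agg = init_script()
        for doc in docs:
            map_script(agg, doc)
        states.append(agg)
    return reduce_script(states)


# Assumed shard layout: one test document per shard.
shards = [[{"foo": [1, 2, 3]}], [{"foo": [4, 5, 6]}]]
print(run(shards))
```

The nested lists are the reason the reduce step needs a flatten: each document contributes a whole list, not individual values.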

The call returns the following:

{
  "took": 50,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "fooreduced": {
      "value": [
        1,
        2,
        3,
        4,
        5,
        6
      ]
    }
  }
}
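The question asks for unique values if possible, and the aggregation as written keeps duplicates when documents overlap (for example `[1, 2, 3]` and `[3, 4, 5]` from the question). One option is to deduplicate on the client once the response arrives. The sketch below assumes the response body is available as a JSON string; `unique_values` is a hypothetical helper, and the sample string is a trimmed, made-up response built from the question's data rather than real server output.

```python
import json

# Trimmed, hypothetical response using the documents from the question.
response_text = '{"aggregations": {"fooreduced": {"value": [1, 2, 3, 3, 4, 5]}}}'


def unique_values(response_text, agg_name="fooreduced"):
    # Parse the search response and read the scripted metric's value.
    body = json.loads(response_text)
    values = body["aggregations"][agg_name]["value"]
    # A set drops duplicates; sorting restores a stable order.
    return sorted(set(values))


print(unique_values(response_text))
```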

There may be a solution without .flatten(), but I don't know Groovy well enough (yet) to find one. I also can't say how well this aggregation performs; you will have to test that yourself.
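When testing performance, it may help to have the manual approach mentioned in the question as a baseline. The following is a minimal sketch of that client-side merge, assuming the hits have already been fetched and have the usual `_source` shape of a search response; `merge_foo` and the sample `hits` list are hypothetical and not part of the original answer.

```python
def merge_foo(hits, unique=False):
    # Collect the foo values from every hit's _source into one list.
    merged = []
    for hit in hits:
        merged.extend(hit["_source"].get("foo", []))
    # Optionally drop duplicates, then sort for a stable result.
    return sorted(set(merged)) if unique else sorted(merged)


# Hypothetical hits shaped like the "hits.hits" array of a search response.
hits = [
    {"_id": "1", "_source": {"foo": [1, 2, 3]}},
    {"_id": "2", "_source": {"foo": [3, 4, 5]}},
]
print(merge_foo(hits))
print(merge_foo(hits, unique=True))
```

This version transfers every matching document to the client, which is the cost the aggregation is meant to avoid.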
