Elasticsearch - combine fields from multiple documents
Question
Let's say I have a bunch of documents like this:
{
"foo" : [1, 2, 3]
}
{
"foo" : [3, 4, 5]
}
For a query run against these documents, I'm looking for a way to return an array of all values of foo (ideally the unique values, but duplicates are OK):
{
"foo" : [1, 2, 3, 3, 4, 5]
}
I've looked into the aggregations APIs, but I can't see how to achieve this, if it's at all possible. I could of course compile the results manually in code; however, I could have thousands of documents, and it would be far cleaner to obtain the result this way.
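For reference, the manual client-side approach mentioned above is straightforward but requires pulling back every hit. A minimal Python sketch, using a hypothetical hits list shaped like the standard _search response:

```python
# Hypothetical hits, mimicking the two example documents above.
hits = [
    {"_source": {"foo": [1, 2, 3]}},
    {"_source": {"foo": [3, 4, 5]}},
]

# Flatten all "foo" arrays into one list.
merged = []
for hit in hits:
    merged.extend(hit["_source"]["foo"])

# Deduplicate and sort, since unique values are preferred.
unique = sorted(set(merged))

print(merged)  # [1, 2, 3, 3, 4, 5]
print(unique)  # [1, 2, 3, 4, 5]
```

The drawback is that this fetches every document over the wire, which is what the aggregation below avoids.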
Answer
You can use a Scripted Metric Aggregation with a reduce_script.
Set up some test data:
curl -XPUT http://localhost:9200/testing/foo/1 -d '{ "foo" : [1, 2, 3] }'
curl -XPUT http://localhost:9200/testing/foo/2 -d '{ "foo" : [4, 5, 6] }'
Now try the following aggregation:
curl -XGET "http://localhost:9200/testing/foo/_search" -d'
{
"size": 0,
"aggs": {
"fooreduced": {
"scripted_metric": {
"init_script": "_agg[\"result\"] = []",
"map_script": "_agg.result.add(doc[\"foo\"].values)",
"reduce_script": "reduced = []; for (a in _aggs) { for (entry in a) { word = entry.key; reduced += entry.value } }; return reduced.flatten().sort()"
}
}
}
}'
The call returns the following:
{
"took": 50,
"timed_out": false,
"_shards": {
"total": 6,
"successful": 6,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"fooreduced": {
"value": [
1,
2,
3,
4,
5,
6
]
}
}
}
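The merged array sits under aggregations.fooreduced.value in the response body. A short Python sketch of reading it out client-side (the JSON below is the relevant fragment of the response shown above):

```python
import json

# Relevant fragment of the _search response shown above.
body = '''
{
  "aggregations": {
    "fooreduced": { "value": [1, 2, 3, 4, 5, 6] }
  }
}
'''

response = json.loads(body)

# Navigate to the aggregation result by its name, "fooreduced".
values = response["aggregations"]["fooreduced"]["value"]
print(values)  # [1, 2, 3, 4, 5, 6]
```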
It might be possible that there is a solution without .flatten(), but I'm not that much into Groovy (yet) to find one. And I can't say how good the performance of this aggregation is; you have to test it for yourself.