Elastic Search Query for Distinct Nested Values


Question

I am using the High Level REST Client for Elastic Search 6.2.2. Suppose that I have two documents in the index "DOCUMENTS" with type "DOCUMENTS":

{
   "_id": 1,
   "Name": "John",
   "FunFacts": {
       "FavColor": "Green",
       "Age": 32
   }
},
{
   "_id": 2,
   "Name": "Amy",
   "FunFacts": {
       "FavFood": "Pizza",
       "Age": 33
   }
}

I want to find all of the distinct fun facts and their distinct values, ultimately returning this result:

{
    "FavColor": ["Green"],
    "Age": [32, 33],
    "FavFood": ["Pizza"]
}

It is OK if this requires more than one query to Elastic Search, but I would prefer a single query. Furthermore, the Elastic Search index may grow rather large, so as much of the work as possible must happen on the ES instance.

This code seems to produce a list of documents containing only FunFacts, but I must still perform the aggregation myself, which is highly undesirable.

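import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Search every document in the index, returning only the FunFacts part of _source.
SearchRequest searchRequest = new SearchRequest("DOCUMENTS");
searchRequest.types("DOCUMENTS");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
String[] includes = {"FunFacts"};
String[] excludes = {"Name"};
searchSourceBuilder.fetchSource(includes, excludes);
searchRequest.source(searchSourceBuilder);

// restHighLevelClient is an already-configured RestHighLevelClient (6.2.x API).
SearchResponse searchResponse =
    restHighLevelClient.search(searchRequest);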

Can anyone point me in the right direction? Nearly all of the Elastic Search documentation comes in the form of curl commands, which does not help me because I am not well versed enough to translate such commands to Java.

Here is the plot twist: since users are allowed to decide what their fun facts will be, we cannot know ahead of time what the keys inside the FunFacts map will be. :/

Thanks, Matt

Recommended answer

You can do it all in one query by using aggregations. Assuming you have the following documents in your index

{
   "Name": "Jake",
   "FunFacts": {
       "FavFood": "Burgers",
       "Age": 32
   }
}

{
   "Name": "Amy",
   "FunFacts": {
       "FavFood": "Pizza",
       "Age": 33
   }
}

{
   "Name": "Alex",
   "FunFacts": {
       "FavFood": "Burgers",
       "Age": 28
   }
}

and you want to get the distinct "FavFood" choices, you could do so with the following terms aggregation (see the Elasticsearch documentation on terms aggregations):

{
  "aggs": {
    "distinct_fun_facts": {
      "terms": { "field": "FunFacts.FavFood" }
    }
  }
}

which will result in something like this:

{
  ...
  "hits": { ... },
  "aggregations": {
    "distinct_fun_facts": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "burgers",
          "doc_count": 2
        },
        {
          "key": "pizza",
          "doc_count": 1
        }
      ]
    }
  }
}

For brevity I kept only the aggregations part of the response in full. The important thing to notice is the buckets array: each bucket represents one distinct term that was found (key), together with its number of occurrences within your documents (doc_count).
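The answer above covers a single key. As a rough sketch of how it might extend to several FunFacts keys in one request, the plain Java helper below builds a search body with one terms aggregation per key. The class name FunFactsAggs, the method buildRequestBody, and the maxValuesPerKey parameter are hypothetical names made up for this illustration. It assumes the key names are already known (for example, read from the index mapping beforehand) and that they contain no characters that need JSON escaping. With default dynamic mapping in Elasticsearch 6.x, string values are usually indexed as text with a keyword sub-field, so the field path may need to be "FunFacts.FavFood.keyword" rather than "FunFacts.FavFood". With the High Level REST Client, the same aggregations should be expressible with AggregationBuilders.terms(name).field(path), added through SearchSourceBuilder.aggregation(...).

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

class FunFactsAggs {

    // Builds a search body containing one terms aggregation per FunFacts key.
    // Assumption: the keys are known up front and need no JSON escaping.
    public static String buildRequestBody(List<String> keys, int maxValuesPerKey) {
        StringJoiner aggs = new StringJoiner(",");
        for (String key : keys) {
            // Each aggregation is named after its key and targets FunFacts.<key>.
            aggs.add("\"" + key + "\":{\"terms\":{\"field\":\"FunFacts." + key
                    + "\",\"size\":" + maxValuesPerKey + "}}");
        }
        // "size": 0 suppresses the hits, so only the aggregations are returned.
        return "{\"size\":0,\"aggs\":{" + aggs + "}}";
    }

    public static void main(String[] args) {
        // Prints the JSON body for three example keys taken from the question.
        System.out.println(buildRequestBody(Arrays.asList("FavColor", "FavFood", "Age"), 100));
    }
}
```

Each aggregation in the response then carries its own buckets array, so the bucket keys under "FavColor", "FavFood", and "Age" give the distinct values per fun fact without any client-side aggregation over the documents.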

Hope that helps.
