ElasticSearch date range query aggregation using Java API


Problem description


Hi, I have documents for CPU usage, each with a date_time field. I would like to find the average CPU usage over a date range. I have come up with the following solution. I am new to Elasticsearch, so please let me know if there is a more advanced or better approach.

client.prepareSearch("myindex").
       setTypes("mytype").
       setQuery(
           QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(),
           FilterBuilders.andFilter(FilterBuilders.termFilter("server","x"),
           // the field name is a string; fdate and tdate are variables holding the range bounds
           FilterBuilders.rangeFilter("date_time").from(fdate).to(tdate)))).get();

The query above returns the expected documents, that is, those that fall within the from/to date range. Next, I collect all unique dates from these documents using SearchHits and store them in a HashSet. Then, for each item in this HashSet, I execute the following query:

client.prepareSearch("myindex").
       setTypes("mytype").
       setQuery(
           QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(),
           FilterBuilders.andFilter(FilterBuilders.termFilter("server","x"),
           // dateinputfromloop is the current date value taken from the HashSet
           FilterBuilders.termFilter("date_time", dateinputfromloop)))).
       addAggregation(AggregationBuilders.avg("cpu_agg").field("cpu_time"))
       .get();

This query works fine and gives me the average CPU for each date-time combination. I was wondering if there is a better approach, since I execute the query in a loop for every date combination. Please guide me. Thanks in advance.

Solution

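So, I assume you have data for each day. You can use the date_histogram aggregation (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html) for this, so that your two requests can be done in a single request.

Here is the code: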

client.prepareSearch("myindex").
       setTypes("mytype").
       setQuery(
           QueryBuilders.filteredQuery(QueryBuilders.matchAllQuery(),
           FilterBuilders.andFilter(FilterBuilders.termFilter("server","x"),
           // fdate and tdate are the range bounds from the question (variables, not string literals)
           FilterBuilders.rangeFilter("date_time").from(fdate).to(tdate)))).
       addAggregation(
           // one bucket per day, with the average cpu_time computed inside each bucket
           AggregationBuilders.dateHistogram("dateagg").field("date_time").interval(DateHistogram.Interval.DAY)
               .subAggregation(
                   AggregationBuilders.avg("cpu_agg").field("cpu_time")
               )
       )
       .get();

You can change the interval in the dateHistogram aggregation to fit your needs.
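To make the bucketing concrete, here is a minimal plain-Java sketch of what a day-interval histogram with an average sub-aggregation computes. It does not call Elasticsearch at all; the class DailyCpuAverage, the Sample type, and the use of UTC day boundaries are assumptions made only for illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java illustration only; this is not the Elasticsearch API.
final class DailyCpuAverage {

    // Hypothetical stand-in for one indexed document.
    static final class Sample {
        final Instant dateTime; // mirrors the date_time field
        final double cpuTime;   // mirrors the cpu_time field

        Sample(Instant dateTime, double cpuTime) {
            this.dateTime = dateTime;
            this.cpuTime = cpuTime;
        }
    }

    // Groups samples into one bucket per UTC calendar day (comparable to a DAY interval)
    // and averages cpuTime inside each bucket (comparable to the avg sub-aggregation).
    static Map<LocalDate, Double> averagePerDay(List<Sample> samples) {
        return samples.stream().collect(Collectors.groupingBy(
                s -> s.dateTime.atZone(ZoneOffset.UTC).toLocalDate(),
                TreeMap::new,
                Collectors.averagingDouble(s -> s.cpuTime)));
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
                new Sample(Instant.parse("2015-01-01T00:10:00Z"), 10.0),
                new Sample(Instant.parse("2015-01-01T18:30:00Z"), 20.0),
                new Sample(Instant.parse("2015-01-02T09:00:00Z"), 30.0));
        // Prints one line per day bucket, in date order.
        averagePerDay(samples).forEach((day, avg) -> System.out.println(day + " -> " + avg));
    }
}
```

The point of the single-request approach is that this grouping happens on the server, so the client no longer needs to collect unique dates and loop over them.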

If you want buckets based on each unique date_time value (down to the millisecond), you can use a terms aggregation on the date field instead of a date histogram aggregation.

Terms Aggregation

A multi-bucket value source based aggregation where buckets are dynamically built - one per unique value.
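As a rough illustration of "one bucket per unique value", the plain-Java sketch below groups readings by their exact millisecond timestamp and keeps a count and an average per bucket. It is not Elasticsearch code; the class UniqueTimestampCpuStats and the Sample type are hypothetical names used only for this example.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java illustration only; this is not the Elasticsearch API.
final class UniqueTimestampCpuStats {

    // Hypothetical stand-in for one indexed document.
    static final class Sample {
        final long epochMillis; // mirrors date_time as epoch milliseconds
        final double cpuTime;   // mirrors the cpu_time field

        Sample(long epochMillis, double cpuTime) {
            this.epochMillis = epochMillis;
            this.cpuTime = cpuTime;
        }
    }

    // Builds one bucket per distinct timestamp; each bucket carries the number of
    // samples and the statistics (including the average) of their cpuTime values.
    static Map<Long, DoubleSummaryStatistics> statsPerTimestamp(List<Sample> samples) {
        return samples.stream().collect(Collectors.groupingBy(
                s -> s.epochMillis,
                TreeMap::new,
                Collectors.summarizingDouble(s -> s.cpuTime)));
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
                new Sample(1000L, 10.0),
                new Sample(1000L, 20.0),
                new Sample(2000L, 30.0));
        // Prints timestamp, sample count and average for each bucket.
        statsPerTimestamp(samples).forEach((ts, stats) ->
                System.out.println(ts + " count=" + stats.getCount() + " avg=" + stats.getAverage()));
    }
}
```

Note that with millisecond-precision timestamps the number of distinct values, and therefore buckets, can grow very large, which is one reason a fixed interval is often the more practical choice.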

Hope this helps. Thanks.
