Aggregation, Date range query in Elassandra/Elastic Search

Problem description

I am getting unexpected results when running a date range aggregation against an index.

The index was created as shown below.

curl -XPUT -H 'Content-Type: application/json' 'http://x.x.x.x:9200/date_index' -d '{
  "settings" : { "keyspace" : "keyspace1"},
  "mappings" : {
    "table1" : {
      "discover":"sent_date",
      "properties" : {
        "sent_date" : { "type": "date", "format": "yyyy-MM-dd HH:mm:ssZZ" }
      }
    }
  }
}'

When I search with the request below, I get results from different date ranges.

curl -XGET -H 'Content-Type: application/json' 'http://x.x.x.x:9200/date_index/_search?pretty=true' -d '
{
  "aggs" : {
    "sentdate_range_search" : {
      "date_range" : {
        "field" : "sent_date",
        "time_zone": "UTC",
        "format" : "yyyy-MM-dd HH:mm:ssZZ",
        "ranges" : [
          { "from" : "2010-05-07 11:22:34+0000", "to" : "2011-05-07 11:22:34+0000"}
        ]
      }
    }
  }
}'

Sample output, showing hits from different years such as 2039 and 2024, while the bucket for the requested range reports a doc_count of 0:

{
  "took" : 26,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 417427,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "date_index",
        "_type" : "table1",
        "_id" : "P89200822_4210021505784",
        "_score" : 1.0,
        "_source" : {
          "sent_date" : "2039-05-22T14:45:39.000Z"
        }
      },
      {
        "_index" : "date_index",
        "_type" : "table1",
        "_id" : "P89200605_4210020537428",
        "_score" : 1.0,
        "_source" : {
           "sent_date" : "2024-06-05T07:20:57.000Z"
        }
      },
      .........
  "aggregations" : {
    "sentdate_range_search" : {
      "buckets" : [
        {
          "key" : "2010-05-07 11:22:34+00:00-2011-05-07 11:22:34+00:00",
          "from" : 1.273231354E12,
          "from_as_string" : "2010-05-07 11:22:34+00:00",
          "to" : 1.304767354E12,
          "to_as_string" : "2011-05-07 11:22:34+00:00",
          "doc_count" : 0
        }
      ]
    }
  }

FYI: the data resides in a Cassandra database, where the "sent_date" field is stored in the UTC timezone.

Please advise. Thanks.

Recommended answer

== Reworked answer based on conversation in the comments ==

Aggregations are different from search queries. An aggregation combines records (i.e. aggregates them) along specified dimensions; it does not filter the hits. The query in the question aggregates the records that fall between the two specified dates into a single bucket, while the hits are still drawn from every document in the index. That is why the output lists documents dated 2039 and 2024 next to a bucket whose doc_count is 0. More information on aggregations can be found in the Elasticsearch documentation.
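To make the difference concrete, here is a minimal pure-Python sketch. It does not call Elasticsearch; it only imitates the two behaviours on the two sample documents from the question. The helper names (parse_utc, in_bucket, matches_filter) are made up for this illustration.

```python
from datetime import datetime, timezone


def parse_utc(text):
    """Parse a timestamp such as '2039-05-22T14:45:39.000Z' as UTC."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


# Two documents copied from the sample output in the question.
docs = [
    {"_id": "P89200822_4210021505784", "sent_date": "2039-05-22T14:45:39.000Z"},
    {"_id": "P89200605_4210020537428", "sent_date": "2024-06-05T07:20:57.000Z"},
]

range_from = datetime(2010, 5, 7, 11, 22, 34, tzinfo=timezone.utc)
range_to = datetime(2011, 5, 7, 11, 22, 34, tzinfo=timezone.utc)


def in_bucket(doc):
    # A date_range bucket includes "from" and excludes "to".
    return range_from <= parse_utc(doc["sent_date"]) < range_to


def matches_filter(doc):
    # A range filter with gte/lte includes both ends.
    return range_from <= parse_utc(doc["sent_date"]) <= range_to


# Aggregation only: all documents remain hits; the bucket merely counts.
hits_with_aggregation = docs
bucket_doc_count = sum(1 for doc in docs if in_bucket(doc))

# Range filter: documents outside the range are removed from the hits.
hits_with_filter = [doc for doc in docs if matches_filter(doc)]

print(len(hits_with_aggregation), bucket_doc_count, len(hits_with_filter))
```

With these two sample documents, the aggregation-only case keeps both hits and counts zero documents in the bucket, which mirrors the output in the question; the filter case returns no hits at all.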

Since the requirement is to filter records that fall between two dates, a date range filter is the appropriate approach:

GET date_index/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "sent_date": {
            "gte": "2010-05-07 11:22:34+0000",
            "lte": "2011-05-07 11:22:34+0000"
          }
        }
      }
    }
  }
}

Why a filter instead of a regular query? Filters are faster than scored queries because they do not contribute to document scoring and they are cacheable. You can combine filters and queries to, for example, get all records within the given time range that match the phrase "all work and no play makes jack a dull boy".
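As a rough sketch of that combination, the Python snippet below builds a bool query body with a scored match_phrase clause under "must" and the date range under "filter". The function name build_search_body and the text field name "message" are assumptions for illustration; the mapping in the question only defines sent_date, so substitute whichever text field the index actually has.

```python
import json


def build_search_body(gte, lte, phrase, text_field="message"):
    """Build a bool query: scored phrase match plus a non-scoring date filter.

    text_field is a hypothetical field name, not part of the original mapping.
    """
    return {
        "query": {
            "bool": {
                # Scored clause: contributes to relevance ranking.
                "must": {"match_phrase": {text_field: phrase}},
                # Filter clause: restricts hits without affecting the score.
                "filter": {"range": {"sent_date": {"gte": gte, "lte": lte}}},
            }
        }
    }


body = build_search_body(
    "2010-05-07 11:22:34+0000",
    "2011-05-07 11:22:34+0000",
    "all work and no play makes jack a dull boy",
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a date_index/_search request, in the same way as the filter-only query above.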
