Leap year and aggregations per week in Elasticsearch


Question

I'm doing a nested aggregation in Elasticsearch: per year, and then per week within each year. Leap years have 53 weeks, but the result from Elasticsearch gives the last week of a leap year the key "1" and not "53". How can I make Elasticsearch return 53 instead of 1 for the last week?

Here is my query:

GET _search
{
  "size": 0,
  "aggs": {
    "activities_per_year": {
      "date_histogram": {
        "field": "start",
        "interval": "1y",
        "format": "yyyy"
      },
      "aggs": {
        "activities_per_week": {
          "date_histogram": {
            "field": "start",
            "interval": "week",
            "format": "w"
          }
        }
      }
    }
  }
}

And the result (data in the middle removed):

"key_as_string": "2008",
           "key": 1199145600000,
           "doc_count": 872,
           "activities_per_week": {
              "buckets": [
                 {
                    "key_as_string": "1",
                    "key": 1199059200000,
                    "doc_count": 6
                 },
                 {
                    "key_as_string": "2",
                    "key": 1199664000000,
                    "doc_count": 5
                 },
                 {
                    "key_as_string": "3",
                    "key": 1200268800000,
                    "doc_count": 15
                 },
                 {
                    "key_as_string": "51",
                    "key": 1229299200000,
                    "doc_count": 18
                 },
                 {
                    "key_as_string": "52",
                    "key": 1229904000000,
                    "doc_count": 7
                 },
                 {
                    "key_as_string": "1",
                    "key": 1230508800000,
                    "doc_count": 1
                 }
              ]

2008 is a leap year, and its last week has "key_as_string": "1". I want this to be 53 so that I can add it to my dictionary :) How can I do this?

Also, Elasticsearch returns two weeks with "key_as_string": "1" for the year 2013, and I don't think 2013 is a leap year.

Recommended answer

This has some subtle gotchas that one needs to be aware of. First of all, Elasticsearch uses the Joda-Time API for date-time handling.

Secondly, take a look at this explanation of what a "week" actually is:

A week based year is one where dates are expressed as a day of week, week number and year (week based). The following description is of the ISO8601 standard used by implementations of this method in this library.

Weeks run from 1 to 52-53 in a week based year. The first day of the week is defined as Monday and given the value 1.

The first week of a year is defined as the first week that has at least four days in the year. As a result of this definition, week 1 may extend into the previous year, and week 52/53 may extend into the following year. Hence the need for the year of weekyear field.

For example, 2003-01-01 was a Wednesday. This means that five days, Wednesday to Sunday, of that week are in 2003. Thus the whole week is considered to be the first week of 2003. Since all weeks start on Monday, the first week of 2003 started on 2002-12-30, ie. in 2002.

The week based year has a specific text format. 2002-12-30 (Monday 30th December 2002) would be represented as 2003-W01-1. 2003-01-01 (Wednesday 1st January 2003) would be represented as 2003-W01-3.
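The week date format quoted above can be reproduced outside Elasticsearch. The following is a minimal Python sketch, not part of the original answer; it assumes Python's standard `date.isocalendar()`, which follows the same ISO 8601 week rules, and the helper name `iso_week_label` is made up for illustration.

```python
from datetime import date


def iso_week_label(d: date) -> str:
    """Format a date as an ISO 8601 week date, e.g. 2003-W01-3.

    iso_week_label is a hypothetical helper name used only in this sketch.
    """
    # isocalendar() yields (week-based year, week number, weekday with Monday = 1)
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return f"{iso_year}-W{iso_week:02d}-{iso_weekday}"


print(iso_week_label(date(2002, 12, 30)))  # 2003-W01-1
print(iso_week_label(date(2003, 1, 1)))    # 2003-W01-3
```

Note that the first date lies in calendar year 2002 but in week-based year 2003, which is why the week number alone is ambiguous without the week-based year.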

So, in your case, 29-12-2008 is shown as belonging to week 1 because Dec 29th 2008 falls in a week with three days in 2008 and four days in 2009. According to the rule above, that is week 1 of 2009. This has nothing to do with leap years. For example, try indexing 31-12-2009 and 31-12-2015: both give week 53, and neither year is a leap year.
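To check which years actually have 53 ISO weeks, here is a small Python sketch (an illustration added here, not part of the original answer). It relies on the fact that 28 December always falls in the last ISO week of its own week-based year; the function name `iso_weeks_in_year` is hypothetical.

```python
import calendar
from datetime import date


def iso_weeks_in_year(year: int) -> int:
    """Return the number of ISO 8601 weeks (52 or 53) in a week-based year."""
    # 28 December is always in the last ISO week of its own week-based year,
    # so its week number equals the number of weeks in that year.
    return date(year, 12, 28).isocalendar()[1]


for year in (2008, 2009, 2013, 2015):
    print(year, calendar.isleap(year), iso_weeks_in_year(year))
# 2008 True 52
# 2009 False 53
# 2013 False 52
# 2015 False 53
```

So 2008, although a leap year, has only 52 ISO weeks. By the same rule, the two "1" keys seen for 2013 would be the week starting 2012-12-31 (week 1 of 2013) and the week starting 2013-12-30 (week 1 of 2014).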

To see these things better, I suggest the following format for your aggregation: "format": "x-w---yyyy-MM-dd". In Joda-Time patterns, x is the week-based year and w is the week of that year, so each key shows both alongside the calendar date:

{
  "size": 0,
  "aggs": {
    "activities_per_year": {
      "date_histogram": {
        "field": "start",
        "interval": "1y",
        "format": "yyyy"
      },
      "aggs": {
        "activities_per_week": {
          "date_histogram": {
            "field": "start",
            "interval": "week",
            "format": "x-w---yyyy-MM-dd"
          }
        }
      }
    }
  }
}
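If the goal is a dictionary keyed by week, one option is to build the key on the client from the numeric bucket "key" instead of "key_as_string". The sketch below is an assumption-laden illustration, not part of the original answer: it assumes the bucket keys are epoch milliseconds in UTC (as in the response shown in the question), and `bucket_key_to_week` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def bucket_key_to_week(key_ms: int) -> tuple:
    """Map a date_histogram bucket key (assumed epoch millis, UTC)
    to an unambiguous (week-based year, week number) pair."""
    day = datetime.fromtimestamp(key_ms / 1000, tz=timezone.utc).date()
    iso_year, iso_week, _ = day.isocalendar()
    return iso_year, iso_week


# Three of the weekly buckets from the 2008 result in the question.
buckets = [
    {"key": 1199059200000, "doc_count": 6},
    {"key": 1229904000000, "doc_count": 7},
    {"key": 1230508800000, "doc_count": 1},
]

counts = {bucket_key_to_week(b["key"]): b["doc_count"] for b in buckets}
print(counts)  # {(2008, 1): 6, (2008, 52): 7, (2009, 1): 1}
```

Keying on the pair rather than on the week number alone keeps the first week of 2008 and the first week of 2009 from colliding in the dictionary.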
