How to filter current date minus N days in Elasticsearch aggregation query?


Problem description

I am trying to apply a filter to an aggregation result in an Elasticsearch query. Basically, I have millions of documents with the following format:

{
  "useraccountid": 123456,
  "purchases_history" : {
    "last_updated" : "Sat Apr 27 13:41:46 UTC 2019",
    "purchases" : [
      {
        "purchase_id" : 19854284,
        "purchase_date" : "Jan 11, 2017 7:53:35 PM"
      },
      {
        "purchase_id" : 19854285,
        "purchase_date" : "Jan 12, 2017 7:53:35 PM"
      },
      {
        "purchase_id" : 19854286,
        "purchase_date" : "Jan 13, 2017 7:53:35 PM"
      }
    ]
  }
}

First of all, I need to do something like SELECT useraccountid, max(purchases_history.purchases.purchase_date) FROM my_index GROUP BY useraccountid. I did that with the following query, which also adds a pipeline filter acting as a HAVING max(purchases_history.purchases.purchase_date) < getdate() - 365 clause, so that I only get the documents (i.e. user accounts) whose last purchase was more than one year ago.

GET my_personal_index/_search
{
  "aggs": {
    "buckets": {
      "composite": {
        "size": 1000,
        "sources": [
          {
            "user_account_id": {
              "terms": {
                "field": "useraccountid"
              }
            }
          }
        ]
      },
      "aggs": {
        "max_purchase_date": {
          "max": {
            "field": "purchases_history.purchases.purchase_date"
          }
        },
        "max_purchase_date_filter": {
          "bucket_selector": {
            "buckets_path": { 
              "maxPurchaseDate": "max_purchase_date" 
            },
            "script": {
              "lang": "painless",
              "source": "long now = new Date().getTime(); params.maxPurchaseDate < now - 365"
            }
          }
        }
      }
    }
  }
}

When I run this query I get no errors or warnings, but the result makes no sense. I believe I may be comparing "bananas to apples" when I do "long now = new Date().getTime(); params.maxPurchaseDate < now - 365". As I am not a programmer or a very technical person, I don't know how to move forward and make this filter the aggregated date properly.

Here is the mapping of the date container block:

"purchases_history": {
  "properties": {
    "purchases": {
      "properties": {
        "purchase_date": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss z yyyy||MMM d, yyyy HH:mm:ss a"
        },
        "purchase_id": {
          "type": "long"
        }
      }
    }
  }
}

Any suggestions? Thanks.

Recommended answer

The simplest fix that comes to mind is to change your script to

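"source": "long now = new Date().getTime(); params.maxPurchaseDate > now - 365 * 86400000L"

where 86400000 is the number of milliseconds in a day. The max aggregation returns the date as epoch milliseconds, so the original now - 365 subtracted only 365 milliseconds. Note that the > comparison keeps buckets whose latest purchase falls within the last 365 days; to match the HAVING ... < getdate() - 365 clause from the question, use < instead.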
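The arithmetic behind the constant can be spelled out as a short Painless sketch. The variable names millisPerDay and offsetMillis are illustrative only and do not come from the original answer:

```painless
// 24 h * 60 min * 60 s * 1000 ms = 86400000 ms per day
long millisPerDay = 24L * 60 * 60 * 1000;

// 365 days = 31536000000 ms. This exceeds the 32-bit int maximum
// (2147483647), which is why the multiplication is done on long values
// (the L suffix in the script above).
long offsetMillis = 365 * millisPerDay;
```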

Note, however, that according to https://www.elastic.co/guide/en/elasticsearch/painless/master/painless-datetime.html:

Datetime Now

Under most Painless contexts the current datetime, now, is not supported. There are two primary reasons for this. The first is scripts are often run once per document, so each time the script is run a different now is returned. The second is scripts are often run in a distributed fashion without a way to appropriately synchronize now. Instead, pass in a user-defined parameter with either a string datetime or numeric datetime for now. A numeric datetime is preferred as there is no need to parse it for comparison.
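Following that advice, the bucket_selector from the question could receive the current time as a script parameter instead of calling new Date(). The fragment below is a minimal sketch, not part of the original answer: the parameter names now and days are made up for illustration, and the value of now is a placeholder epoch-millisecond timestamp that the client would compute at query time. It uses < so that only accounts whose latest purchase is older than the cutoff are kept, as the question asks.

```json
"max_purchase_date_filter": {
  "bucket_selector": {
    "buckets_path": {
      "maxPurchaseDate": "max_purchase_date"
    },
    "script": {
      "lang": "painless",
      "source": "params.maxPurchaseDate < params.now - params.days * 86400000L",
      "params": {
        "now": 1556372506000,
        "days": 365
      }
    }
  }
}
```

Because the cutoff is fixed by the client, every bucket is compared against the same instant, and changing N only requires changing the days parameter.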

Update

More generic scripts:

long nowMillis = new Date().getTime();
Instant instant = Instant.ofEpochMilli(nowMillis);
ZonedDateTime now = ZonedDateTime.ofInstant(instant, ZoneId.of('Z')); // if you need zones
def limit = now.plusDays(-8);
return params.maxPurchaseDate > limit.toInstant().toEpochMilli();

Date currentDate = new Date();
Calendar c = Calendar.getInstance();
c.setTime(currentDate);
c.add(Calendar.DATE, -7);
return params.maxPurchaseDate > c.getTimeInMillis();

or some other Java solution might work as well.
