Elasticsearch - query to get latest version of records from a flattened structure

Problem description

I have a scenario where I'd like to return the latest de-normalized data from an Elasticsearch index, grouped by a certain key value (TradeRef in the example below).

The sample below gives a better picture of the data persisted in the index:

{"Row": "1", "TradeRef": "A", "TradeRefDate": "2019-01-01 13:00", "TradeRefId": "FFF", "MessageId": "XXX", "MessageStatus": "S-Open"}, 
{"Row": "2", "TradeRef": "B", "TradeRefDate": "2019-01-01 13:00", "TradeRefId": "GGG", "MessageId": "YYY", "MessageStatus": "P-Open"},
{"Row": "3", "TradeRef": "C", "TradeRefDate": "2019-01-01 13:00", "TradeRefId": "HHH", "MessageId": "ZZZ", "MessageStatus": "Q-Open"},
{"Row": "4", "TradeRef": "A", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "III", "MessageId": "AAA", "MessageStatus": "R-Open"},
{"Row": "5", "TradeRef": "B", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "JJJ", "MessageId": "BBB", "MessageStatus": "T-Open"},
{"Row": "6", "TradeRef": "A", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "III", "MessageId": "CCC", "MessageStatus": "R-Open"},
{"Row": "7", "TradeRef": "B", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "JJJ", "MessageId": "DDD", "MessageStatus": "T-Open"}

I'd like my query to return the following results, where rows 1 and 2 are eliminated because they reference TradeRefs 'A' and 'B' with an older TradeRefDate (2019-01-01 13:00).

More recent rows in the index contain the same TradeRefs 'A' and 'B' with a more recent TradeRefDate (2019-01-01 14:00):

{"Row": "3", "TradeRef": "C", "TradeRefDate": "2019-01-01 13:00", "TradeRefId": "HHH", "MessageId": "ZZZ", "MessageStatus": "Q-Open"},
{"Row": "4", "TradeRef": "A", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "III", "MessageId": "AAA", "MessageStatus": "R-Open"},
{"Row": "5", "TradeRef": "B", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "JJJ", "MessageId": "BBB", "MessageStatus": "T-Open"},
{"Row": "6", "TradeRef": "A", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "III", "MessageId": "CCC", "MessageStatus": "R-Open"},
{"Row": "7", "TradeRef": "B", "TradeRefDate": "2019-01-01 14:00", "TradeRefId": "JJJ", "MessageId": "DDD", "MessageStatus": "T-Open"}

Any assistance will be appreciated. I have tried the query below, but it gives me only one row per TradeRef instead of all the matching records associated with the latest TradeRefDate value:

GET /flattened_index_v1/_search
{
  "from": 0,
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "TradeRefDate": {
              "gte": "2018-09-01T00:00:00",
              "lte": "2019-10-26T00:00:00"
            }
          }
        },
        {
          "exists": {
            "field": "MessageId"
          }
        }
      ]
    }
  },
  "size": 0,
  "aggs": {
    "grp_by_trade_ref": {
      "terms": {
        "field": "TradeRef.keyword",
        "size": 1000
      },
      "aggs": {
        "latest_trecs": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "TradeRefDate": {
                  "order": "desc"
                }
              }
            ],
            "_source": {"includes": ["TradeRef", "TradeRefId", "MessageId", "MessageStatus", "TradeRefDate"]}
          }
        }
      }
    }
  }
}

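Recommended answer

  1. Run a terms aggregation on the key (TradeRef.keyword).
  2. Under each key, run a terms aggregation on the date (TradeRefDate):
     i. order the date buckets in descending order and keep only the top 1;
     ii. return the documents of that bucket with top_hits.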

GET index22/_search
{
  "size": 0,
  "aggs": {
    "TradeRef": {
      "terms": {
        "field": "TradeRef.keyword",
        "size": 10
      },
      "aggs": {
        "RefDate": {
          "terms": {
            "field": "TradeRefDate",
            "order": {
              "_term": "desc"
            },
            "size": 1
          },
          "aggs": {
            "TopDocuments": {
              "top_hits": {
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
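
The nested buckets returned by this request can be flattened back into plain records on the client. The sketch below assumes the aggregation names used above (`TradeRef`, `RefDate`, `TopDocuments`) and a response already parsed into a Python dict. `flatten_latest` and `SAMPLE` are hypothetical names for illustration, not part of any Elasticsearch client, and `SAMPLE` is a hand-trimmed stand-in for a real response.

```python
def flatten_latest(response, outer="TradeRef", inner="RefDate", hits="TopDocuments"):
    """Collect the _source of every top hit under each key's latest-date bucket."""
    records = []
    for ref_bucket in response["aggregations"][outer]["buckets"]:
        # The inner terms aggregation has size 1, so this loop sees one bucket.
        for date_bucket in ref_bucket[inner]["buckets"]:
            for hit in date_bucket[hits]["hits"]["hits"]:
                records.append(hit["_source"])
    return records


# Hand-trimmed stand-in for a parsed response (assumption, not real output).
SAMPLE = {
    "aggregations": {
        "TradeRef": {
            "buckets": [
                {
                    "key": "A",
                    "RefDate": {
                        "buckets": [
                            {
                                "TopDocuments": {
                                    "hits": {
                                        "hits": [
                                            {"_source": {"Row": "4", "TradeRef": "A"}},
                                            {"_source": {"Row": "6", "TradeRef": "A"}},
                                        ]
                                    }
                                }
                            }
                        ]
                    },
                }
            ]
        }
    }
}

if __name__ == "__main__":
    print(flatten_latest(SAMPLE))
```

Keep in mind that top_hits returns at most "size" documents per bucket and the outer terms aggregation returns at most "size" keys, so with the values above (10 and 10) larger data sets would be truncated.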

EDIT 1: You can use a composite aggregation (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html).

With a composite aggregation you can paginate serially using after_key: you can fetch n records and then the next n records, but you cannot jump from page 1 to page 3. The composite "size" is the page size (2 in the request below).

GET index22/_search
{
  "size": 0,
  "aggs": {
    "pagination": {
      "composite": {
        "size": 2,
        "sources": [
          {
            "TradeRef": {
              "terms": {
                "field": "TradeRef.keyword"
              }
            }
          }
        ]
      },
      "aggs": {
        "RefDate": {
          "terms": {
            "field": "TradeRefDate",
            "order": {
              "_term": "desc"
            },
            "size": 1
          },
          "aggs": {
            "TopDocuments": {
              "top_hits": {
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}

Response (pass the returned after_key as the "after" parameter of the composite aggregation to fetch the next set of records):

"aggregations" : {
    "pagination" : {
      "after_key" : {
        "TradeRef" : "B"
      },
      "buckets" : [
        {
          "key" : {
            "TradeRef" : "A"
          },
          "doc_count" : 3,
          "RefDate" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 1,
            "buckets" : [
              {
                "key" : 50460000,
                "key_as_string" : "1970-01-01 14:00",
                "doc_count" : 2,
                "TopDocuments" : {
                  "hits" : {
                    "total" : {
                      "value" : 2,
                      "relation" : "eq"
                    },
                    "max_score" : 1.0,
                    "hits" : [
                      {
                        "_index" : "index22",
                        "_type" : "_doc",
                        "_id" : "IkBnyG0BwSpwFwW4UeB7",
                        "_score" : 1.0,
                        "_source" : {
                          "Row" : "4",
                          "TradeRef" : "A",
                          "TradeRefDate" : "2019-01-01 14:00",
                          "TradeRefId" : "III",
                          "MessageId" : "AAA",
                          "MessageStatus" : "R-Open"
                        }
                      },
                      {
                        "_index" : "index22",
                        "_type" : "_doc",
                        "_id" : "JEBnyG0BwSpwFwW4ceBs",
                        "_score" : 1.0,
                        "_source" : {
                          "Row" : "6",
                          "TradeRef" : "A",
                          "TradeRefDate" : "2019-01-01 14:00",
                          "TradeRefId" : "III",
                          "MessageId" : "CCC",
                          "MessageStatus" : "R-Open"
                        }
                      }
                    ]
                  }
                }
              }
            ]
          }
        },
        {
          "key" : {
            "TradeRef" : "B"
          },
          "doc_count" : 3,
          "RefDate" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 1,
            "buckets" : [
              {
                "key" : 50460000,
                "key_as_string" : "1970-01-01 14:00",
                "doc_count" : 2,
                "TopDocuments" : {
                  "hits" : {
                    "total" : {
                      "value" : 2,
                      "relation" : "eq"
                    },
                    "max_score" : 1.0,
                    "hits" : [
                      {
                        "_index" : "index22",
                        "_type" : "_doc",
                        "_id" : "I0BnyG0BwSpwFwW4V-DW",
                        "_score" : 1.0,
                        "_source" : {
                          "Row" : "5",
                          "TradeRef" : "B",
                          "TradeRefDate" : "2019-01-01 14:00",
                          "TradeRefId" : "JJJ",
                          "MessageId" : "BBB",
                          "MessageStatus" : "T-Open"
                        }
                      },
                      {
                        "_index" : "index22",
                        "_type" : "_doc",
                        "_id" : "JUBnyG0BwSpwFwW4h-Cq",
                        "_score" : 1.0,
                        "_source" : {
                          "Row" : "7",
                          "TradeRef" : "B",
                          "TradeRefDate" : "2019-01-01 14:00",
                          "TradeRefId" : "JJJ",
                          "MessageId" : "DDD",
                          "MessageStatus" : "T-Open"
                        }
                      }
                    ]
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
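
The serial pagination described above can be sketched as a small Python loop that keeps feeding the returned after_key back into the composite aggregation's "after" parameter until a page comes back empty. This is only a sketch under assumptions: `iter_composite_buckets` is a hypothetical helper, `search` stands for whatever function sends a request body and returns the parsed response, and `fake_search` with its `PAGES` table is a stub that imitates three TradeRefs so the loop can run without a cluster.

```python
import copy


def iter_composite_buckets(search, body, agg_name="pagination"):
    """Yield every bucket of a composite aggregation, page by page."""
    body = copy.deepcopy(body)  # do not mutate the caller's request
    while True:
        agg = search(body)["aggregations"][agg_name]
        buckets = agg["buckets"]
        if not buckets:
            return  # an empty page means there is nothing left to read
        yield from buckets
        after_key = agg.get("after_key")
        if after_key is None:
            return
        # Resume after the last key of the page that was just read.
        body["aggs"][agg_name]["composite"]["after"] = after_key


# Request body with the same shape as the composite request above.
REQUEST = {
    "size": 0,
    "aggs": {
        "pagination": {
            "composite": {
                "size": 2,
                "sources": [{"TradeRef": {"terms": {"field": "TradeRef.keyword"}}}],
            }
        }
    },
}

# Stub data (assumption): maps the "after" key to (keys on that page, last key).
PAGES = {None: (["A", "B"], "B"), "B": (["C"], "C"), "C": ([], None)}


def fake_search(body):
    """Stand-in for a real search call; serves pages from PAGES."""
    after = body["aggs"]["pagination"]["composite"].get("after")
    refs, last = PAGES[after["TradeRef"] if after else None]
    agg = {"buckets": [{"key": {"TradeRef": ref}} for ref in refs]}
    if last is not None:
        agg["after_key"] = {"TradeRef": last}
    return {"aggregations": {"pagination": agg}}


if __name__ == "__main__":
    keys = [b["key"]["TradeRef"] for b in iter_composite_buckets(fake_search, REQUEST)]
    print(keys)
```

Because each page depends on the key returned by the previous one, the loop can only move forward, which matches the "no jumping from page 1 to page 3" limitation.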
