Elasticsearch terms aggregation and querying

Question

I have two types of log messages:

Jul 23 09:24:16 rrr mrr-core[222]: Aweg3AOMTs_1563866656871111.mt processMTMessage() #12798 realtime: 5.684 ms

Jul 23 09:24:18 rrr mrr-core[2222]: Aweg3AOMTs_1563866656871111.0.dn processDN() #7750 realtime: 1.382 ms

The first is a "sent" message, and the second confirms that the message was delivered.

The difference between them is the suffix, which I have separated from the "id" and can query.

These messages are parsed and stored in Elasticsearch in the following format:

messageId: Aweg3AOMTs_1563866656871111.0.dn
text: Aweg3AOMTs
num1: 1563866656871111
num2: 0
suffix: mt/dn
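
For illustration only, the split described above could be sketched in Python as follows. The regular expression and the function name parse_message_id are assumptions made for this example, not the parsing pipeline actually used to index the logs.

```python
import re

# Hypothetical pattern for this example: <text>_<num1>[.<num2>].<suffix>
MESSAGE_ID = re.compile(
    r"^(?P<text>[^_]+)_(?P<num1>\d+)(?:\.(?P<num2>\d+))?\.(?P<suffix>mt|dn)$"
)


def parse_message_id(message_id):
    """Split a messageId into the fields listed above."""
    match = MESSAGE_ID.match(message_id)
    if match is None:
        raise ValueError(f"unrecognised messageId: {message_id!r}")
    fields = match.groupdict()
    return {
        "messageId": message_id,
        "text": fields["text"],
        "num1": int(fields["num1"]),
        # num2 is absent on the .mt message shown above
        "num2": None if fields["num2"] is None else int(fields["num2"]),
        "suffix": fields["suffix"],
    }
```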

I would like to find out which messages were successfully delivered and which weren't. I am a complete beginner with Elasticsearch, so I'm really struggling.

I'm trying terms aggregations at the moment, but all I have managed so far is this query:

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "messageId.keyword",
        "include": ".*mt*."
      }
    }
  } 
}

This shows me the sent messages. I don't know how to add a filter or clause so that it returns only the messages that have both an mt and a dn suffix.

If anyone has an idea, I'd be really thankful :))

Recommended answer

Running the terms aggregation on messageId.keyword will not help much, because every messageId value is distinct ('Aweg3AOMTs_1563866656871111.0.dn' is not the same as 'Aweg3AOMTs_1563866656871111.mt'), so each bucket would hold a single document.

From the structure of the documents, I think you would be better off running the terms aggregation on num1, which is the part shared by the .mt and .dn messages. That aggregation gives you the number of documents for each unique num1: a message that has both a request and a response gets a count of 2, while a message with only a request gets a count of 1.
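
To make the counting idea concrete, here is a minimal client-side sketch in Python that does the same grouping on a handful of already parsed documents. It only illustrates the logic and is not how Elasticsearch computes the aggregation; the function names and the sample documents are invented for this example.

```python
from collections import defaultdict


def group_suffixes_by_num1(docs):
    """Collect the set of suffixes seen for each num1 value."""
    suffixes = defaultdict(set)
    for doc in docs:
        suffixes[doc["num1"]].add(doc["suffix"])
    return dict(suffixes)


def split_delivered(docs):
    """Return (delivered, pending) lists of num1 values."""
    grouped = group_suffixes_by_num1(docs)
    # Delivered: both the sent (mt) and the confirmation (dn) message exist.
    delivered = sorted(n for n, s in grouped.items() if {"mt", "dn"} <= s)
    # Pending: sent, but no confirmation seen.
    pending = sorted(n for n, s in grouped.items() if "mt" in s and "dn" not in s)
    return delivered, pending


# Invented sample data: message 1 was confirmed, message 2 was not.
sample_docs = [
    {"num1": 1, "suffix": "mt"},
    {"num1": 1, "suffix": "dn"},
    {"num1": 2, "suffix": "mt"},
]
delivered, pending = split_delivered(sample_docs)
```

Checking the set of suffixes rather than the raw count is slightly more robust, since a duplicated mt line would also produce a count of 2.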

If you also want to see the number itself, you can add a nested aggregation inside, such as a top_hits aggregation with size 1, which would display the num1 field. The query below instead adds a cardinality aggregation on suffix and a bucket_selector that keeps only the buckets whose document count is 2:

GET /my_index3/_search
{
  "size": 0,
  "aggs": {
    "num1": {
      "terms": {
        "field": "num1",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "count_of_distinct_suffix": {
          "cardinality": {
            "field": "suffix"
          }
        },
        "filter_count_is_2": {
          "bucket_selector": {
            "buckets_path": {
              "the_doc_count": "_count"
            },
            "script": "params.the_doc_count == 2"
          }
        }
      }
    }
  }
}
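
If you want both lists (delivered and not delivered) from a single request, one option is to leave the bucket_selector out and split the buckets on the client. The following Python sketch assumes the response has the usual terms aggregation shape, with the cardinality result nested in each bucket under count_of_distinct_suffix. The function name and the sample response are hypothetical and hand-written, not real output.

```python
def classify_buckets(response, agg_name="num1",
                     suffix_agg="count_of_distinct_suffix"):
    """Split terms buckets into delivered and undelivered keys.

    A bucket counts as delivered when it holds at least two distinct
    suffixes (mt and dn); otherwise it is treated as undelivered.
    """
    delivered, undelivered = [], []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        distinct = bucket[suffix_agg]["value"]
        if distinct >= 2:
            delivered.append(bucket["key"])
        else:
            undelivered.append(bucket["key"])
    return delivered, undelivered


# Hand-written sample in the shape of a search response (not real output).
sample_response = {
    "aggregations": {
        "num1": {
            "buckets": [
                {"key": 1563866656871111, "doc_count": 2,
                 "count_of_distinct_suffix": {"value": 2}},
                {"key": 1563866656872222, "doc_count": 1,
                 "count_of_distinct_suffix": {"value": 1}},
            ]
        }
    }
}
delivered_keys, undelivered_keys = classify_buckets(sample_response)
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default, so the "size" parameter of the terms aggregation would need to be raised to cover more messages.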
