Kibana - How to extract fields from existing Kubernetes logs


Question

I have a sort of ELK stack, with fluentd instead of logstash, running as a DaemonSet on a Kubernetes cluster and sending all logs from all containers, in logstash format, to an Elasticsearch server.

Out of the many containers running on the Kubernetes cluster some are nginx containers which output logs of the following format:

121.29.251.188 - [16/Feb/2017:09:31:35 +0000] host="subdomain.site.com" req="GET /data/schedule/update?date=2017-03-01&type=monthly&blocked=0 HTTP/1.1" status=200 body_bytes=4433 referer="https://subdomain.site.com/schedule/2589959/edit?location=23092&return=monthly" user_agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0" time=0.130 hostname=webapp-3188232752-ly36o

The fields visible in Kibana are as per this screenshot:

Is it possible to extract fields from this type of log after it was indexed?

The fluentd collector is configured with the following source, which handles all containers, so enforcing a format at this stage is not possible due to the very different outputs from different containers:

<source>
  type tail
  path /var/log/containers/*.log
  pos_file /var/log/es-containers.log.pos
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  tag kubernetes.*
  format json
  read_from_head true
</source>
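The `format json` in this source matters later: Docker writes each container log line under /var/log/containers/ as a single JSON object, so the nginx output arrives in Elasticsearch as one unparsed string under the `log` key. A minimal Python sketch of that wrapping (the content shown here is illustrative, not taken from a real log file):

```python
import json

# Illustrative: one line from a /var/log/containers/*.log file as the
# Docker JSON logging driver writes it; the whole nginx log line is a
# single string under the "log" key.
raw = ('{"log": "121.29.251.188 - [16/Feb/2017:09:31:35 +0000] '
       'host=\\"subdomain.site.com\\" status=200 ...\\n", '
       '"stream": "stdout", "time": "2017-02-16T09:31:35.123456789Z"}')

record = json.loads(raw)
print(record["stream"])  # the container stream the line came from
print(record["log"])     # still one flat string; nothing is extracted yet
```

This is why the question arises at all: the source stage parses only the outer JSON envelope, leaving everything inside `log` as a single opaque field.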

In an ideal situation, I would like to enrich the fields visible in the screenshot above with the meta-fields in the "log" field, like "host", "req", "status" etc.

Answer

After a few days of research and getting accustomed to the EFK stack, I arrived at an EFK-specific solution, as opposed to the one in Darth_Vader's answer, which is only possible on the ELK stack.

So to summarize: I am using Fluentd instead of Logstash, so a grok solution would also work if you install the Fluentd Grok Plugin, which I decided not to do.

As it turns out, Fluentd has its own field extraction functionality through the use of parser filters. To solve the problem in my question, right before the <match **> line, so after the log line object was already enriched with kubernetes metadata fields and labels, I added the following:

<filter kubernetes.var.log.containers.webapp-**.log>
  type parser
  key_name log
  reserve_data yes
  format /^(?<ip>[^-]*) - \[(?<datetime>[^\]]*)\] host="(?<hostname>[^"]*)" req="(?<method>[^ ]*) (?<uri>[^ ]*) (?<http_version>[^"]*)" status=(?<status_code>[^ ]*) body_bytes=(?<body_bytes>[^ ]*) referer="(?<referer>[^"]*)" user_agent="(?<user_agent>[^"]*)" time=(?<req_time>[^ ]*)/
</filter>

Explanation:

<filter kubernetes.var.log.containers.webapp-**.log> - apply the block on all the lines matching this label; in my case the containers of the web server component are called webapp-{something}

type parser - tells fluentd to apply a parser filter

key_name log - apply the pattern only on the log property of the log line, not the whole line, which is a JSON string

reserve_data yes - very important: if not specified, the whole log line object is replaced by only the properties extracted by format, so any properties you already have, like the ones added by the kubernetes_metadata filter, are dropped when the reserve_data option is missing

format - a regex that is applied on the value of the log key to extract named properties
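The regex can be sanity-checked outside Fluentd before deploying the filter. A quick Python sketch, using the same pattern with Fluentd's `(?<name>)` groups rewritten as Python's `(?P<name>)`, run against the sample nginx line from the question:

```python
import re

# Same pattern as in the fluentd filter, with (?<name>) changed to
# Python's (?P<name>) named-group syntax.
pattern = re.compile(
    r'^(?P<ip>[^-]*) - \[(?P<datetime>[^\]]*)\] host="(?P<hostname>[^"]*)" '
    r'req="(?P<method>[^ ]*) (?P<uri>[^ ]*) (?P<http_version>[^"]*)" '
    r'status=(?P<status_code>[^ ]*) body_bytes=(?P<body_bytes>[^ ]*) '
    r'referer="(?P<referer>[^"]*)" user_agent="(?P<user_agent>[^"]*)" '
    r'time=(?P<req_time>[^ ]*)'
)

line = ('121.29.251.188 - [16/Feb/2017:09:31:35 +0000] host="subdomain.site.com" '
        'req="GET /data/schedule/update?date=2017-03-01&type=monthly&blocked=0 HTTP/1.1" '
        'status=200 body_bytes=4433 '
        'referer="https://subdomain.site.com/schedule/2589959/edit?location=23092&return=monthly" '
        'user_agent="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0" '
        'time=0.130 hostname=webapp-3188232752-ly36o')

# These are the named properties the filter would merge into the record.
fields = pattern.match(line).groupdict()
print(fields["method"], fields["status_code"], fields["req_time"])  # GET 200 0.130
```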

Please note that I am using Fluentd 0.12, so this syntax is not fully compatible with the newer 0.14 syntax, but the principle will work with minor tweaks to the parser declaration.
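On newer Fluentd versions, the same filter is written with `@type` and a nested `<parse>` section; a sketch of the adjusted declaration, assuming the current filter_parser plugin syntax (not taken from the original answer):

```
<filter kubernetes.var.log.containers.webapp-**.log>
  @type parser
  key_name log
  reserve_data true
  <parse>
    @type regexp
    expression /^(?<ip>[^-]*) - \[(?<datetime>[^\]]*)\] host="(?<hostname>[^"]*)" req="(?<method>[^ ]*) (?<uri>[^ ]*) (?<http_version>[^"]*)" status=(?<status_code>[^ ]*) body_bytes=(?<body_bytes>[^ ]*) referer="(?<referer>[^"]*)" user_agent="(?<user_agent>[^"]*)" time=(?<req_time>[^ ]*)/
  </parse>
</filter>
```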

