logstash parse log field


Problem description

I am trying to parse the @message field from a Postfix log and extract it into multiple fields.

Message:

<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)

Logstash output:

{
  "@source": "syslog://192.244.100.42/",
  "@tags": [
    "_grokparsefailure"
  ],
  "@fields": {
    "priority": 13,
    "severity": 5,
    "facility": 1,
    "facility_label": "user-level",
    "severity_label": "Notice"
  },
  "@timestamp": "2013-09-17T17:12:06.958Z",
  "@source_host": "192.244.100.42",
  "@source_path": "/",
  "@message": "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)",
  "@type": "syslog"
}

I've tried to use the grok parser, but the data remains in the @message field. I want to use the syslog parser together with regular expressions.

What steps should I follow to parse the @message field?

Recommended answer

While we're now at Logstash 5.x, the concepts behind grok remain the same.

Unfortunately, Postfix has some really awkward logging formats, but a handful of people have written grok patterns that account for most of the data you'll end up seeing in Postfix logs. I will only use a few of them.

The key is to identify the components of the message. If a component conforms to a standard or is widely used, a grok pattern has likely already been written for it (e.g. syslog). For components you don't recognize, you can write your own grok pattern.

Let's break the message into pieces:

  • <22>Sep 17 19:12:14 postfix/smtp[18852]:: This is very nearly RFC 5424 syslog, but it is missing the ver (version) field.

  • SYSLOG5424PRI: Priority value
  • SYSLOGTIMESTAMP: Self-explanatory
  • SYSLOGPROG: The application's name

  • 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0): This is Postfix's domain-specific information.

  • POSTFIX_KEYVALUE_DATA: Used as a component of another pattern to match key=value data (such as relay=..., delay=...).
  • POSTFIX_QUEUEID: Self-explanatory
  • POSTFIX_KEYVALUE: Combines POSTFIX_QUEUEID and POSTFIX_KEYVALUE_DATA.
  • POSTFIX_SMTP_DELIVERY: Uses POSTFIX_KEYVALUE to identify the above information up to status=, after which comes the SMTP response.
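As a rough sketch, a custom patterns file holding the Postfix patterns named above might look like the following. The regexes here are simplified illustrative assumptions, not the far more thorough community Postfix pattern definitions; only the field names mirror the output shown later.

```
# /etc/logstash/patterns/postfix -- simplified sketch, not the full
# community Postfix grok patterns
POSTFIX_QUEUEID [0-9A-F]{6,}
POSTFIX_KEYVALUE_DATA [\w-]+=[^;]*
POSTFIX_KEYVALUE %{POSTFIX_QUEUEID:postfix_queueid}: %{POSTFIX_KEYVALUE_DATA:postfix_keyvalue_data}
POSTFIX_SMTP_DELIVERY %{POSTFIX_KEYVALUE} \(%{GREEDYDATA:postfix_smtp_response}\)
```

Each line defines a named pattern; later patterns compose earlier ones with the %{NAME:field} syntax, which is what produces the named fields in the parsed event.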

Filter:

filter {
    if [type] == "postfix" {
        grok {
            patterns_dir   => "/etc/logstash/patterns"
            match => { "message" => "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP} %{SYSLOGPROG}: %{POSTFIX_SMTP_DELIVERY}" }
        }
    }
}

This is where you would save the Postfix patterns referenced by patterns_dir.

Output:

{
    "postfix_queueid" => "28D40A036B",
    "@timestamp" => 2017-02-23T08:15:32.546Z,
    "postfix_smtp_response" => "250 2.0.0 Ok: queued as 9030A15D0",
    "port" => 50228,
    "postfix_keyvalue_data" => "to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent",
    "syslog5424_pri" => "22",
    "@version" => "1",
    "host" => "10.0.2.2",
    "pid" => "18852",
    "program" => "postfix/smtp",
    "message" => "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)"
}
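To see how the Postfix-specific part of the message decomposes, here is a minimal Python sketch that emulates the three custom patterns with simplified regexes. These regexes are illustrative assumptions, much looser than the real grok patterns; only the capture-group names match the fields in the output above.

```python
import re

# Hypothetical simplifications of the grok patterns named above -- the
# community Postfix patterns are considerably more thorough.
POSTFIX_QUEUEID = r"(?P<postfix_queueid>[0-9A-F]{6,})"
POSTFIX_KEYVALUE_DATA = r"(?P<postfix_keyvalue_data>\w+=[^,]*(?:, \w+=[^,]*)*)"
POSTFIX_SMTP_RESPONSE = r"\((?P<postfix_smtp_response>[^)]+)\)"

# Queue ID, then key=value pairs, then the parenthesized SMTP response.
PATTERN = re.compile(
    POSTFIX_QUEUEID + ": " + POSTFIX_KEYVALUE_DATA + " " + POSTFIX_SMTP_RESPONSE
)

line = ("28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, "
        "delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent "
        "(250 2.0.0 Ok: queued as 9030A15D0)")

fields = PATTERN.match(line).groupdict()
print(fields["postfix_queueid"])        # 28D40A036B
print(fields["postfix_smtp_response"])  # 250 2.0.0 Ok: queued as 9030A15D0
```

This is the same decomposition grok performs: the key=value run stops just before the SMTP response, which is why postfix_keyvalue_data in the output above ends at status=sent.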

All of the grok patterns above are either common ones or were written by someone else for this purpose. Luckily, many people use Postfix, though few have written patterns for it, as it is fairly complex.

Once that is established, you can get pretty crafty with your Logstash configuration.
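For instance, you could go further and split postfix_keyvalue_data into individual event fields with Logstash's kv filter. A sketch, assuming the field names produced by the grok stage above:

```
filter {
  kv {
    source      => "postfix_keyvalue_data"
    field_split => ", "
    value_split => "="
  }
}
```

With this in place, each pair such as delay=0.13 or dsn=2.0.0 becomes its own field, which makes the values searchable and aggregatable downstream.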
