logstash parse log field
Problem description
I am trying to parse the @message field from a Postfix log and extract it into multiple fields.
The message:
<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)
Logstash output:
{
  "@source": "syslog://192.244.100.42/",
  "@tags": [
    "_grokparsefailure"
  ],
  "@fields": {
    "priority": 13,
    "severity": 5,
    "facility": 1,
    "facility_label": "user-level",
    "severity_label": "Notice"
  },
  "@timestamp": "2013-09-17T17:12:06.958Z",
  "@source_host": "192.244.100.42",
  "@source_path": "/",
  "@message": "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)",
  "@type": "syslog"
}
I've tried to use the grok parser, but the data remains in the @message field. I want to use the syslog parser with regular expressions.
What steps do I follow to parse the @message field?
Accepted answer
While we're now at Logstash 5.x, the concepts of grok remain the same.
Unfortunately, Postfix logging is really awkward to parse, but a handful of people have written grok patterns that account for most of the data you'll end up seeing in Postfix logs. I will only use a few of them.
The key is to identify the components of the message. If a component conforms to a standard or is widely used, a grok filter has likely already been written for it (e.g. syslog); for components you don't recognize, you can write your own filter with grok.
Let's break the message into pieces:
- <22>Sep 17 19:12:14 postfix/smtp[18852]:
  This is very nearly RFC5424 syslog, but it is missing the ver (version) field.
- SYSLOG5424PRI: Priority value
- SYSLOGTIMESTAMP: Self-explanatory
- SYSLOGPROG: The application's name
- 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)
  This is domain-specific Postfix information.
- POSTFIX_KEYVALUE_DATA: Used as a component of another filter to match key=value data (such as relay=..., delay=...).
- POSTFIX_QUEUEID: Self-explanatory
- POSTFIX_KEYVALUE: Combines POSTFIX_QUEUEID and POSTFIX_KEYVALUE_DATA.
- POSTFIX_SMTP_DELIVERY: Uses POSTFIX_KEYVALUE to identify the above information up to status=, after which comes the SMTP response.
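The POSTFIX_* names are not built into Logstash; they come from community-maintained Postfix pattern files. As a rough sketch only (simplified and reconstructed from memory of the community patterns, which handle many more Postfix message types and edge cases), a patterns file such as /etc/logstash/patterns/postfix might define them along these lines:

```
POSTFIX_QUEUEID ([0-9A-F]{6,}|[0-9a-zA-Z]{12,})
POSTFIX_KEYVALUE_DATA [\w-]+=[^;]*
POSTFIX_KEYVALUE %{POSTFIX_QUEUEID:postfix_queueid}: %{POSTFIX_KEYVALUE_DATA:postfix_keyvalue_data}
POSTFIX_SMTP_DELIVERY %{POSTFIX_KEYVALUE} \(%{GREEDYDATA:postfix_smtp_response}\)
```

Each line is a grok pattern definition: a name followed by a regular expression, which may itself reference other patterns with %{NAME:field} syntax. For real use, prefer a maintained community Postfix pattern set over this sketch.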
The filter:
filter {
  if [type] == "postfix" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{SYSLOG5424PRI}%{SYSLOGTIMESTAMP} %{SYSLOGPROG}: %{POSTFIX_SMTP_DELIVERY}" }
    }
  }
}
Where you would save the Postfix patterns in the patterns_dir.
Output:
{
          "postfix_queueid" => "28D40A036B",
               "@timestamp" => 2017-02-23T08:15:32.546Z,
    "postfix_smtp_response" => "250 2.0.0 Ok: queued as 9030A15D0",
                     "port" => 50228,
    "postfix_keyvalue_data" => "to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent",
           "syslog5424_pri" => "22",
                 "@version" => "1",
                     "host" => "10.0.2.2",
                      "pid" => "18852",
                  "program" => "postfix/smtp",
                  "message" => "<22>Sep 17 19:12:14 postfix/smtp[18852]: 28D40A036B: to=<test@gmail.com>, relay=192.244.100.25[192.244.100.25]:25, delay=0.13, delays=0.01/0.01/0.09/0.02, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 9030A15D0)"
}
All of the grok filters above are either common or were written by someone else to serve this purpose. Luckily, many people use Postfix, so even though it is fairly complex to parse, a few of them have already written filters for it.
Once that is established, you can get pretty crafty with your Logstash configuration.
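For instance (a sketch of my own, not part of the original answer): if you change the grok match to capture the syslog timestamp into its own field, e.g. %{SYSLOGTIMESTAMP:timestamp}, a date filter can set @timestamp to the time the event was logged rather than the time it was ingested:

```
filter {
  if [type] == "postfix" {
    date {
      # "MMM  d" (two spaces) covers single-digit days in syslog timestamps
      match => [ "timestamp", "MMM dd HH:mm:ss", "MMM  d HH:mm:ss" ]
      remove_field => [ "timestamp" ]
    }
  }
}
```

Since syslog timestamps carry no year, Logstash assumes the current year when parsing them.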