Logstash Grok Pattern vs Python Regex?
Question
I am trying to configure logstash to manage my various log sources, one of which is Mongrel2. Mongrel2 logs in the tnetstring format, where a log message takes the form
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
I want to write my own grok patterns to extract certain fields from the above format. I started by testing my regex on the above message; the regex is
^(?:[^:]*\:){2}([^,]*)
This matches localhost. When I use the same regex as a grok pattern in the form
TEST ^(?:[^:]*\:){2}([^,]*)
MONGREL %{TEST:test}
and configure logstash with
filter {
  grok {
    match => [ "message", "%{MONGREL}" ]
  }
}
the same regex results in the match 86:9:localhost. I can't figure out where I am going wrong. Is it that the regex engine I used for testing is based on Python, whereas the grok filter's regex engine is based on Oniguruma?
I am currently testing it in grokdebug with the following input
86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]
and the following pattern
(?<hostname>^(?:[^:]*\:){2}([^,]*))
which results in
{
  "hostname": [
    [
      "86:9:localhost"
    ]
  ]
}
whereas what I want is
{
  "hostname": [
    [
      "localhost"
    ]
  ]
}
Recommended answer
A pattern like this will extract the host name:
^(\d+)?:(\d+)?:(?<hostname>[^,]+),
Or, written in a manner similar to what you already have:
^(?:[^:]*\:){2}(?<hostname>[^,]*)
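The capture name needs to be inside the parentheses of the group that you want to capture. Your pattern put the name on the outermost group, so it captured everything matched up to that point, including the 86:9: prefix. The same applies to %{TEST:test}: it stores the entire text matched by TEST, not its inner group.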
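If you want to check this outside of grok, here is a minimal sketch in Python. It assumes only the standard `re` module; note that Python spells named groups `(?P<name>...)`, whereas Oniguruma (and therefore grok) also accepts `(?<name>...)`, so the patterns are adapted accordingly. The variable names are just for illustration. It suggests the difference you saw comes from where the name is placed rather than from the regex engine.

```python
import re

# Sample Mongrel2 log line from the question.
LINE = ("86:9:localhost,12:192.168.33.1,5:57089#10:1411396297#"
        "3:GET,1:/,8:HTTP/1.1,3:200#6:145978#]")

# Name on the OUTER group: the named capture spans the whole match,
# including the two colon-terminated prefixes.
outer = re.match(r"(?P<hostname>^(?:[^:]*:){2}([^,]*))", LINE)

# Name on the INNER group: only the text after the second colon is captured.
inner = re.match(r"^(?:[^:]*:){2}(?P<hostname>[^,]*)", LINE)

# The first pattern from the answer, adapted to Python's (?P<...>) syntax.
digits = re.match(r"^(\d+)?:(\d+)?:(?P<hostname>[^,]+),", LINE)

print(outer.group("hostname"))   # 86:9:localhost
print(outer.group(2))            # localhost (the unnamed inner group)
print(inner.group("hostname"))   # localhost
print(digits.group("hostname"))  # localhost
```

The unnamed inner group is what the original tester was reporting as "localhost", while grok only returns named captures, which is why the two tools appeared to disagree.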