Logstash grok filter - name fields dynamically
Question
I've got log lines in the following format and want to extract the fields:
[field1: content1] [field2: content2] [field3: content3] ...
I know neither the field names nor the number of fields.
I tried backreferences and the sprintf format, but neither produced any results:
match => [ "message", "(?:\[(\w+): %{DATA:\k<-1>}\])+" ] # not working
match => [ "message", "(?:\[%{WORD:fieldname}: %{DATA:%{fieldname}}\])+" ] # not working
This seems to work for only one field, but not for more:
match => [ "message", "(?:\[%{WORD:field}: %{DATA:content}\] ?)+" ]
add_field => { "%{field}" => "%{content}" }
The kv filter is also not appropriate, because the content of the fields may contain whitespace.
Is there any plugin or strategy that solves this problem?
Recommended answer
The Logstash ruby filter plugin can help you. :)
Here is the configuration:
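input {
  stdin {}
}
filter {
  ruby {
    code => "
      # Logstash 5.0 and later use event.get / event.set;
      # older releases used event['message'] instead.
      field_array = event.get('message').to_s.split('] [')
      field_array.each do |field|
        # Strip the brackets left over at either end
        field = field.delete('[').delete(']')
        # Split at the first ': ' only, so the content may contain ': '
        name, content = field.split(': ', 2)
        event.set(name, content) unless name.nil? || content.nil?
      end
    "
  }
}
output {
  stdout {
    codec => rubydebug
  }
}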
With your log line:
[field1: content1] [field2: content2] [field3: content3]
Here is the output:
{
    "message" => "[field1: content1] [field2: content2] [field3: content3]",
    "@version" => "1",
    "@timestamp" => "2014-07-07T08:49:28.543Z",
    "host" => "abc",
    "field1" => "content1",
    "field2" => "content2",
    "field3" => "content3"
}
I tried it with four fields, and it also works.
Please note that event in the Ruby code is the Logstash event. You can use it to access all of the event's fields, such as message and @timestamp.
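The parsing step can also be tried outside Logstash before it goes into the filter. The following is only a sketch in plain Ruby: the helper name parse_bracket_fields is hypothetical (it is not part of Logstash), and it uses a regular expression with String#scan instead of the split-based approach above. It assumes field names consist of word characters and that the content never contains a closing bracket.

```ruby
# Hypothetical helper (not part of Logstash): collect every
# '[name: content]' pair from one log line into a Hash.
def parse_bracket_fields(line)
  fields = {}
  # \w+ captures the field name; .*? lazily captures the content
  # up to the next ']', so spaces inside the content are preserved.
  line.scan(/\[(\w+): (.*?)\]/) do |name, content|
    fields[name] = content
  end
  fields
end

line = '[field1: content1] [field2: content with spaces] [field3: content3]'
parse_bracket_fields(line).each do |name, content|
  puts "#{name} => #{content}"
end
```

Because scan walks through every match, the number of fields does not need to be known in advance. A line without any bracketed pairs simply yields an empty Hash.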
Enjoy!