Parse multiline JSON with grok in Logstash
Question
I've got a JSON of the format:
{
"SOURCE":"Source A",
"Model":"ModelABC",
"Qty":"3"
}
I'm trying to parse this JSON using Logstash. Basically I want the Logstash output to be a list of key:value pairs that I can analyze using Kibana. I thought this could be done out of the box. From a lot of reading, I understand I must use the grok plugin (I am still not sure what the json plugin is for). But I am unable to get an event with all the fields. I get multiple events (one event for each attribute of my JSON), like so:
{
"message" => " \"SOURCE\": \"Source A\",",
"@version" => "1",
"@timestamp" => "2014-08-31T01:26:23.432Z",
"type" => "my-json",
"tags" => [
[0] "tag-json"
],
"host" => "myserver.example.com",
"path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
"message" => " \"Model\": \"ModelABC\",",
"@version" => "1",
"@timestamp" => "2014-08-31T01:26:23.438Z",
"type" => "my-json",
"tags" => [
[0] "tag-json"
],
"host" => "myserver.example.com",
"path" => "/opt/mount/ELK/json/mytestjson.json"
}
{
"message" => " \"Qty\": \"3\",",
"@version" => "1",
"@timestamp" => "2014-08-31T01:26:23.438Z",
"type" => "my-json",
"tags" => [
[0] "tag-json"
],
"host" => "myserver.example.com",
"path" => "/opt/mount/ELK/json/mytestjson.json"
}
Should I use the multiline codec or the json_lines codec? If so, how? Do I need to write my own grok pattern, or is there something generic for JSON that will give me ONE EVENT with the key:value pairs, instead of the three separate events above? I couldn't find any documentation that sheds light on this. Any help would be appreciated. My conf file is shown below:
input
{
file
{
type => "my-json"
path => ["/opt/mount/ELK/json/mytestjson.json"]
codec => json
tags => "tag-json"
}
}
filter
{
if [type] == "my-json"
{
date { locale => "en" match => [ "RECEIVE-TIMESTAMP", "yyyy-MM-dd HH:mm:ss" ] }
}
}
output
{
elasticsearch
{
host => "localhost"
}
stdout { codec => rubydebug }
}
Answer
I think I found a working answer to my problem. I am not sure if it's a clean solution, but it helps parse multiline JSONs of the type above.
input
{
file
{
codec => multiline
{
pattern => '^\{'
negate => true
what => previous
}
path => ["/opt/mount/ELK/json/*.json"]
start_position => "beginning"
sincedb_path => "/dev/null"
exclude => "*.gz"
}
}
filter
{
mutate
{
replace => [ "message", "%{message}}" ]
gsub => [ 'message','\n','']
}
if [message] =~ /^{.*}$/
{
json { source => "message" }
}
}
output
{
stdout { codec => rubydebug }
}
My multiline codec doesn't handle the last brace, so the message doesn't look like valid JSON to json { source => "message" }. Hence the mutate filter:
replace => [ "message", "%{message}}" ]
adds the missing brace, and
gsub => [ 'message','\n','']
removes the \n characters the codec introduces. At the end of it, I have a one-line JSON that can be read by json { source => "message" }.
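The repair that the two mutate operations and the json filter perform can be sketched in plain Python (a hypothetical equivalent, not what Logstash runs internally):

```python
import json

# What the codec emits: the object minus its final closing brace,
# with embedded newlines.
message = '{\n  "SOURCE":"Source A",\n  "Model":"ModelABC",\n  "Qty":"3"\n'

message = message + "}"              # replace => [ "message", "%{message}}" ]
message = message.replace("\n", "")  # gsub => [ 'message', '\n', '' ]

# if [message] =~ /^{.*}$/  -> only then hand it to the json filter
if message.startswith("{") and message.endswith("}"):
    event = json.loads(message)

print(event)  # prints {'SOURCE': 'Source A', 'Model': 'ModelABC', 'Qty': '3'}
```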
If there's a cleaner/easier way to convert the original multi-line JSON to a one-line JSON, please do post it, as I feel the above isn't too clean.