Logstash - import nested JSON into Elasticsearch


Question


I have a large number (~40,000) of nested JSON objects that I want to insert into an Elasticsearch index.

The JSON objects are structured like this:

    {
        "customerid": "10932",
        "date": "16.08.2006",
        "bez": "xyz",
        "birthdate": "21.05.1990",
        "clientid": "2",
        "address": [
            {
                "addressid": "1",
                "title": "Mr",
                "street": "main str",
                "valid_to": "21.05.1990",
                "valid_from": "21.05.1990"
            },
            {
                "addressid": "2",
                "title": "Mr",
                "street": "melrose place",
                "valid_to": "21.05.1990",
                "valid_from": "21.05.1990"
            }
        ]
    }

So a JSON field (address in this example) can have an array of JSON objects.

What would a Logstash config look like to import JSON files/objects like this into Elasticsearch? The Elasticsearch mapping for this index should just mirror the structure of the JSON, and the Elasticsearch document id should be set to customerid.

input {
  stdin {
    id => "JSON_TEST"
  }
}
filter {
  json {
    source => "customerid"
    ....
    ....
  }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    document_id => "%{customerid}"
  }
}

Solution

If you have control over what's being generated, the easiest thing to do is to format your input as single-line JSON and then use the json_lines codec.

Just change your stdin to

stdin { codec => "json_lines" }

and then it'll just work:

cat input_file.json | logstash -f json_input.conf

where input_file.json has lines like

{"customerid":1,"nested":{"json":"here"}}
{"customerid":2,"nested":{"json":"there"}}
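
If your objects are currently stored pretty-printed (like the structure shown in the question), a short script can flatten them into this one-object-per-line format first. A minimal Python sketch, where the file names are placeholders for illustration:

```python
import json

def to_json_lines(src="customers.json", dst="input_file.json"):
    """Convert a file holding a JSON array of objects into
    json_lines format: one compact JSON object per line."""
    with open(src) as f:
        records = json.load(f)  # expects a top-level JSON array
    with open(dst, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

The resulting file can then be piped straight into Logstash as shown above.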

And then you won't need the json filter.
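
Putting the pieces together, the whole config might look like the following sketch (host, index, and document_id values are taken from the question's output block; the filter section drops out entirely):

```
input {
  stdin { codec => "json_lines" }
}
output {
  stdout {}
  elasticsearch {
    hosts => "https://localhost:9200/"
    index => "customers"
    document_id => "%{customerid}"
  }
}
```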
