Logstash XML Parse Failed

Question

I'm running the latest ELK stack (6.6) on the deviantony/docker-elk image. I have the following XML file, which I'm trying to parse into an Elasticsearch JSON document:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <ChainId>7290027600007</ChainId>
    <SubChainId>001</SubChainId>
    <StoreId>001</StoreId>
    <BikoretNo>9</BikoretNo>
    <DllVerNo>8.0.1.3</DllVerNo>
</root>

My conf file is:

input {
  file {
    path => "/usr/share/logstash/logs/example1.xml"
    type => "xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      pattern => "<?xml version"
      negate => true
      what => "previous"
    }
  }
}

filter {
    xml {
        source => "message"
        store_xml => false
        xpath => [ "/root/ChainId/text()", "ChainId" ]
    }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "xml_index"
    manage_template => false
  }
}

My Logstash output is:

{
logstash_1       |     "@timestamp" => 2019-03-26T06:45:27.941Z,
logstash_1       |           "tags" => [
logstash_1       |         [0] "multiline"
logstash_1       |     ],
logstash_1       |           "host" => "751b3a8bf341",
logstash_1       |        "ChainId" => [],
logstash_1       |        "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<root>\r\n    <ChainId>7290027600007</ChainId>\r\n    <SubChainId>001</SubChainId>\r\n    <StoreId>001</StoreId>\r\n    <BikoretNo>9</BikoretNo>\r\n    <DllVerNo>8.0.1.3</DllVerNo>\r\n</root>\r",
logstash_1       |           "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1       |       "@version" => "1",
logstash_1       |           "type" => "xml"
logstash_1       | }

The XML body under message shows up as an escaped string containing \r\n, and the XPath ChainId field returns an empty array. I tried other XML files as well, with the same results.

Update: after removing the \r\n characters, I still don't get any XPath-parsed fields. My output is:

logstash_1       |        "message" => "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>    <ChainId>7290027600007</ChainId>    <SubChainId>001</SubChainId>    <StoreId>001</StoreId>    <BikoretNo>9</BikoretNo>    <DllVerNo>8.0.1.3</DllVerNo>",
logstash_1       |        "StoreId" => [],
logstash_1       |      "BikoretNo" => [],
logstash_1       |        "ChainId" => [],
logstash_1       |           "type" => "xml",
logstash_1       |           "tags" => [
logstash_1       |         [0] "multiline"
logstash_1       |     ],
logstash_1       |     "@timestamp" => 2019-03-27T20:51:09.575Z,
logstash_1       |       "DllVerNo" => [],
logstash_1       |           "path" => "/usr/share/logstash/logs/example1.xml",
logstash_1       |           "host" => "751b3a8bf341",
logstash_1       |     "SubChainId" => [],
logstash_1       |       "@version" => "1"
logstash_1       | }

Recommended answer

Use the gsub option of the mutate filter to remove the carriage-return and newline characters from message.

mutate { 
        gsub => [ "message", "[\r\n]", "" ] 
    }
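
To make the effect of that substitution concrete, here is a minimal Python sketch of the same character-class replacement, applied to the message string from the question's first output. It is only an analogue: Logstash runs the gsub inside the pipeline, and the helper name strip_line_breaks is made up for this illustration.

```python
import re

# The message as it appears in the question's first output (CRLF line endings).
message = (
    '<?xml version="1.0" encoding="UTF-8"?>\r\n'
    '<root>\r\n'
    '    <ChainId>7290027600007</ChainId>\r\n'
    '    <SubChainId>001</SubChainId>\r\n'
    '    <StoreId>001</StoreId>\r\n'
    '    <BikoretNo>9</BikoretNo>\r\n'
    '    <DllVerNo>8.0.1.3</DllVerNo>\r\n'
    '</root>\r'
)


def strip_line_breaks(text: str) -> str:
    # Hypothetical helper: removes every CR and LF character,
    # mirroring the character class used in the gsub setting.
    return re.sub(r"[\r\n]", "", text)


cleaned = strip_line_breaks(message)
print(cleaned)
```

The result is a single line in which the indentation spaces remain but the line breaks are gone, which is the shape of the message field in the answer's final output.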

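Add the target setting to the xml filter to say where the parsed data should be placed. Note that target only takes effect when store_xml is true; with store_xml => false, only the destination fields named in xpath are written.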

filter {

    xml{
        source => "message"
        store_xml => false
        target => "root"

    }

}

Here is the complete working Logstash conf file.

input
{
    file
        {
            path => "C:\Users\KZAPAGOL\Desktop\CSV\XMLFile.xml"
            start_position => "beginning"
            sincedb_path => "/dev/null"
            exclude => "*.gz"
            type => "xml"
            codec => multiline {
                    pattern => "<?xml " 
                    negate => "true"
                    what => "previous"
                }
        }
}

filter {

    # Filters run in the order listed, so strip CR/LF before the xml filter reads message.
    mutate {
        gsub => [ "message", "[\r\n]", "" ]
    }

    xml {
        source => "message"
        store_xml => false
        # target is only used when store_xml is true
        target => "root"
        xpath => [
            "/root/ChainId/text()", "ChainId",
            "/root/SubChainId/text()", "SubChainId",
            "/root/StoreId/text()", "StoreId",
            "/root/BikoretNo/text()", "BikoretNo",
            "/root/DllVerNo/text()", "DllVerNo"
        ]
    }
}

output{

elasticsearch{
        hosts => ["http://localhost:9200/"]
        index => "parse_xml"
    }

    stdout
    {
        codec => rubydebug
    }
}

Output

{
  "_index": "parse_xml",
  "_type": "doc",
  "_id": "vNj4v2kBZ2Q_C9FO94eF",
  "_version": 1,
  "_score": null,
  "_source": {
    "@timestamp": "2019-03-27T16:25:58.379Z",
    "path": "filePath",
    "tags": [
      "multiline"
    ],
    "ChainId": [
      "7290027600007"
    ],
    "BikoretNo": [
      "9"
    ],
    "DllVerNo": [
      "8.0.1.3"
    ],
    "host": "xxxx",
    "@version": "1",
    "SubChainId": [
      "001"
    ],
    "message": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root>    <ChainId>7290027600007</ChainId>    <SubChainId>001</SubChainId>    <StoreId>001</StoreId>    <BikoretNo>9</BikoretNo>    <DllVerNo>8.0.1.3</DllVerNo></root>",
    "type": "xml",
    "StoreId": [
      "001"
    ]
  },
  "fields": {
    "@timestamp": [
      "2019-03-27T16:25:58.379Z"
    ]
  },
  "sort": [
    1553703958379
  ]
}
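
The extracted values above can be cross-checked outside Logstash. The sketch below is a rough Python equivalent that uses the standard library's xml.etree.ElementTree as a stand-in for the parser Logstash uses, so it only confirms that the document and the element names line up; it says nothing about the pipeline itself. ElementTree's limited XPath has no text() step, so the code reads each element's .text instead. FIELDS and extract_fields are names invented for this example.

```python
import xml.etree.ElementTree as ET

# The single-line message from the output above.
message = (
    '<?xml version="1.0" encoding="UTF-8"?><root>'
    '    <ChainId>7290027600007</ChainId>'
    '    <SubChainId>001</SubChainId>'
    '    <StoreId>001</StoreId>'
    '    <BikoretNo>9</BikoretNo>'
    '    <DllVerNo>8.0.1.3</DllVerNo></root>'
)

# Element names taken from the xpath list in the conf file.
FIELDS = ["ChainId", "SubChainId", "StoreId", "BikoretNo", "DllVerNo"]


def extract_fields(xml_text):
    # Parse the document; fromstring returns the <root> element.
    root = ET.fromstring(xml_text)
    # Collect the text of every direct child with each name,
    # as a list per field to mirror the arrays Logstash emits.
    return {name: [el.text for el in root.findall(name)] for name in FIELDS}


fields = extract_fields(message)
for name, values in fields.items():
    print(name, values)
```

If a field comes back as an empty list here, the element name or the document structure is the likely culprit rather than the Logstash configuration.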
