Using Perl XML::LibXML to parse


Question

I am using Perl's XML::LibXML module to parse an XML response from a device. It appears that the only way I can successfully get my data is by modifying the XML response first. Here is the XML response from the device:

<chassis-inventory xmlns="http://xml.juniper.net/junos/10.3D0/junos-chassis">

<chassis junosstyle="inventory">

<name>Chassis</name>

<serial-number>JN111863EAFF</serial-number>

<description>VJX1000</description>

<chassis-module>

<name>Midplane</name>

</chassis-module>

<chassis-module>

<name>System IO</name>

</chassis-module>

<chassis-module>

<name>Routing Engine</name>

<description>VJX1000</description>

<chassis-re-disk-module>

<name>ad0</name>

<disk-size>1953</disk-size>

<model>QEMU HARDDISK</model>

<serial-number>QM00001</serial-number>

<description>Hard Disk</description>

</chassis-re-disk-module>

</chassis-module>

<chassis-module>

<name>FPC 0</name>

<chassis-sub-module>

<name>PIC 0</name>

</chassis-sub-module>

</chassis-module>

<chassis-module>

<name>Power Supply 0</name>

</chassis-module>

</chassis>

</chassis-inventory>

Here is the Perl code I am using to parse the response and find, for example, the serial number:

#!/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $f = ("/var/working/xmlstuff");
sub yeah {
my $ff;
my $f = shift;
open(my $fff,$f);
while(<$fff>) {
$_ =~ s/^\s+$//; 
$_ =~ s/^(<\S+)\s.*?=.*?((?:\/)?>)/$1$2/g;
$ff .= $_;
}
close($fff);
return $ff
}
my $tparse = XML::LibXML->new();
my $ss = $tparse->load_xml( string => &yeah($f));
print map $_->to_literal,$ss->findnodes('/chassis-inventory/chassis/serial-number');

If I do not use the regex substitutions, nothing is loaded for the script to parse. I can understand stripping the blank lines, but why do I have to remove the attributes from the XML response? The script only works if these lines:

<chassis-inventory xmlns="http://xml.juniper.net/junos/10.3D0/junos-chassis">

<chassis junosstyle="inventory">

become this:

<chassis-inventory>
<chassis>

  1. Is this a problem with the XML response or with the XML::LibXML module?

  2. Is there a way to have it ignore the empty lines in the file without using a regex substitution?

Thanks for your help.

Recommended answer

Your XPath expression fails because of the namespace: the root element declares a default namespace, so every element in the response belongs to it, and you need to search in the context of that namespace. Here is the explanation from the XML::LibXML documentation:

NOTE ON NAMESPACES AND XPATH:

A common mistake about XPath is to assume that node tests consisting of an element name with no prefix match elements in the default namespace. This assumption is wrong - by XPath specification, such node tests can only match elements that are in no (i.e. null) namespace.

So, for example, one cannot match the root element of an XHTML document with $node->find('/html') since '/html' would only match if the root element had no namespace, but all XHTML elements belong to the namespace http://www.w3.org/1999/xhtml. (Note that xmlns="..." namespace declarations can also be specified in a DTD, which makes the situation even worse, since the XML document looks as if there was no default namespace).
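The effect described in this note is easy to reproduce on a cut-down copy of the device response. The following is a minimal sketch, not the original poster's script: the inline `$xml_text` string and the prefix `x` are illustrative assumptions. It runs the same path twice, once unprefixed and once through an XPathContext with a prefix bound to the namespace URI.

```perl
use strict;
use warnings;
use XML::LibXML;

# Cut-down copy of the device response (illustrative). The blank lines and
# the junosstyle attribute are deliberately left in place.
my $xml_text = <<'XML';
<chassis-inventory xmlns="http://xml.juniper.net/junos/10.3D0/junos-chassis">

<chassis junosstyle="inventory">

<name>Chassis</name>

<serial-number>JN111863EAFF</serial-number>

</chassis>

</chassis-inventory>
XML

my $doc = XML::LibXML->load_xml(string => $xml_text);

# Unprefixed names only match elements in no namespace, so this list is empty.
my @plain = $doc->findnodes('/chassis-inventory/chassis/serial-number');

# With a prefix bound to the document's namespace URI, the same path matches.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs('x', 'http://xml.juniper.net/junos/10.3D0/junos-chassis');
my @prefixed = $xpc->findnodes('/x:chassis-inventory/x:chassis/x:serial-number');

printf "unprefixed matches: %d\n", scalar @plain;
printf "prefixed matches:   %d\n", scalar @prefixed;
print  "serial number:      ", $prefixed[0]->textContent, "\n";
```

The document loads without any regex preprocessing: whitespace between elements and ordinary attributes are valid XML. Only the unprefixed query comes back empty, which suggests the problem lies in the XPath expression rather than in the response or the module.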

To deal with this, register the namespace with a prefix, then search the document using that prefix. Here is an example that should work for you:

#!/bin/env perl
use strict;
use warnings;
use XML::LibXML;

my $xml = XML::LibXML->load_xml( location => '/var/working/xmlstuff');
my $xpc = XML::LibXML::XPathContext->new($xml);
$xpc->registerNs('x', 'http://xml.juniper.net/junos/10.3D0/junos-chassis');

foreach my $node ($xpc->findnodes('/x:chassis-inventory/x:chassis/x:serial-number')) {
    print $node->textContent() . "\n";
}
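The namespace URI in this response contains what looks like a software version (10.3D0), so a hard-coded URI may stop matching if the device reports a different one. As a hedged variation on the example above, the sketch below reads the URI from the root element instead of spelling it out. The helper name `find_text_in_root_ns` and the inline sample document are hypothetical and only serve the illustration. A second query shows a namespace-agnostic alternative based on `local-name()`.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns the text of every node matching $path, with
# the prefix "x" bound to whatever namespace the root element is in.
# If the root has no namespace, "x" is not registered and a path that uses
# it will fail, so this sketch assumes a namespaced document.
sub find_text_in_root_ns {
    my ($doc, $path) = @_;
    my $xpc = XML::LibXML::XPathContext->new($doc);
    my $uri = $doc->documentElement->namespaceURI;
    $xpc->registerNs('x', $uri) if defined $uri;
    return map { $_->textContent } $xpc->findnodes($path);
}

# Inline sample standing in for the device response (illustrative).
my $doc = XML::LibXML->load_xml(string => <<'XML');
<chassis-inventory xmlns="http://xml.juniper.net/junos/10.3D0/junos-chassis">
<chassis junosstyle="inventory">
<serial-number>JN111863EAFF</serial-number>
</chassis>
</chassis-inventory>
XML

my @serials = find_text_in_root_ns(
    $doc, '/x:chassis-inventory/x:chassis/x:serial-number');
print "$_\n" for @serials;

# Namespace-agnostic alternative: compare local names only. This needs no
# registration but also ignores which namespace an element belongs to.
my @by_local_name = map { $_->textContent } $doc->findnodes(
    '/*[local-name()="chassis-inventory"]'
  . '/*[local-name()="chassis"]'
  . '/*[local-name()="serial-number"]');
print "$_\n" for @by_local_name;
```

Registering a prefix keeps the query strict about the namespace, while the `local-name()` form is more verbose but tolerant of a changing URI. Which trade-off fits depends on how much the responses vary.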
