Parsing XML with line breaks / newline characters with Perl's LibXML

Question

I'm trying to parse a series of XML files with Perl's XML::LibXML module.

<log date="2012-08-07 18:05:44.0" level="unit" label="2G-or-3G-server" name="unitnote" value="# Firmware level after downgrade
#
-&amp;gt; show /HOST

 /HOST
    Targets:
        bootmode
        diag
        domain ...."

Some of the values contain output from the execution of scripts. When I try to read these values, the line breaks are lost. For example:

my $value  = $log->findvalue('@value');
print "value: $value\n";

Output:

# Firmware level after downgrade    #   -&amp;gt; show /HOST  /HOST  Targets:      bootmode        diag        domain ....

I can't seem to find any way to have LibXML respect newlines. Any ideas?

Answer

The XML 1.0 specification requires attribute-value normalization: each literal whitespace character in an attribute value (space, CR, LF, tab) is replaced with a space before the value is passed to the application.

Unfortunately, any conforming XML processor will give you the same result.

This is very odd XML. Where did it come from? The value attribute should really be presented as PCDATA so that it can be processed properly. Is there any way you can change the data you are getting?
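
To illustrate the PCDATA suggestion, here is a minimal sketch that assumes the producer could be changed to emit the script output as element content instead of a value attribute. The reshaped <log> record below is hypothetical and is not the format from the question. Element content is not subject to attribute-value normalization, so textContent returns the line breaks intact.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical reshaping of the <log> record: the script output is moved
# out of the value attribute and into element content (PCDATA).
my $xml = <<'XML';
<log date="2012-08-07 18:05:44.0" level="unit" name="unitnote"># Firmware level after downgrade
#
-&gt; show /HOST

 /HOST
    Targets:
        bootmode</log>
XML

my $doc   = XML::LibXML->load_xml(string => $xml);
my ($log) = $doc->findnodes('/log');

# textContent returns the element's text with line breaks preserved
my $value = $log->textContent;
print "value: $value\n";
```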

If there is any way you could preprocess the data so that the newlines are replaced with the character reference &#xA;, they will be translated to LF characters when the data is processed. This really should be done by whatever is generating the XML.
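
If the generator cannot be changed, the preprocessing step might look like the sketch below. The helper name encode_attr_newlines is made up for this example. The regex assumes that attribute values are always double-quoted and that nothing resembling ="..." appears in element content, comments, or CDATA sections. It is a text-level workaround under those assumptions, not a general XML rewriter.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite literal line breaks inside double-quoted
# attribute values as the character reference &#xA;, which survives
# attribute-value normalization.
sub encode_attr_newlines {
    my ($xml) = @_;
    $xml =~ s{(=\s*")([^"<]*)(")}{
        my ($open, $val, $close) = ($1, $2, $3);
        $val =~ s/\r\n?|\n/&#xA;/g;
        $open . $val . $close;
    }ge;
    return $xml;
}

# Shortened stand-in for the record in the question
my $raw = qq{<log name="unitnote" value="# Firmware level\n#\n /HOST\n    Targets:"/>};

my $fixed = encode_attr_newlines($raw);
print $fixed, "\n";
```

The rewritten string can then be handed to the parser, for example with XML::LibXML->load_xml(string => $fixed), after which findvalue('@value') should return the value with LF characters in place of the references.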
