Parsing XML with line breaks / newline characters with Perl's LibXML
Question
I'm trying to parse a series of XML files with Perl's XML::LibXML module. The files contain elements like this:
<log date="2012-08-07 18:05:44.0" level="unit" label="2G-or-3G-server" name="unitnote" value="# Firmware level after downgrade
#
-&gt; show /HOST
/HOST
Targets:
bootmode
diag
domain ...."
Some of the values contain output from the execution of scripts. When I try to parse these values, I end up with something like the following:
my $value = $log->findvalue('@value');
print "value: $value\n";
Output:
# Firmware level after downgrade # -&gt; show /HOST /HOST Targets: bootmode diag domain ....
I can't seem to find any way to have LibXML respect newlines. Any ideas?
Recommended answer
The XML 1.0 Specification says that any literal whitespace character in an attribute value (space, CR, LF, tab) must be converted to a space before the value is passed to the application.
Unfortunately, any conforming XML processor will give you the same problem.
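The normalization described above can be seen in a small experiment. This is only a sketch: the two sample documents and the labels are made up for illustration, and it assumes XML::LibXML is installed. It parses the same two-line text once with a literal newline in the attribute and once with a character reference, then reports whether a line feed survived.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data: the same two-line text stored two ways.
my @samples = (
    [ 'literal newline' => qq{<log value="line one\nline two"/>} ],
    [ 'character ref'   => q{<log value="line one&#10;line two"/>} ],
);

my %has_lf;
for my $sample (@samples) {
    my ($label, $xml) = @$sample;

    # Parse the string and read the attribute from the root element.
    my $doc   = XML::LibXML->load_xml(string => $xml);
    my $value = $doc->documentElement->getAttribute('value');

    # Record whether a real LF character is present after parsing.
    $has_lf{$label} = $value =~ /\n/ ? 1 : 0;
    print "$label: ",
        ($has_lf{$label} ? "LF preserved" : "LF replaced by a space"), "\n";
}
```

The literal newline should come back as a space, while the character reference should come back as a real line feed.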
This is very odd XML. Where did it come from? The value attribute should really be presented as PCDATA (that is, as element content) so that it can be processed properly. Is there any way you can change the data you are getting?
If there is any way you could preprocess the data so that your newlines are replaced with the character reference &#10; then they will be translated to LF characters when the data is processed. This really should be done by whatever is generating the XML.
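If the generator cannot be changed, the preprocessing step might look roughly like the sketch below. The function name escape_attr_newlines and the sample string are hypothetical, and the regular expression is a deliberate simplification: it assumes every quoted string that follows an equals sign is an attribute value, so it is not safe for files that contain such text inside comments or CDATA sections.

```perl
use strict;
use warnings;

# Hypothetical helper: replace literal line breaks inside quoted
# attribute values with the character reference &#10;.
sub escape_attr_newlines {
    my ($xml) = @_;

    # Match = followed by a double- or single-quoted value. A literal <
    # cannot appear in an attribute value, so excluding it keeps a match
    # from running on into the next tag.
    $xml =~ s{(=\s*)("[^"<]*"|'[^'<]*')}{
        my ($eq, $value) = ($1, $2);
        $value =~ s/\r\n|\r|\n/&#10;/g;   # CRLF, CR and LF all become one LF reference
        $eq . $value;
    }ge;

    return $xml;
}

# Made-up input modelled on the log element in the question.
my $raw = qq{<log name="unitnote" value="# Firmware level\n#\n-&gt; show /HOST"/>};
my $escaped = escape_attr_newlines($raw);
print "$escaped\n";
```

The escaped string can then be handed to XML::LibXML->load_xml(string => $escaped), after which findvalue('@value') should return the text with its line feeds intact.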