XML::LibXML findnodes() does not return results when xmlns is present


Problem description

I'm using XML::LibXML::Reader to parse a large document and have run into an issue where the xmlns attribute causes findnodes() to return nothing. I worked around it by adding a regex that removes the xmlns attribute, but I would like to know whether there is a more elegant solution that does not involve regexes. If you remove the regex line ($xml =~ s{xmlns...) you'll see that say "Loc = $loc" prints no value for $loc.

The code is as follows:

use strict;
use warnings;
use feature qw( say );
use XML::LibXML::Reader qw( XML_READER_TYPE_ELEMENT );

my $xml = <<'__EOI__';
<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <loc>http://example.com</loc>
    <lastmod>2018-10-19</lastmod>
</url>
__EOI__


$xml =~ s{xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"}{};

my $reader = XML::LibXML::Reader->new( string => $xml);
while ( $reader->read ) {
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;
    next unless $reader->name eq 'url';
    my $xml = $reader->readOuterXml;
    my $doc = XML::LibXML->load_xml(string => $xml);
    say "Doc = $doc";
    my ($loc) = $doc->findnodes('//loc');
    say "Loc = $loc";
}

Recommended answer

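Your XPath asks for nodes named loc in the null namespace (that is, in no namespace). The document contains no such nodes: the default namespace declared by xmlns on url applies to url and to all of its unprefixed descendants, and an unprefixed name in an XPath 1.0 expression matches only elements in no namespace. So findnodes correctly returns nothing.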

What you want are the nodes named loc in the namespace http://www.sitemaps.org/schemas/sitemap/0.9. You can find them by registering a prefix for that namespace in an XPath context:

my $doc = XML::LibXML->load_xml( string => $xml );

my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs( sm => 'http://www.sitemaps.org/schemas/sitemap/0.9' );

my ($loc) = $xpc->findnodes('//sm:loc', $doc);
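
For reference, here is a minimal sketch of how the snippet above might fit into the reader loop from the question, with the regex workaround removed. The prefix sm, the @locs array, and the check on localName and namespaceURI are illustrative choices rather than part of the original answer. Only the namespace URI has to match the document; the prefix can be anything.

```perl
use strict;
use warnings;
use feature qw( say );
use XML::LibXML;
use XML::LibXML::Reader qw( XML_READER_TYPE_ELEMENT );

# Namespace URI taken from the xmlns attribute in the question's document.
my $ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';

my $xml = <<'__EOI__';
<url xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <loc>http://example.com</loc>
    <lastmod>2018-10-19</lastmod>
</url>
__EOI__

# Create one XPath context up front and reuse it for every <url> element.
# "sm" is an arbitrary prefix chosen for this sketch.
my $xpc = XML::LibXML::XPathContext->new();
$xpc->registerNs( sm => $ns );

my @locs;    # text of every <loc> found (illustrative helper)

my $reader = XML::LibXML::Reader->new( string => $xml );
while ( $reader->read ) {
    next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;

    # Match on local name plus namespace instead of the raw element name.
    next unless $reader->localName eq 'url'
        && ( $reader->namespaceURI // '' ) eq $ns;

    # Parse just this element into its own small document, as in the question.
    my $doc = XML::LibXML->load_xml( string => $reader->readOuterXml );

    # The prefixed name sm:loc resolves to loc in the sitemap namespace.
    my ($loc) = $xpc->findnodes( '//sm:loc', $doc );
    next unless $loc;

    push @locs, $loc->textContent;
    say "Loc = ", $loc->textContent;
}
```

With the sample document this should print Loc = http://example.com, and the xmlns attribute stays in the input untouched.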
