PHP DOMDocument: How to parse XML/RSS tags with custom field names?


Problem description

I have the RSS feed below to parse:

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</title>
            <x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
            <x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>
        </item>
        <item>
            <title>About Oranges</title>
            <author>Marry L. Jones</title>
            <x-trumba:customfield name="description">This is the description about oranges</xCal:customfield>
            <x-trumba:customfield name="category">Fruits,Food,Orange</xCal:customfield>
        </item>
    </channel>
</rss>

In PHP, I only know how to read the first two nodes, like this:

$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );

foreach( $rss->getElementsByTagName("item") as $node ) {
    echo $node->getElementsByTagName("title")->item(0)->nodeValue;
    echo $node->getElementsByTagName("author")->item(0)->nodeValue;
}

However, these nodes are the problem:

<x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
<x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>

Please help:

  • How do I parse the last nodes, such as <x-trumba:customfield name="description">?

(I can't change the RSS source, because it is not under my control.)

Recommended answer

Your XML is invalid: the 'x-trumba' prefix is not defined, and the closing tags of those elements use the 'xCal' prefix, which refers to urn:ietf:params:xml:ns:xcal.

So replacing the prefix of the opening tags with 'xCal' and fixing the closing tags for 'author' makes the XML valid.
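
Because the feed cannot be edited at the source, one option is to apply these two repairs to the downloaded string before parsing it. The sketch below is only an illustration under stated assumptions: `repairFeed` is a hypothetical helper, the inline sample stands in for the real feed, and the feed is assumed to be broken in exactly the two ways described here. It is not a general-purpose feed sanitizer.

```php
<?php
// Hypothetical helper: repairs the two defects seen in the sample feed.
function repairFeed(string $xml): string
{
    // Rewrite the undefined 'x-trumba' prefix on opening tags to 'xCal',
    // which is declared on <rss> and already used by the closing tags.
    $xml = str_replace('<x-trumba:customfield', '<xCal:customfield', $xml);

    // Close <author> elements with </author> instead of </title>.
    // Fall back to the partially repaired string if the regex engine fails.
    return preg_replace('~<author>([^<]*)</title>~', '<author>$1</author>', $xml) ?? $xml;
}

// Inline sample standing in for the downloaded feed (assumption).
$broken = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</title>
            <x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
// loadXML() parses a string, so the repaired text can be used directly.
$loaded = $rss->loadXML(repairFeed($broken));
```

With a live feed, the string would come from something like `file_get_contents()` on the feed URL rather than from an inline sample. String patching of this kind is fragile, so it is worth checking the return value of `loadXML()` in case the feed changes shape.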

Then it is possible to register the xCalendar namespace and use XPath to fetch the custom field contents:

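$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
$xpath = new DOMXPath($rss);
// The prefix 'x' is local to these XPath queries; only the namespace URI
// has to match the one bound to 'xCal' in the feed.
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');

foreach( $xpath->evaluate("//item") as $item ) {
    // string(...) returns the text of the first matching node, or "" if none.
    echo $xpath->evaluate('string(title)', $item), "\n";
    echo $xpath->evaluate('string(x:customfield[@name="description"])', $item), "\n";
}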

Output:

About Apples
This is the description about apples
About Oranges
This is the description about oranges

The XPath expression uses a condition ([@name="description"]) to filter the customfield element nodes.
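
The same kind of condition works for any value of the name attribute, and a bare [@name] matches every custom field that carries one. As a rough sketch, the fields of each item could be collected into an array keyed by name. It assumes the corrected XML (inlined here and trimmed to one item so the snippet runs on its own) and treats the category value as a comma-separated list, as it appears in the sample feed. The `$items` and `$fields` names are illustrative only.

```php
<?php
// Corrected sample feed, inlined so the sketch runs standalone (assumption).
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <xCal:customfield name="description">This is the description about apples</xCal:customfield>
            <xCal:customfield name="category">Fruits,Food,Apple</xCal:customfield>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);
$xpath = new DOMXPath($rss);
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');

$items = [];
foreach ($xpath->evaluate('//item') as $item) {
    $fields = [];
    // Collect every custom field of this item, keyed by its name attribute.
    foreach ($xpath->evaluate('x:customfield[@name]', $item) as $field) {
        $fields[$field->getAttribute('name')] = trim($field->textContent);
    }
    // Split the comma-separated category string into a list, if present.
    $fields['category'] = isset($fields['category'])
        ? explode(',', $fields['category'])
        : [];
    $items[] = ['title' => $xpath->evaluate('string(title)', $item)] + $fields;
}

print_r($items);
```

Keying by the name attribute avoids writing one XPath expression per field, at the cost of silently overwriting a value if a feed ever repeats the same name within one item.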
