PHP DOMDocument: How to parse XML/RSS tags with custom field names?
Question
I need to parse an RSS feed like the following:
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
<channel>
<item>
<title>About Apples</title>
<author>David K. Lowie</title>
<x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
<x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>
</item>
<item>
<title>About Oranges</title>
<author>Marry L. Jones</title>
<x-trumba:customfield name="description">This is the description about oranges</xCal:customfield>
<x-trumba:customfield name="category">Fruits,Food,Orange</xCal:customfield>
</item>
</channel>
</rss>
In PHP, I only know how to read the first two nodes of each item, like this:
$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
foreach( $rss->getElementsByTagName("item") as $node ) {
echo $node->getElementsByTagName("title")->item(0)->nodeValue, "\n";
echo $node->getElementsByTagName("author")->item(0)->nodeValue, "\n";
}
However, these nodes are the problem:
<x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
<x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>
How can I parse the last nodes, such as <x-trumba:customfield name="description">?
(I cannot change the RSS source, because it is not under my control.)
Recommended answer
Your XML is not well-formed: the 'x-trumba' prefix is not defined, and the closing tags of those elements use the 'xCal' prefix, which refers to urn:ietf:params:xml:ns:xcal.
Replacing the prefix of the opening tags with 'xCal' and fixing the closing tags for 'author' makes the XML well-formed.
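Since the feed cannot be changed at its source, one option is to patch the downloaded string before handing it to DOMDocument. The following is only a minimal sketch, assuming the two defects described above are the only ones in the feed; the helper name repairFeed and the shortened inline sample are made up for illustration.

```php
<?php
// Hypothetical helper: patches the two defects described above.
// Assumes the feed has no other well-formedness problems.
function repairFeed(string $xml): string
{
    // Opening tags use the undeclared 'x-trumba' prefix; switch them to 'xCal',
    // which the feed declares and which the closing tags already use.
    $xml = str_replace('<x-trumba:customfield', '<xCal:customfield', $xml);

    // <author> elements are closed with </title>; close them with </author>.
    // Fall back to the partially repaired string if the regex engine fails.
    return preg_replace('~(<author>[^<]*)</title>~', '$1</author>', $xml) ?? $xml;
}

// Shortened sample of the broken feed, inlined so the sketch runs offline.
$broken = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
<channel>
<item>
<title>About Apples</title>
<author>David K. Lowie</title>
<x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
</item>
</channel>
</rss>
XML;

$rss = new DOMDocument();
// loadXML() parses a string and returns false if it is still not well-formed.
$loaded = $rss->loadXML(repairFeed($broken));

if ($loaded) {
    echo $rss->getElementsByTagName('author')->item(0)->nodeValue, "\n";
}
```

For the real feed, the string could be fetched with file_get_contents() and passed through the same helper before calling loadXML(), instead of calling load() on the URL directly.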
Then it is possible to register the xCal namespace (urn:ietf:params:xml:ns:xcal) and use XPath to fetch the custom field contents:
$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
$xpath = new DOMXpath($rss);
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');
foreach( $xpath->evaluate("//item") as $item ) {
echo $xpath->evaluate('string(title)', $item), "\n";
echo $xpath->evaluate('string(x:customfield[@name="description"])', $item), "\n";
}
Output:
About Apples
This is the description about apples
About Oranges
This is the description about oranges
The XPath expression uses a condition ([@name="description"]) to filter the customfield element nodes.
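The same kind of condition can select the other custom field. As a sketch only, assuming the corrected feed (inlined here in shortened form) and assuming the category values are always comma-separated as in the sample, the category field could be read and split like this; the $categories array is just an illustrative way to collect the result.

```php
<?php
// Corrected sample feed, inlined so the sketch runs without network access.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
<channel>
<item>
<title>About Apples</title>
<author>David K. Lowie</author>
<xCal:customfield name="description">This is the description about apples</xCal:customfield>
<xCal:customfield name="category">Fruits,Food,Apple</xCal:customfield>
</item>
</channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);

$xpath = new DOMXPath($rss);
// The prefix registered here ('x') only has to map to the same namespace URI;
// it does not need to match the prefix used in the document ('xCal').
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');

$categories = [];
foreach ($xpath->evaluate('//item') as $item) {
    $title = $xpath->evaluate('string(title)', $item);
    // Same predicate form as for the description, with a different attribute value.
    $raw = $xpath->evaluate('string(x:customfield[@name="category"])', $item);
    // string() yields '' when no matching node exists, so treat that as "no categories".
    $categories[$title] = $raw === '' ? [] : explode(',', $raw);
}

print_r($categories);
```

Because string() returns an empty string when nothing matches, items without a category field end up with an empty list rather than causing an error.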