PHP Simple HTML DOM Parser - Link element in RSS
Problem description
I just started using PHP Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/) and have some problems parsing XML.
I can perfectly parse all the links from HTML documents, but parsing links from RSS feeds (XML format) doesn't work. For example, I want to parse all the links from http://www.bing.com/search?q=ipod&count=50&first=0&format=rss so I use this code:
$content = file_get_html('http://www.bing.com/search?q=ipod&count=50&first=0&format=rss');
$parsed_results_array = array();
foreach ($content->find('item') as $entry)
{
    $item['title'] = $entry->find('title', 0)->plaintext;
    $item['description'] = $entry->find('description', 0)->plaintext;
    // This is the value that comes back empty.
    $item['link'] = $entry->find('link', 0)->plaintext;
    $parsed_results_array[] = $item;
}
print_r($parsed_results_array);
The script parses the title and description, but the link element is empty. Any ideas? My guess is that "link" is a reserved word or something, so how do I get the parser to work?
I suggest you use the right tool for this job: SimpleXML. Plus, it's built in :)
$xml = simplexml_load_file('http://www.bing.com/search?q=ipod&count=50&first=0&format=rss');
$parsed_results_array = array();
// The root <rss> element's children are the <channel> elements.
foreach ($xml as $entry) {
    // Each <channel> holds the <item> elements.
    foreach ($entry->item as $item) {
        // $parsed_results_array[] = json_decode(json_encode($item), true);
        $items['title'] = (string) $item->title;
        $items['description'] = (string) $item->description;
        $items['link'] = (string) $item->link;
        $parsed_results_array[] = $items;
    }
}
echo '<pre>';
print_r($parsed_results_array);
Should yield something like:
Array
(
    [0] => Array
        (
            [title] => Apple - iPod
            [description] => Learn about iPod, Apple TV, and more. Download iTunes for free and purchase iTunes Gift Cards. Check out the most popular TV shows, movies, and music.
            [link] => http://www.apple.com/ipod/
        )
    [1] => Array
        (
            [title] => iPod - Wikipedia, the free encyclopedia
            [description] => The iPod is a line of portable media players designed and marketed by Apple Inc. The first line was released on October 23, 2001, about 8½ months after ...
            [link] => http://en.wikipedia.org/wiki/IPod
        )
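For anyone who wants to try the SimpleXML approach without hitting the network, here is a minimal offline sketch. The inline feed is a made-up stand-in for the Bing response (its structure is an assumption based on plain RSS 2.0), and it reaches the items through `$xml->channel->item` directly rather than the nested loops above. A likely reason the original attempt returned an empty link is that in HTML `<link>` is a void element with no content, so an HTML-oriented parser may discard the text; an XML parser has no such rule.

```php
<?php
// Hypothetical sample feed, standing in for the real RSS response.
$rss = <<<XML
<rss version="2.0">
  <channel>
    <title>Sample results</title>
    <link>http://example.com/</link>
    <item>
      <title>Apple - iPod</title>
      <link>http://www.apple.com/ipod/</link>
      <description>Learn about iPod.</description>
    </item>
    <item>
      <title>iPod - Wikipedia</title>
      <link>http://en.wikipedia.org/wiki/IPod</link>
      <description>The iPod is a line of portable media players.</description>
    </item>
  </channel>
</rss>
XML;

// Collect parse errors ourselves instead of letting libxml emit warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_string($rss);
if ($xml === false) {
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    exit(1);
}

$parsed_results_array = array();
// In RSS 2.0 the <item> elements sit under <channel>.
foreach ($xml->channel->item as $item) {
    $parsed_results_array[] = array(
        'title'       => (string) $item->title,
        'description' => (string) $item->description,
        'link'        => (string) $item->link,
    );
}

print_r($parsed_results_array);
```

Swapping `simplexml_load_string($rss)` for `simplexml_load_file($url)` should give the same loop against a live feed, assuming the feed follows the same layout.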