PHP Simple HTML DOM Parser - Link element in RSS

Problem description

I just started using PHP Simple HTML DOM Parser (http://simplehtmldom.sourceforge.net/) and have some problems parsing XML.

I can perfectly parse all the links from HTML documents, but parsing links from RSS feeds (XML format) doesn't work. For example, I want to parse all the links from http://www.bing.com/search?q=ipod&count=50&first=0&format=rss so I use this code:

$content = file_get_html('http://www.bing.com/search?q=ipod&count=50&first=0&format=rss');

foreach ($content->find('item') as $entry)
{
    $item['title']       = $entry->find('title', 0)->plaintext;
    $item['description'] = $entry->find('description', 0)->plaintext;
    $item['link']        = $entry->find('link', 0)->plaintext;
    $parsed_results_array[] = $item;
}

print_r($parsed_results_array);

The script parses the title and description, but the link element is empty. Any ideas? My guess is that "link" is a reserved word or something, so how do I get the parser to work?

Solution

I suggest you use the right tool for this job. Use SimpleXML. Plus, it's built in :)

$xml = simplexml_load_file('http://www.bing.com/search?q=ipod&count=50&first=0&format=rss');
$parsed_results_array = array();
foreach($xml as $entry) {
    foreach($entry->item as $item) {
        // $parsed_results_array[] = json_decode(json_encode($item), true);
        $items['title'] = (string) $item->title;
        $items['description'] = (string) $item->description;
        $items['link'] = (string) $item->link;
        $parsed_results_array[] = $items;
    }
}

echo '<pre>';
print_r($parsed_results_array);

Should yield something like:

Array
(
    [0] => Array
        (
            [title] => Apple - iPod
            [description] => Learn about iPod, Apple TV, and more. Download iTunes for free and purchase iTunes Gift Cards. Check out the most popular TV shows, movies, and music.
            [link] => http://www.apple.com/ipod/
        )

    [1] => Array
        (
            [title] => iPod - Wikipedia, the free encyclopedia
            [description] => The iPod is a line of portable media players designed and marketed by Apple Inc. The first line was released on October 23, 2001, about 8½ months after ...
            [link] => http://en.wikipedia.org/wiki/IPod
        )
