Pull HTML content from remote website and display on page

Question

I've been working on this for a little while now and am stumped. I am attempting to pull the content from within a specific div on a remote website page and then insert that HTML into a div on my own website. I know that jQuery's .ajax, .load, or .get methods alone cannot do this kind of cross-domain fetch.

Here's the remote page's HTML:

<html>
    <body>
        <div class="entry-content">
            <table class="table">
                ...table #1 content...
                ...More table content...
            </table>
            <table class="table">
                ...table #2 content...
            </table>
            <table class="table">
                ...table #3 content...
            </table>
        </div>
    </body>
</html>

Goal: I am attempting to fetch the HTML of the remote page's first table. So, on my website, I would like the following HTML to be fetched and placed in a div with id="fetched-html":

<table class="table">
    ...table #1 content...
    ...More table content...
</table>

Here's where I'm at with my PHP function thus far:

<?php
function pullRaspi_SDImageTable() {
    $url = "http://www.raspberrypi.org/downloads";
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($curl);
    curl_close($curl);

    // Create new PHP DOM document
    $DOM = new DOMDocument;
    // Load html from curl request into document model
    $DOM->loadHTML($output);

    // Get 1st table
    $output = $DOM->firstChild->getElementsByTagName('table');

    return $output;
}
?>

The final result should look like this on my local website page:

<div id="fetched-html">
    <table class="table">
        ...table #1 content...
        ...More table content...
    </table>
</div>

Here's another possible PHP function:

<?php
function pullRaspPi_SDImageTable() {
    // Url to fetch
    $url = "http://www.raspberrypi.org/downloads";

    $ch = curl_init($url);
    $fp = fopen("raspberrypi_sdimagetable.txt", "w");
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);

    // Write html source to variable
    $rasp_sdimagetable = curl_exec($ch);

    // Close curl request
    curl_close($ch);

    return $rasp_sdimagetable;
}

?>

Then in the head of the html, add this jQuery:
<script type="text/javascript">
    $("#fetched-html").load("<?php pullRaspPi_SDImageTable(); ?> table.table:first");
</script>

Problem is, neither function works. :( Any thoughts?

Recommended answer

Extracting a fragment of HTML from a website is a breeze with simplehtmldom (the PHP Simple HTML DOM Parser). You can then do something like:

function pullRaspi_SDImageTable() {
    $filename = '/tmp/downloads.html';  // Where you want to cache the result
    $expiry = 600;  // Cache lifetime in seconds (10 minutes)
    $output = '';

    if (!file_exists($filename) || time() - $expiry > filemtime($filename)) {
        // There is no cache, or it has expired, so fetch from the remote server
        require_once('simple_html_dom.php');
        $html = file_get_html('http://www.raspberrypi.org/downloads');

        // file_get_html() returns false when the page could not be loaded
        if ($html) {
            // Collect every table.table inside div.entry-content
            foreach ($html->find('div.entry-content table.table') as $elem) {
                $output .= (string)$elem;
            }

            // Store the cache
            file_put_contents($filename, $output);
        }
    } else {
        // Pull the content from the cache
        $output = file_get_contents($filename);
    }

    return $output;
}

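This gives you the HTML of every table.table inside div.entry-content, concatenated into one string. To get only the first table, as the question asks, pass an index to find: $html->find('div.entry-content table.table', 0) returns just the first match (or null if there is none).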
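If pulling in a third-party library is not an option, the same extraction can probably be done with PHP's built-in DOM extension, which is closer to what the first attempt in the question was going for. Below is a minimal sketch, assuming the remote markup keeps the structure shown in the question. The function name extractFirstTable and the sample HTML are hypothetical and introduced here for illustration only; they are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original post): returns the outer HTML
// of the first <table class="table"> inside <div class="entry-content">,
// or an empty string when nothing matches.
function extractFirstTable(string $html): string {
    // Guard against an empty response (e.g. a failed download);
    // loadHTML() rejects empty input.
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Match on whole class tokens, so class="table striped" would also match.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " entry-content ")]'
           . '//table[contains(concat(" ", normalize-space(@class), " "), " table ")]';
    $tables = $xpath->query($query);

    if ($tables === false || $tables->length === 0) {
        return '';
    }

    // saveHTML($node) serializes only that node, including its own tags.
    return $dom->saveHTML($tables->item(0));
}

// Sample markup standing in for the remote page (assumption for the demo).
$sample = '<html><body><div class="entry-content">'
        . '<table class="table"><tr><td>table #1</td></tr></table>'
        . '<table class="table"><tr><td>table #2</td></tr></table>'
        . '</div></body></html>';

// Print the first table wrapped in the target div from the question.
echo '<div id="fetched-html">' . extractFirstTable($sample) . '</div>';
```

In a real page, the string returned by curl_exec (with CURLOPT_RETURNTRANSFER set, as in the question's first function) would be passed in instead of $sample. Note that the original attempt returned a DOMNodeList rather than a string, which is one reason nothing was displayed; serializing the chosen node with saveHTML is what turns it back into markup.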
