php: Get html source code with cURL

Question

How can I get the html source code of http://www.example-webpage.com/file.html without using file_get_contents()?

I need to know this because on some webhosts allow_url_fopen is disabled so you can't use file_get_contents(). Is it possible to get the html file's source with cURL (if cURL support is enabled)? If so, how? Thanks.

Recommended answer

Try the following:

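// Initialise a cURL handle for the target page
$ch = curl_init("http://www.example-webpage.com/file.html");
// Return the response body as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Fetch the page; $content is false if the request failed
$content = curl_exec($ch);
curl_close($ch);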

I would only recommend this for small files. The whole response is read into memory at once, so a large file is likely to exceed PHP's memory limit.
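
For larger files, one possible approach (a sketch, not part of the original answer) is to let cURL write the response straight to a file through CURLOPT_FILE, so the body is never held in a single string. The function name download_to_file and its parameters are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: stream a URL to a local file instead of
// returning the whole body as one string.
function download_to_file(string $url, string $destPath): bool
{
    $fp = fopen($destPath, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the body to $fp as it arrives
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure

    $ok = curl_exec($ch); // true on success, false on failure when CURLOPT_FILE is set
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($destPath); // do not leave a partial download behind
        return false;
    }
    return true;
}
```

A call such as download_to_file("http://www.example-webpage.com/file.html", "/tmp/file.html") would then leave the page on disk, where it can be processed in chunks.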

After some discussion in the comments, we found out that the server couldn't resolve the host name and that the page was also an HTTPS resource, so here is a temporary solution (until your server admin fixes name resolution).

What I did was ping graph.facebook.com to find its IP address, replace the host name in the URL with that IP address, and specify the Host header manually. This, however, makes the SSL certificate fail validation (it no longer matches the name in the URL), so we have to suppress peer verification.

//$url = "https://graph.facebook.com/19165649929?fields=name";
// Temporary workaround: use the IP address in place of the host name
$url = "https://66.220.146.224/19165649929?fields=name";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Insecure: skips certificate verification; use only as a stopgap
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Send the real host name so the server can route the request
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: graph.facebook.com'));
$output = curl_exec($ch);
curl_close($ch);
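
As an alternative to the workaround above (a different technique, not the one from the original answer), cURL's CURLOPT_RESOLVE option can pin a host name to a fixed IP address while the URL keeps the real host name, which should let certificate verification stay enabled. This is a sketch under assumptions: it needs PHP 5.5 or later with a libcurl that supports the option, the helper names resolve_entry and fetch_with_pinned_ip are hypothetical, and the IP address is the one quoted in the answer and may be out of date.

```php
<?php
// Hypothetical helper: build one CURLOPT_RESOLVE entry in libcurl's
// "HOST:PORT:ADDRESS" format.
function resolve_entry(string $host, int $port, string $ip): string
{
    return sprintf('%s:%d:%s', $host, $port, $ip);
}

// Hypothetical helper: fetch $url while forcing $host to resolve to $ip.
// Returns the response body as a string, or false on failure.
function fetch_with_pinned_ip(string $url, string $host, string $ip, int $port = 443)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Bypass DNS for this host only; the URL and certificate check still use the host name
    curl_setopt($ch, CURLOPT_RESOLVE, array(resolve_entry($host, $port, $ip)));
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}
```

Usage would look like fetch_with_pinned_ip("https://graph.facebook.com/19165649929?fields=name", "graph.facebook.com", "66.220.146.224").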

Keep in mind that the IP address might change, which makes this a potential source of errors. You should also do some error handling using curl_error().
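
A minimal sketch of what such error handling could look like follows. The function name fetch_html, the timeout parameter, and the choice to throw a RuntimeException are assumptions made for illustration; the answer itself only names curl_error().

```php
<?php
// Hypothetical helper: fetch a URL and throw instead of silently
// returning false when the transfer fails.
function fetch_html(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init($url);
    if ($ch === false) {
        throw new RuntimeException('Could not initialise cURL');
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $body = curl_exec($ch);
    if ($body === false) {
        // Read the error details before the handle is closed
        $message = curl_error($ch);
        $code = curl_errno($ch);
        curl_close($ch);
        throw new RuntimeException("cURL error $code: $message");
    }

    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    if ($status >= 400) {
        throw new RuntimeException("HTTP status $status for $url");
    }
    return $body;
}
```

Wrapping the call in try/catch lets the caller log the message or fall back to another source when the host cannot be resolved or the IP address has changed.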
