file_get_contents from a URL that is only accessible after logging in to the website

Question

I would like to make a PHP script that can capture a page from a website. Think *file_get_contents($url)*.

However, this website requires that you fill in a username/password log-in form before you can access any page. I imagine that once you are logged in, the website sends your browser an authentication cookie, and with every subsequent browser request the session info is passed back to the website to authenticate access.

I want to know how I can simulate this browser behavior with a PHP script in order to gain access and capture a page from this website.

More specifically, my questions are:


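  1. How do I send a request
    that contains my log-in details, so that
    the website replies with the session
    info/cookie?

  2. How do I read the session
    info/cookie?

  3. How do I pass this session info
    back to the website with every
    subsequent request
    (*file_get_contents*, curl)?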

Thanks.

Recommended answer

Curl is pretty well suited to do this. You don't need to do anything special other than set the CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options. Once you've logged in by posting the form fields to the site, the cookie is saved, and Curl automatically sends that same cookie with subsequent requests, as the example below illustrates.

Note that the function below saves the cookies to 'cookies/cookies.txt', so make sure that the directory exists and can be written to.
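As a preparatory step, the cookie location can be checked from PHP before any request is made. This is only a sketch: `ensureCookieFile` is a hypothetical helper that is not part of the original answer, and the `0700` directory mode is an assumption.

```php
<?php
// Hypothetical helper (not from the original answer): make sure the cookie
// file's directory exists and that the file itself can be written.
function ensureCookieFile(string $path): bool
{
    $dir = dirname($path);
    // Create the directory (and any missing parents) if it is not there yet.
    if (!is_dir($dir) && !mkdir($dir, 0700, true)) {
        return false;
    }
    // touch() creates an empty file when none exists and leaves contents alone otherwise.
    return touch($path) && is_writable($path);
}
```

Calling `ensureCookieFile('cookies/cookies.txt')` once at the top of the script and stopping if it returns `false` avoids a login that silently fails to store its cookie.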

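$loginUrl = 'http://example.com/login'; // action URL of the login form
$loginFields = array('username' => 'user', 'password' => 'pass'); // login form field names and values
$remotePageUrl = 'http://example.com/remotepage.html'; // URL of the page you want to save

$login = getUrl($loginUrl, 'post', $loginFields); // log in; the session cookie ends up in the cookie jar

$remotePage = getUrl($remotePageUrl); // fetch the protected page, re-using the saved cookie

// Fetches $url with cURL and returns the response body, or false on failure.
function getUrl($url, $method = '', $vars = array()) {
    $ch = curl_init();
    if ($method == 'post') {
        curl_setopt($ch, CURLOPT_POST, 1);
        // URL-encode array fields the way a plain HTML form would; pass strings through unchanged
        curl_setopt($ch, CURLOPT_POSTFIELDS, is_array($vars) ? http_build_query($vars) : $vars);
    }
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects, e.g. after a successful login
    curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies/cookies.txt');  // cookies are written here when the handle is released
    curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies/cookies.txt'); // cookies are read from here for each request
    $buffer = curl_exec($ch);
    curl_close($ch);
    return $buffer;
}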
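The question also asks how to read the cookie and pass it back by hand, for instance with file_get_contents. A minimal sketch of that route follows, assuming the site sets its session through ordinary `Set-Cookie` headers and accepts a URL-encoded form post. The helper names `extractCookies`, `buildCookieHeader` and `loginAndFetch` are hypothetical, and cookie attributes such as path, domain and expiry are ignored here, which cURL's cookie jar would handle for you.

```php
<?php
// Hypothetical helper: collect name => value pairs from raw response header lines.
function extractCookies(array $headerLines): array
{
    $cookies = [];
    foreach ($headerLines as $line) {
        // Header names are case-insensitive; skip everything that is not Set-Cookie.
        if (stripos($line, 'Set-Cookie:') !== 0) {
            continue;
        }
        // Keep only the leading "name=value" part; attributes (Path, Expires, ...) are dropped.
        $pair = trim(explode(';', substr($line, strlen('Set-Cookie:')), 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = trim($value);
    }
    return $cookies;
}

// Hypothetical helper: turn the pairs into a single Cookie request header line.
function buildCookieHeader(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Hypothetical helper: log in with a form post, then fetch a page with the cookies received.
// Returns the page body, or false if either request fails.
function loginAndFetch(string $loginUrl, array $loginFields, string $pageUrl)
{
    $loginContext = stream_context_create(['http' => [
        'method'          => 'POST',
        'header'          => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content'         => http_build_query($loginFields),
        // Stop at the login response: the http wrapper has no cookie store of its
        // own, so a followed redirect would not carry the new session cookie.
        'follow_location' => 0,
    ]]);
    if (@file_get_contents($loginUrl, false, $loginContext) === false) {
        return false;
    }
    // The http wrapper fills $http_response_header in the calling scope
    // (newer PHP versions also offer http_get_last_response_headers()).
    $cookies = extractCookies($http_response_header ?? []);

    $pageContext = stream_context_create(['http' => [
        'method' => 'GET',
        'header' => buildCookieHeader($cookies) . "\r\n",
    ]]);
    return @file_get_contents($pageUrl, false, $pageContext);
}
```

With the variables from the answer, this would be called as `loginAndFetch($loginUrl, $loginFields, $remotePageUrl)`. The cURL version above remains the simpler choice, since it also follows the redirect chain and tracks cookie expiry and scope.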
