PHP curl login couldn't pass login page


Problem description

$username = 'emp';
$pass = 'emp';

$login = array(
    'username' => $username,
    'password' => $pass
);

$loginUrl = 'http://demo.smartjobboard.com/login';

$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($login));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

$content = curl_exec($ch);


echo $content;

I used smartjobboard.com as an example to test my code. What I got back is the login page; why can't I get the logged-in page? I want to scrape content that requires the user to log in. The username and password are correct, but I have no idea why I can't get through.

Solution

Log in manually on the website and check exactly what is posted, using the browser's network monitor. Maybe there is a simple typo in your parameters? You can open the network monitor with F12 (Google Chrome or IE). Then start logging by pressing the appropriate button (make sure it preserves the log when a new page is loaded) and watch the entries roll by. Then log in, open the detailed view of the logged request, and look at the headers and the response.

It is important that you start logging the HTTP requests before loading the login page. Sometimes a cookie is created before you log in. That could give you a hint of what to send.

Remember that cookies need to be sent manually when not using a browser. So once you are logged in, remember to send additional information such as cookies with every request you make through cURL.
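
As a rough illustration of sending cookies by hand, the sketch below pulls cookie name/value pairs out of a raw header block and builds a Cookie header value that could be passed to CURLOPT_COOKIE. The function name cookiesFromHeaders is hypothetical, and the parsing is deliberately simple: it ignores attributes such as path and expiry.

```php
<?php
// Hypothetical helper: collect cookies from the Set-Cookie lines of a raw
// HTTP header block. A later cookie with the same name overwrites an earlier one.
function cookiesFromHeaders(string $headers): array
{
    $cookies = [];
    if (preg_match_all('/^Set-Cookie:\s*([^=;\s]+)=([^;\r\n]*)/mi', $headers, $m, PREG_SET_ORDER)) {
        foreach ($m as $match) {
            $cookies[$match[1]] = trim($match[2]);
        }
    }
    return $cookies;
}

// Sample header lines of the kind a login response returns.
$headers = "HTTP/1.1 303 See Other\r\n"
    . "Set-Cookie: PHPSESSID=b33b1a0bd7a3bcd50e5e73671c383182; path=/\r\n"
    . "Set-Cookie: PHPSESSID=baf0d249c8fd7795fa1234cbaf16995e; path=/\r\n"
    . "Location: http://demo.smartjobboard.com/my-account/\r\n\r\n";

$cookies = cookiesFromHeaders($headers);

// Build the value for a Cookie request header: "name=value; name2=value2".
$pairs = [];
foreach ($cookies as $name => $value) {
    $pairs[] = $name . '=' . $value;
}
$cookieHeader = implode('; ', $pairs);

// With a cURL handle $ch, this string would be sent via:
// curl_setopt($ch, CURLOPT_COOKIE, $cookieHeader);
echo $cookieHeader, "\n";
```

In practice the cookie jar options do this bookkeeping automatically; parsing by hand is mainly useful for seeing what is actually being exchanged.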

Cookies are created, but a look at the network monitor shows that the browser sends more parameters: return_url=&action=login&username=emp&password=emp
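
To check a parameter list against what the network monitor shows, it can help to print the encoded body before sending it. A minimal sketch using PHP's built-in http_build_query (the variable names are only for illustration):

```php
<?php
// Fields in the order the browser's network monitor showed them.
$fields = [
    'return_url' => '',
    'action'     => 'login',
    'username'   => 'emp',
    'password'   => 'emp',
];

// http_build_query() URL-encodes the values and joins the pairs with "&".
$body = http_build_query($fields);

echo $body, "\n"; // return_url=&action=login&username=emp&password=emp
```

If the printed string differs from the browser's request body, the missing or misspelled field is usually easy to spot.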

Try this:

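<?php
$username = 'emp';
$pass = 'emp';

// Send every field the browser sends, not just the credentials.
$login = array(
    'username' => $username,
    'password' => $pass,
    'action' => 'login',
    'return_url' => '/my-account/'
);

$loginUrl = 'http://demo.smartjobboard.com/login';

$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($login));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);                // print the request/response trace to stderr
curl_setopt($ch, CURLOPT_HEADER, 1);                 // include response headers in the returned string
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');   // cookies are written here when the handle is closed
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');  // cookies are read from here; also enables cookie handling

// First request: log in. The response is the 303 redirect carrying the session cookie.
$content1 = curl_exec($ch);

// Second request: fetch the protected page on the same handle.
// The options set above, including the cookie settings, stay in effect.
curl_setopt($ch, CURLOPT_URL, "http://demo.smartjobboard.com/my-account/");
curl_setopt($ch, CURLOPT_HTTPGET, 1);                // switch back to GET so the login form is not posted again

$content2 = curl_exec($ch);

curl_close($ch);

echo $content2;

?>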

This works; try it from a command line if you can. However, a status 303 (See Other) is returned. Cookies can be stored and sent back using cURL's options CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE. Have a look at the manual.

So you probably need to make another curl call manually, sending the received cookie.
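
As an alternative to issuing the second request by hand, cURL can follow the redirect itself. The sketch below is a variant under assumptions, not part of the original answer: the function name loginAndFollow is hypothetical, and it relies on CURLOPT_FOLLOWLOCATION together with an in-memory cookie engine so that the session cookie set by the 303 response is sent with the redirected request.

```php
<?php
// Hypothetical helper: POST the login form and let cURL follow the redirect.
function loginAndFollow(string $loginUrl, array $fields): string
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true, // follow the 303 to its Location
        CURLOPT_MAXREDIRS      => 5,    // guard against redirect loops
        CURLOPT_COOKIEFILE     => '',   // empty string: enable in-memory cookie handling
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('cURL request failed: ' . $error);
    }

    curl_close($ch);
    return $body; // body of the final page after redirects
}

// Example call (performs real network requests, so it is left commented out):
// echo loginAndFollow('http://demo.smartjobboard.com/login', [
//     'username' => 'emp', 'password' => 'emp',
//     'action' => 'login', 'return_url' => '/my-account/',
// ]);
```

The manual two-step approach remains useful while debugging, because it lets you inspect the intermediate 303 response before the second request is made.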

Notice the extra options to retrieve the full verbose headers to learn what's happening!
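
One way to keep that diagnostic output for later inspection is to redirect it into a stream and to split the returned headers from the body. This is a sketch under assumptions: splitResponse and fetchWithDebug are hypothetical names, and the split relies on CURLINFO_HEADER_SIZE reporting the byte length of the header block.

```php
<?php
// Hypothetical helper: split a raw response (headers + body) at a known header length.
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize),
    ];
}

// Hypothetical helper: fetch a URL and return headers, body, and the verbose log.
function fetchWithDebug(string $url): array
{
    $log = fopen('php://temp', 'w+');   // in-memory stream for the verbose log
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HEADER         => true, // include response headers in the result
        CURLOPT_VERBOSE        => true,
        CURLOPT_STDERR         => $log, // send verbose output to the stream
    ]);
    $raw = curl_exec($ch);
    if ($raw === false) {
        $error = curl_error($ch);
        curl_close($ch);
        fclose($log);
        throw new RuntimeException('cURL request failed: ' . $error);
    }
    $parts = splitResponse($raw, curl_getinfo($ch, CURLINFO_HEADER_SIZE));
    curl_close($ch);

    rewind($log);
    $parts['verbose'] = stream_get_contents($log);
    fclose($log);
    return $parts;
}

// Offline check of the split logic with a made-up response:
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hi</p>";
$parts = splitResponse($raw, strlen($raw) - strlen('<p>hi</p>'));
echo $parts['body'], "\n";
```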

My response:

HTTP/1.1 303 See Other
Server: nginx
Date: Fri, 06 Feb 2015 15:53:16 GMT
Content-Type: text/html;charset=utf-8
Content-Length: 0
Connection: keep-alive
Keep-Alive: timeout=35
X-Powered-By: PHP/5.3.28
Set-Cookie: PHPSESSID=b33b1a0bd7a3bcd50e5e73671c383182; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=baf0d249c8fd7795fa1234cbaf16995e; path=/
Location: http://demo.smartjobboard.com/my-account/

and

* Hostname was NOT found in DNS cache
*   Trying 96.30.31.40...
* Connected to demo.smartjobboard.com (96.30.31.40) port 80 (#0)
> POST /login HTTP/1.1
Host: demo.smartjobboard.com
Accept: */*
Content-Length: 66
Content-Type: application/x-www-form-urlencoded

* upload completely sent off: 66 out of 66 bytes
< HTTP/1.1 303 See Other
* Server nginx is not blacklisted
< Server: nginx
< Date: Fri, 06 Feb 2015 15:53:16 GMT
< Content-Type: text/html;charset=utf-8
< Content-Length: 0
< Connection: keep-alive
< Keep-Alive: timeout=35
< X-Powered-By: PHP/5.3.28
< Set-Cookie: PHPSESSID=b33b1a0bd7a3bcd50e5e73671c383182; path=/
< Expires: Thu, 19 Nov 1981 08:52:00 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Pragma: no-cache
< Set-Cookie: PHPSESSID=baf0d249c8fd7795fa1234cbaf16995e; path=/
< Location: http://demo.smartjobboard.com/my-account/
< 
* Connection #0 to host demo.smartjobboard.com left intact

(The location looked a bit garbled in my output, but I don't know why.) The redirect location is http://demo.smartjobboard.com/my-account/. You should parse the output to detect this address rather than hard-coding it, so that the code works for other locations as well.
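
Detecting the redirect target could look roughly like this. The sketch assumes the response was fetched with CURLOPT_HEADER enabled, so the raw headers are part of the returned string; the function name extractLocation is hypothetical.

```php
<?php
// Hypothetical helper: return the Location header value from raw response
// headers, or null when no Location header is present.
function extractLocation(string $rawResponse): ?string
{
    if (preg_match('/^Location:\s*(\S+)/mi', $rawResponse, $m)) {
        return $m[1];
    }
    return null;
}

$raw = "HTTP/1.1 303 See Other\r\n"
    . "Server: nginx\r\n"
    . "Content-Length: 0\r\n"
    . "Location: http://demo.smartjobboard.com/my-account/\r\n\r\n";

$next = extractLocation($raw);
echo $next, "\n"; // http://demo.smartjobboard.com/my-account/

// With a cURL handle $ch, the follow-up request would then use:
// curl_setopt($ch, CURLOPT_URL, $next);
```

cURL also reports the pending redirect target through curl_getinfo($ch, CURLINFO_REDIRECT_URL), which may be simpler than parsing the headers by hand.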

And I learned something as well ;).
