Grabbing data from a website with cURL after logging in?

Question

What I am trying to do is log in to a website and then grab data from a table, since the site has no export feature. So far I've managed to log in, and it shows me the user homepage. However, I need to navigate to a different page, or somehow fetch that page, while still being logged in with cURL.

My code so far:

$username="email"; 
$password="password"; 
$url="https://jiltapp.com/sessions"; 
$cookie="cookie.txt";
$url2 = "https://jiltapp.com/shops/shopname/orders";

$postdata = "email=".$username."&password=".$password; 

$ch = curl_init(); 
curl_setopt ($ch, CURLOPT_URL, $url); 
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); 
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); 
curl_setopt ($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie); 
curl_setopt ($ch, CURLOPT_REFERER, $url); 

curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); 
curl_setopt ($ch, CURLOPT_POST, 1); 
$result = curl_exec ($ch); 

echo $result;  
curl_close($ch);

As I mentioned, I get access to the main user page, but I need to grab the contents of the page in the $url2 variable, not $url. How can I accomplish that?

Thanks!

Recommended answer

For subsequent requests, you must set the option CURLOPT_COOKIEFILE and point it to the same file as CURLOPT_COOKIEJAR. cURL will read cookies from this file and send them with the request.

$username="email"; 
$password="password"; 
$url="https://jiltapp.com/sessions"; 
$cookie="cookie.txt";
$url2 = "https://jiltapp.com/shops/shopname/orders";

$postdata = http_build_query(array("email" => $username, "password" => $password));  // URL-encodes special characters

$ch = curl_init(); 
curl_setopt ($ch, CURLOPT_URL, $url); 
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE); 
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6"); 
curl_setopt ($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1); 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie); 
curl_setopt ($ch, CURLOPT_COOKIEFILE, $cookie);  // <-- add this line
curl_setopt ($ch, CURLOPT_REFERER, $url); 

curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); 
curl_setopt ($ch, CURLOPT_POST, 1); 
$result = curl_exec ($ch); 

echo $result;  

// Make the second request on the same handle, so the cookies from the login are sent along.

curl_setopt($ch, CURLOPT_URL, $url2);
curl_setopt($ch, CURLOPT_HTTPGET, 1);  // switch the handle back from POST to GET

$data = curl_exec($ch);
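
Once the second request returns the orders page in $data, the table still has to be pulled out of the HTML. The following is only a minimal sketch of one way to do that with PHP's DOM extension. The function name extract_table_rows is hypothetical, and it assumes the data sits in the first <table> element of the page, which the question does not confirm.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every row
// of the first <table> in $html as an array of trimmed cell texts.
function extract_table_rows(string $html): array
{
    if ($html === '') {
        return [];  // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; buffer parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    // Assumption: the wanted data is in the first table on the page.
    foreach ($xpath->query('(//table)[1]//tr') as $tr) {
        $cells = [];
        foreach ($xpath->query('./th | ./td', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Usage with a made-up sample instead of the real page:
$sample = '<table><tr><th>Order</th><th>Total</th></tr>'
        . '<tr><td>1001</td><td>25.00</td></tr></table>';
print_r(extract_table_rows($sample));
```

With the answer's code, the call would be extract_table_rows($data), after first checking that curl_exec did not return false. If the page holds several tables, the XPath expression would need to target the right one, for example by its id or class.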
