Grabbing data from a website with cURL after logging in?
Question
I am trying to log in to a website and then grab data from a table, since the site has no export feature. So far I've managed to log in, and it shows me the user homepage. However, I need to navigate to a different page, or otherwise fetch that page, while still logged in with cURL.
My code so far:
$username="email";
$password="password";
$url="https://jiltapp.com/sessions";
$cookie="cookie.txt";
$url2 = "https://jiltapp.com/shops/shopname/orders";
$postdata = "email=".$username."&password=".$password;
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$result = curl_exec ($ch);
echo $result;
curl_close($ch);
As I mentioned, I get access to the main user page, but I need to grab the contents of the page at $url2, not $url. How can I accomplish that?
Thanks!
Recommended answer
For subsequent requests, you must set the option CURLOPT_COOKIEFILE, pointing to the same file as CURLOPT_COOKIEJAR. cURL will read cookies from this file and send them with the request.
$username="email";
$password="password";
$url="https://jiltapp.com/sessions";
$cookie="cookie.txt";
$url2 = "https://jiltapp.com/shops/shopname/orders";
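// http_build_query() URL-encodes the values, so characters such as "&" or "@" in the credentials are sent safely
$postdata = http_build_query(array("email" => $username, "password" => $password));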
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $cookie); // <-- add this line
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$result = curl_exec ($ch);
echo $result;
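// Make the second request on the same handle so the session cookies are sent again
curl_setopt($ch, CURLOPT_URL, $url2);     // the page you want to get data from
curl_setopt($ch, CURLOPT_HTTPGET, true);  // switch the handle back from POST to GET
$data = curl_exec($ch);
curl_close($ch);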
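Once the second request succeeds, $data holds the HTML of the orders page, and the table still has to be pulled out of it. The following is only a minimal sketch of one way to do that with PHP's built-in DOM extension. The function name extract_table_rows is hypothetical, and it assumes the data sits in the first table on the page, which may not match the real markup.

```php
<?php
// Hypothetical helper (not part of the original answer): turn the first
// <table> in an HTML string into an array of rows, each an array of cell text.
function extract_table_rows(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $rows = [];
    $xpath = new DOMXPath($doc);
    // Assumption: the data lives in the first table on the page.
    foreach ($xpath->query('(//table)[1]//tr') as $tr) {
        $cells = [];
        // Header cells and data cells are treated alike.
        foreach ($xpath->query('./th | ./td', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Usage with a small inline sample; in the script above you would pass $data.
$sample = '<table><tr><th>Order</th><th>Total</th></tr>'
        . '<tr><td>1001</td><td>19.99</td></tr></table>';
print_r(extract_table_rows($sample));
```

Before passing $data to a helper like this, check that curl_exec() did not return false (curl_error($ch) gives the reason), since an empty or failed response cannot be parsed.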