Scraping an ASP.NET Page with Pagination Using cURL
Question
Hi,
Thanks for the help.
I am new to screen scraping and am working on an ASP.NET page that has pagination. I can scrape the first page but not the following ones. Pagination is based on the __doPostBack function. I get the values of __VIEWSTATE and __EVENTVALIDATION using this code:
// Fetch the first page and capture both hidden fields with one regex.
$file = file_get_contents($url);
preg_match(
    "#<input.*?name=\"__viewstate\".*?value=\"(.*?)\".*?>.*?"
    . "<input.*?name=\"__eventvalidation\".*?value=\"(.*?)\".*?>#mis",
    $file,
    $arr_viewstate
);
// URL-encode the captured values for use in the POST body.
$viewstate = urlencode($arr_viewstate[1]);
$eventvalidation = urlencode($arr_viewstate[2]);
and I send the POST request with cURL using:
CURLOPT_POSTFIELDS =>
'__EVENTTARGET='
.urlencode('ctl00$ContentPlaceHolderBody$SearchPageNavigationTop$rptPager$ctl01')
.'&__EVENTARGUMENT='
.urlencode('')
.'&__VIEWSTATE='
.$viewstate
.'&__EVENTVALIDATION='
.$eventvalidation
.'&__LASTFOCUS='
.urlencode(''));
because
ctl00$ContentPlaceHolderBody$SearchPageNavigationTop$rptPager$ctl01
value is passed to the __doPostBack function when page 2 is clicked.
But it is not working: the response contains the content of the first page even when I pass the value for the second page or any other page.
Please guide me.
Thanks again.
Recommended answer
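The postback flow described in the question can be sketched as follows. This is a minimal sketch under stated assumptions, not a confirmed fix: the helper names extractHiddenFields, buildPostbackBody, and fetchPage are hypothetical and do not come from the original post. It assumes the server expects every hidden form field to be posted back (ASP.NET pages often carry more than __VIEWSTATE and __EVENTVALIDATION, for example __VIEWSTATEGENERATOR) and that the first request and the postback must share the same cookies. Whether either assumption explains the first-page response depends on the target site.

```php
<?php
// Sketch only: the helper names below are hypothetical, not from the original post.

/**
 * Collect every hidden <input> on the page as name => value,
 * instead of relying on a regex that assumes a fixed field order.
 */
function extractHiddenFields(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid, so keep parser warnings internal.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $fields = [];
    foreach ($dom->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '' && strtolower($input->getAttribute('type')) === 'hidden') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

/**
 * Build the URL-encoded POST body for a __doPostBack(target, argument) call.
 * http_build_query() encodes keys and values, so no manual urlencode() is needed.
 */
function buildPostbackBody(array $hiddenFields, string $eventTarget, string $eventArgument = ''): string
{
    $hiddenFields['__EVENTTARGET'] = $eventTarget;
    $hiddenFields['__EVENTARGUMENT'] = $eventArgument;
    return http_build_query($hiddenFields);
}

/**
 * GET the page, or POST $postBody to it, reusing one cookie file
 * so that both requests belong to the same session.
 */
function fetchPage(string $url, string $cookieFile, ?string $postBody = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here
        CURLOPT_COOKIEFILE     => $cookieFile, // and read back from here
    ]);
    if ($postBody !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postBody);
    }
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $html;
}

// Usage ($url is the page from the question and is not shown here):
// $cookies = tempnam(sys_get_temp_dir(), 'cookies');
// $first   = fetchPage($url, $cookies);
// $body    = buildPostbackBody(
//     extractHiddenFields($first),
//     'ctl00$ContentPlaceHolderBody$SearchPageNavigationTop$rptPager$ctl01'
// );
// $second  = fetchPage($url, $cookies, $body);
```

If the page's form also contains visible fields such as search boxes or drop-downs, the server may expect those in the POST body too. Comparing the sketch's body with the request a browser sends when page 2 is clicked is a reasonable way to check.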