Mimicking an ajax call with Curl PHP
Problem description
I'm scraping a site using curl (via PHP). Some of the information I want is a list of products, but by default the page shows only the first few. The rest is delivered when the user clicks a button to get the full list, which triggers an ajax call that returns it.
Here, in a nutshell, is the JS they use:
headers['__RequestVerificationToken'] = token;
$.ajax({
type: "post",
url: "/ajax/getProductList",
dataType: 'html',
data: JSON.stringify({ historyPageIndex: 1, displayPeriod: 0, productsType: "All" }),
contentType: 'application/json; charset=utf-8',
success: function (result) {
$(target).html("");
$(target).html(result);
},
beforeSend: function (XMLHttpRequest) {
if (headers['__RequestVerificationToken']) {
XMLHttpRequest.setRequestHeader("__RequestVerificationToken", headers['__RequestVerificationToken']);
}
}
});
Here is my PHP script:
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));
$postVariables = 'productsType='.$productsType.
'&historyPageIndex=1
&displayPeriod=0
&__RequestVerificationToken='.$token;
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);
This produces an error page from the site. I think the main reasons could be:
- They check whether it's an ajax request (no clue how to fix that)
- The token needs to be in the header and not in the post variables
Any ideas?
EDIT: here is the working code:
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieLocation);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieLocation);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/Applications/ViewProducts');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/');
$webpage = curl_exec($ch);
$productsType = trim(find_by_pattren($webpage, '<input id="productsType" name="productsType" type="hidden" value="(.*?)"'));
$token = trim(find_by_pattren($webpage, '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'));
$postVariables = json_encode(array('productsType' => $productsType,
'historyPageIndex' => 1,
'displayPeriod' => 0));
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
curl_setopt($ch, CURLOPT_POSTFIELDS, $postVariables);
curl_setopt($ch, CURLOPT_URL, 'https://www.domain.com/ajax/getProductList');
curl_setopt($ch, CURLOPT_REFERER, 'https://www.domain.com/Applications/ViewProducts');
$webpage = curl_exec($ch);
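The find_by_pattren() helper used above is not shown in the post. As a rough sketch of what such a helper might look like, assuming it returns the first capture group of a regular expression, the function below (find_first_match is a hypothetical name, and the sample markup is invented for illustration) pulls a hidden input's value out of a page:

```php
<?php
// Hypothetical stand-in for the undefined find_by_pattren() helper:
// returns the first capture group of $pattern in $html, or null if absent.
function find_first_match(string $html, string $pattern): ?string
{
    // '~' is the delimiter so the pattern may contain '/' unescaped;
    // this sketch assumes the pattern itself contains no '~'.
    if (preg_match('~' . $pattern . '~s', $html, $m) === 1) {
        // Attribute values may be entity-encoded in the HTML source.
        return html_entity_decode(trim($m[1]), ENT_QUOTES);
    }
    return null;
}

// Sample markup invented for illustration only.
$sample = '<input name="__RequestVerificationToken" type="hidden" value="abc123" />';
$token = find_first_match(
    $sample,
    '<input name="__RequestVerificationToken" type="hidden" value="(.*?)"'
);
```

Returning null when nothing matches makes it possible to stop before the POST if the token could not be found, rather than sending an empty header.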
To set the request verification token as a header, mimic an AJAX request more closely, and set the content type to JSON, use CURLOPT_HTTPHEADER (not CURLOPT_HEADER, which only controls whether response headers are included in the output):
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Content-Type: application/json; charset=utf-8", "__RequestVerificationToken: $token"));
I also notice that you're superfluously setting CURLOPT_POST to false on line 7 of your code, and that the post data you're sending isn't in JSON format. You should have:
$postVariables = '{"historyPageIndex":1,"displayPeriod":0,"productsType":"All"}';
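Instead of writing the JSON string by hand, the same body can be produced with json_encode(), which also takes care of escaping if a value contains quotes. A minimal sketch, assuming productsType is "All" as in the site's JS (in practice it would be the value scraped from the hidden input):

```php
<?php
// Keys mirror the object passed to JSON.stringify() in the site's JS.
$payload = [
    'historyPageIndex' => 1,
    'displayPeriod'    => 0,
    'productsType'     => 'All', // assumed value; normally scraped from the page
];

// Yields {"historyPageIndex":1,"displayPeriod":0,"productsType":"All"}
$postVariables = json_encode($payload);
```

The resulting string is what gets passed to CURLOPT_POSTFIELDS, together with the Content-Type: application/json header.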