Link-Checking with Multi-Curl
Question
I'm building a link checker function that checks whether each link returns status code 200, 301, or 302.
I want to check about 1000 links, so I used the Multi-cURL functionality, and I do get all the headers, the status codes, and the URL each link redirects to.
The problem is that Multi-cURL runs in parallel: I add all the URLs with curl_multi_add_handle, and it returns the results it gets and ignores the rest.
From the headers I know which result I got back, but I don't know which URL produced it. Is there an identifier that tells me which URL I requested for a specific completed handle (maybe something to do with curl_multi_info_read)?
Here is my code:
$curls = $listofurls;
$curl_arr = array();
$master = curl_multi_init();
for($i = 0; $i < $node_count; $i++) {
    $curl_arr[$i] = curl_init($curls[$i][0]);
    curl_setopt($curl_arr[$i],CURLOPT_FRESH_CONNECT,true);
    curl_setopt($curl_arr[$i],CURLOPT_CONNECTTIMEOUT,10);
    curl_setopt($curl_arr[$i],CURLOPT_HEADER,true);
    curl_setopt($curl_arr[$i],CURLOPT_CUSTOMREQUEST,'HEAD');
    curl_setopt($curl_arr[$i],CURLOPT_RETURNTRANSFER,true);
    curl_setopt($curl_arr[$i],CURLOPT_NOBODY,true);
    curl_setopt($curl_arr[$i],CURLOPT_AUTOREFERER, 1);
    curl_setopt($curl_arr[$i],CURLOPT_TIMEOUT,30);
    curl_multi_add_handle($master, $curl_arr[$i]);
}
$finalresult = array();
do{
    curl_multi_exec($master, $running);
    $info = curl_multi_info_read($master);
    if($info['handle']) {
        $finalresult[] = curl_multi_getcontent($info['handle']);
        curl_multi_remove_handle($master, $info['handle']);
    }
    $previousActive = $running;
}
while($running > 0);
curl_multi_close($master);
I appreciate the help. Thanks.
Recommended answer
I solved it. The key is to record the order in which the handles complete and combine that order with the results. For anyone who may be looking for the answer:
$curls = $listofurls;
$curl_arr = array();
$master = curl_multi_init();
for($i = 0; $i < $node_count; $i++) {
    $curl_arr[$i] = curl_init($curls[$i][0]);
    curl_setopt($curl_arr[$i],CURLOPT_FRESH_CONNECT,true);
    curl_setopt($curl_arr[$i],CURLOPT_CONNECTTIMEOUT,10);
    curl_setopt($curl_arr[$i],CURLOPT_HEADER,true);
    curl_setopt($curl_arr[$i],CURLOPT_CUSTOMREQUEST,'HEAD');
    curl_setopt($curl_arr[$i],CURLOPT_RETURNTRANSFER,true);
    curl_setopt($curl_arr[$i],CURLOPT_NOBODY,true);
    curl_setopt($curl_arr[$i],CURLOPT_AUTOREFERER, 1);
    curl_setopt($curl_arr[$i],CURLOPT_TIMEOUT,30);
    curl_multi_add_handle($master, $curl_arr[$i]);
}
$finalresult = array();
$returnedOrder = array();
do{
    curl_multi_exec($master, $running);
    // Read every pending message: one curl_multi_exec call can complete several
    // handles, and reading only one per pass can leave results unread at exit.
    while($info = curl_multi_info_read($master)) {
        $finalresult[] = curl_multi_getcontent($info['handle']);
        // Key of the finished handle in $curl_arr, which is also the key of its URL in $curls.
        $returnedOrder[] = array_search($info['handle'], $curl_arr, true);
        curl_multi_remove_handle($master, $info['handle']);
        curl_close($info['handle']);
    }
    // Wait for network activity instead of spinning in a busy loop.
    if($running > 0) {
        curl_multi_select($master, 1.0);
    }
}
while($running > 0);
// Map each URL index to the response headers returned for it.
$res = array_combine($returnedOrder, $finalresult);
curl_multi_close($master);
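As an alternative to looking each handle up with array_search, the index can be attached to the handle itself. The following is a minimal sketch, assuming PHP with the cURL extension enabled; make_tagged_handle, check_links, and the layout of the returned report are hypothetical names that do not appear in the code above. It stores the list index with CURLOPT_PRIVATE, reads it back with CURLINFO_PRIVATE when the handle completes, and takes the status code and redirect target from curl_getinfo instead of parsing headers.

```php
<?php
// Hypothetical helper: build a body-less (HEAD-style) request tagged with its list index.
function make_tagged_handle($url, $index) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // send a HEAD request
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    curl_setopt($ch, CURLOPT_PRIVATE, (string) $index); // tag travels with the handle
    return $ch;
}

// Hypothetical helper: check all URLs in parallel and return one report row per index.
// Assumes the same input shape as $listofurls above: the URL is element 0 of each entry.
function check_links(array $listofurls) {
    $master = curl_multi_init();
    foreach ($listofurls as $i => $entry) {
        curl_multi_add_handle($master, make_tagged_handle($entry[0], $i));
    }

    $report = array();
    do {
        curl_multi_exec($master, $running);
        // Drain all completed transfers before continuing.
        while ($info = curl_multi_info_read($master)) {
            $ch = $info['handle'];
            $i = (int) curl_getinfo($ch, CURLINFO_PRIVATE);
            $code = curl_getinfo($ch, CURLINFO_HTTP_CODE); // 0 if no HTTP response arrived
            $report[$i] = array(
                'url'      => $listofurls[$i][0],
                'code'     => $code,
                'redirect' => curl_getinfo($ch, CURLINFO_REDIRECT_URL),
                'ok'       => in_array($code, array(200, 301, 302), true),
            );
            curl_multi_remove_handle($master, $ch);
        }
        if ($running > 0) {
            curl_multi_select($master, 1.0); // block until there is activity or 1 s passes
        }
    } while ($running > 0);

    curl_multi_close($master);
    ksort($report); // restore the original list order
    return $report;
}
```

Calling check_links($listofurls) would then give one row per URL, keyed by its position in the list, so no separate bookkeeping of the completion order is needed. For a list of around 1000 links it may be worth adding handles in smaller batches, which this sketch does not do.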