Having Trouble Using CURL and PHP to Get Google Search Results Through a Proxy
Problem description
This script works fine when fetching google.com, but not google.com/search?q=test. Without CURLOPT_FOLLOWLOCATION, I get a 302 Moved. With it, I get a page asking me to enter a CAPTCHA. I've tried several different U.S.-based proxies and have varied the user agent string. Is there something I'm missing here?
function my_fetch($url, $proxy, $user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.8) Gecko/2009032609 Firefox/3.0.8')
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);            // send the request through the proxy (host:port)
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_HEADER, false);            // leave response headers out of the returned body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the response instead of printing it
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow the 302 redirect
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);              // give up after 20 seconds
    $result = curl_exec($ch);                           // false on failure
    curl_close($ch);
    return $result;
}

$url = 'http://www.google.com/search?q=test';
$proxy = '152.26.53.4:80';
echo my_fetch($url, $proxy);
Please don't respond with suggestions to use the API instead. The API is not sufficient for my needs.
Recommended answer
You can try doing this with PhantomJS:
var page = require("webpage").create();
var homePage = "http://www.google.com/";

page.open(homePage);

// Runs each time a page finishes loading.
page.onLoadFinished = function(status) {
    var url = page.url;
    console.log("Status: " + status);
    console.log("Loaded: " + url);

    // Inject jQuery, then fill in and submit the search form inside the page.
    page.includeJs("http://code.jquery.com/jquery-1.8.3.min.js", function() {
        console.log("Loaded jQuery!");
        page.evaluate(function() {
            var searchBox = $(".lst");
            var searchForm = $("form");
            searchBox.val("your query");
            searchForm.submit();
        });
    });

    // Capture a screenshot after a short delay, then quit.
    window.setTimeout(
        function () {
            page.render('google.png');
            phantom.exit(0);
        },
        1000 // wait 1,000 ms (1 s); increase if the results have not loaded yet
    );
};
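To check whether the 302 described in the question is a redirect to the CAPTCHA page rather than an ordinary redirect, a small helper along these lines may be useful. It is only a sketch and not part of the original answer: the function name `looksLikeCaptchaRedirect` is hypothetical, and it assumes that the CAPTCHA interstitial is served from a path beginning with `/sorry`, which may not hold for every Google property or over time.

```javascript
// Hypothetical helper (not from the original answer): decide whether a
// response looks like a redirect to Google's CAPTCHA interstitial.
// Assumption: the interstitial is served from a path starting with "/sorry".
function looksLikeCaptchaRedirect(status, location) {
    var isRedirect = status >= 300 && status < 400;
    if (!isRedirect || typeof location !== "string") {
        return false;
    }
    // Drop the scheme and host (if present) so only the path and query remain.
    var path = location.replace(/^[a-z]+:\/\/[^\/]+/i, "");
    return path.indexOf("/sorry") === 0;
}
```

In a PhantomJS script, a check like this could presumably be called from a `page.onResourceReceived` handler with the response's `status` and `redirectURL` fields, so the script can log the block and exit instead of rendering a CAPTCHA screenshot.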