Prevent cURL requests from my website
Question
I have a website with a large database of products and prices.
My prices are constantly being scraped with cURL.
I thought of preventing it with a <noscript>
tag, but all that does is hide the content; bots would still be able to scrape it.
Is there a way to run a JS test to see whether JS is disabled (to detect bots) and redirect those requests, perhaps to a blacklist?
Would doing so block Google from crawling my website?
Recommended answer
You would need to create a block list and block those IPs from accessing the content. All headers, including the referrer and user agent, can be set in cURL very easily, as the following simple code shows:
$agent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)';

$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $agent);                   // spoof the user agent
curl_setopt($ch, CURLOPT_URL, 'http://www.yoursite.com?data=anydata');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);                   // return the response instead of printing it
curl_setopt($ch, CURLOPT_REFERER, 'http://www.yoursite.com');  // spoof the referrer
$html = curl_exec($ch);
curl_close($ch);
The above will make the cURL request look like a normal connection from a browser (note that the user-agent string shown is actually Internet Explorer 6, not Firefox). This is exactly why header checks alone cannot reliably distinguish bots from real visitors.
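Since spoofed headers cannot be trusted, the blocking itself has to happen server-side against the client IP. A minimal sketch of such a check in PHP (the file name blacklist.txt and the helper is_blocked are assumptions for illustration, not part of the original answer):

```php
<?php
// Hypothetical helper: returns true when the client IP appears in the
// blocklist (a plain array of IP strings, e.g. loaded from blacklist.txt).
function is_blocked(string $ip, array $blocklist): bool
{
    return in_array($ip, $blocklist, true);
}

// Usage at the top of a page script (blacklist.txt is one IP per line):
// $blocklist = file('blacklist.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
// if (is_blocked($_SERVER['REMOTE_ADDR'], $blocklist)) {
//     http_response_code(403);
//     exit('Access denied.');
// }
```

Note that a blunt IP blocklist can also block legitimate crawlers such as Googlebot, so entries should be added only for IPs confirmed to be scrapers.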