HEAD request receives "403 Forbidden" while GET receives "200 OK"?


Question

After several months of the site being absent from the search results of every major search engine, I finally found a possible reason.

I used WebBug to inspect the server's response headers. Note the difference between a HEAD request and a GET request.

HEAD sent data:

HEAD / HTTP/1.1
Host: www.attu.it
Connection: close
Accept: */*
User-Agent: WebBug/5.0

HEAD received data:

HTTP/1.1 403 Forbidden
Date: Tue, 10 Aug 2010 23:01:00 GMT
Server: Apache/2.2
Connection: close
Content-Type: text/html; charset=iso-8859-1

GET sent data:

GET / HTTP/1.1
Host: www.attu.it
Connection: close
Accept: */*
User-Agent: WebBug/5.0

GET received data:

HTTP/1.1 200 OK
Date: Tue, 10 Aug 2010 23:06:15 GMT
Server: Apache/2.2
Last-Modified: Fri, 08 Jan 2010 08:58:01 GMT
ETag: "671f91b-2d2-47ca362815840"
Accept-Ranges: bytes
Content-Length: 722
Connection: close
Content-Type: text/html

// HTML code here
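
For anyone without WebBug, the same comparison can be sketched in Python. This is only a minimal sketch, not what WebBug does internally: the function name `compare_methods` and its parameters are made up for illustration, and it relies solely on the standard-library `http.client` module. The request headers mirror the ones in the dumps above.

```python
import http.client


def compare_methods(host, path="/", port=80, user_agent="WebBug/5.0", timeout=10):
    """Send a HEAD and then a GET to the same URL and return both status codes.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    statuses = {}
    for method in ("HEAD", "GET"):
        # A fresh connection per request, matching "Connection: close" in the dumps.
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request(method, path, headers={
                "Accept": "*/*",
                "User-Agent": user_agent,
                "Connection": "close",
            })
            response = conn.getresponse()
            response.read()  # drain the body (always empty for HEAD)
            statuses[method] = response.status
        finally:
            conn.close()
    return statuses
```

Calling `compare_methods("www.attu.it")` returns a dictionary with one status code per method, so a mismatch such as the one shown in the dumps is visible at a glance.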

Now, browsers send a GET request by default (at least that is what Firebug reports). Is it possible that crawlers send a HEAD request instead? If so, why does only this server respond with a 403, while the servers of other sites I maintain do not?

In case it matters, the only line in .htaccess is the following (unless my client has changed it, as they don't want to give me access to their server):

AddType text/x-component .htc

UPDATE
Thanks @Ryk. FireBug and Fiddler both send GET requests, which get 200 (or 300-range) responses, as expected. So I guess it is either a bad server setting (which would be strange, since the hosting is from a major company with millions of clients) or something they put in the .htaccess. They will have to let me look into their account.

The second part of my question is whether this could be the reason the website does not appear in any search engine (site:www.attu.it gives no results). Any thoughts?

UPDATE 2
After some fiddling around, it turned out that the phpMyAdmin robots-blocking .htaccess was sitting in the root directory, which caused any request from a robot to be answered with a 403 Forbidden.
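
The file itself is not reproduced in this thread, so the following is only a guess at how rules of that kind behave, modelled in Python rather than Apache syntax. Everything in it is an assumption: the function name `status_for`, the short crawler list, and in particular the method whitelist, which is included only because it would explain why HEAD was refused while GET was not.

```python
import re

# ASSUMPTION: only these methods are let through; anything else (HEAD included) is refused.
ALLOWED_METHODS = {"GET", "POST"}

# ASSUMPTION: a short, illustrative crawler list; the real file's patterns are unknown here.
BLOCKED_AGENTS = re.compile(r"googlebot|msnbot|slurp|baiduspider", re.IGNORECASE)


def status_for(method, user_agent):
    """Return the status a robots-blocking rule set of this shape would produce."""
    if method.upper() not in ALLOWED_METHODS:
        return 403  # method is not on the whitelist
    if BLOCKED_AGENTS.search(user_agent):
        return 403  # User-Agent looks like a known crawler
    return 200
```

Under these assumed rules, a HEAD request from WebBug is refused while its GET passes, and a GET whose User-Agent contains Googlebot is refused as well, which would be consistent with both the header dumps and the missing search results.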

Recommended answer

I would suggest installing Fiddler and looking carefully at the requests. I have sometimes seen an icon on the page, located in a folder that requires authentication, cause a 403 to be returned.

Fiddler will give you a good idea of what is going on. You can also try Firefox with the FireBug add-on and inspect the page for errors.

Looking at the site, I get a bunch of 404s for favicon.ico. Apart from that, a simple GET request returns 200 OK, but a HEAD request returns 403 for me as well. Looking into it now.

UPDATE: I think it might be a configuration setting on the Apache server, but I am not 100% sure. http://hc.apache.org/httpclient-3.x/methods/head.html

UPDATE2: Reading http://www.pubbs.net/200811/httpd/17210-usershttpd-how-to-reject-head-request.html makes me believe that your Apache server could be configured to reject HEAD requests. In that case it will return a 403.
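
To see what such a setup looks like from the client side without touching the real server, here is a small local stand-in built on Python's standard library. It is not Apache and not the site's actual configuration; `RejectHeadHandler`, `probe`, and `demo` are hypothetical names used only for this sketch.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RejectHeadHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a server that serves GET but refuses HEAD."""

    def do_GET(self):
        body = b"<html>ok</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_HEAD(self):
        # Refuse the method outright, as a HEAD-rejecting configuration would.
        self.send_response(403)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def probe(method, port):
    """Send one request to the local stand-in and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, "/")
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo():
    """Start the stand-in on a free port and return (HEAD status, GET status)."""
    server = HTTPServer(("127.0.0.1", 0), RejectHeadHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return probe("HEAD", port), probe("GET", port)
    finally:
        server.shutdown()
        server.server_close()
```

Running `demo()` yields the same split as in the question: the HEAD request is refused while the GET succeeds.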
