Sites not accepting wget user agent header
Question
When I run this command:
wget --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" http://yahoo.com
...I get this result (with nothing else in the file):
<!-- hw147.fp.gq1.yahoo.com uncompressed/chunked Wed Jun 19 03:42:44 UTC 2013 -->
But when I run wget http://yahoo.com with no --user-agent option, I get the full page.
The user agent is the same header that my current browser sends. Why does this happen? Is there a way to make sure the user agent doesn't get blocked when using wget?
Recommended answer
It seems the Yahoo server applies some heuristic based on the User-Agent header when the Accept header is set to */*.
Setting Accept: text/html did the trick for me. For example:
wget --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" http://yahoo.com
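If you fetch pages this way often, the command above can be wrapped in a small shell function. The following is only a sketch: the function name fetch_as_browser, the WGET_BIN override, and the default output file name are assumptions made for illustration, not part of wget or of the original answer.

```shell
# Hypothetical helper: fetch a URL with a browser-like User-Agent and an
# explicit Accept header, so wget does not fall back to "Accept: */*".
# WGET_BIN is an assumption of this sketch; it lets you point at another
# wget binary. It defaults to plain "wget".
fetch_as_browser() {
    url=$1
    out=${2:-index.html}   # output file; the default name is an assumption
    ua='Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0'

    "${WGET_BIN:-wget}" \
        --header='Accept: text/html' \
        --user-agent="$ua" \
        -O "$out" \
        "$url"
}
```

Usage would look like: fetch_as_browser http://yahoo.com yahoo.html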
Note: if you don't declare an Accept header, wget automatically adds Accept: */*, which means "give me anything you have".
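To check which headers wget actually sends, its debug output can be filtered down to the request block. This is a sketch under two assumptions: your wget build includes debug support (the -d / --debug option), and it prints the "---request begin---" and "---request end---" markers around the request, as GNU wget 1.x does. The function name show_request_headers is made up for this example.

```shell
# Hypothetical filter: read wget debug output on stdin and print only the
# lines between the request markers, i.e. the HTTP request wget sent.
show_request_headers() {
    sed -n '/---request begin---/,/---request end---/p'
}
```

Usage would look like: wget -d -O /dev/null http://yahoo.com 2>&1 | show_request_headers

Running it once with and once without --header="Accept: text/html" should show whether the Accept line changes from */* to text/html.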