Python urllib freezes with specific URL
Question
I am trying to fetch a page, but urlopen hangs and never returns anything, even though the page is very light and opens in any browser without problems.
import urllib.request

with urllib.request.urlopen("http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm") as response:
    print(response.read())
This simple code just freezes while retrieving the response, but if you open http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm in a browser, it loads without any problem.
Recommended answer
www.planalto.gov.br appears to be using user-agent detection. If you send a browser-like User-Agent header, the request completes correctly. The urllib library didn't crash; it is just waiting for a response that never arrives.
curl -H "User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm
worked just fine for me, but
curl http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm
did not.
As RPGillespie said above, use urllib.request (urllib2 on Python 2) or requests to add the User-Agent header (see "How do I set headers using python's urllib?" for more information).
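A minimal sketch of that fix with the standard library, assuming the server accepts the same browser User-Agent string as the curl command above. The names build_request, fetch, and BROWSER_USER_AGENT are hypothetical helpers introduced for illustration, and the 10-second timeout is an arbitrary choice so that a stalled request raises an error instead of hanging forever.

```python
import urllib.request

# Assumption: the same browser User-Agent string used in the curl example
# above is accepted by the server. Any common browser string may also work.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/57.0.2987.133 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Return a Request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10.0):
    """Fetch the URL and return the raw response body as bytes.

    The timeout (in seconds, an arbitrary value here) applies to the
    underlying socket, so a silent server leads to an exception rather
    than an indefinite wait.
    """
    request = build_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch("http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm") should then return the page bytes if the user-agent explanation holds. If the server still does not answer, expect a urllib.error.URLError or a timeout error after about ten seconds rather than a frozen program.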