Python urllib freezes with specific URL


Problem description

I am trying to fetch a page, but urlopen hangs and never returns anything, even though the web page is very light and opens in any browser without problems.

import urllib.request
with urllib.request.urlopen("http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm") as response:
    print(response.read())

This simple code freezes while retrieving the response, yet the same URL, http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm, opens in a browser without any problem.

Recommended answer

www.planalto.gov.br is using user-agent detection. If you send a valid, browser-like User-Agent header, the request completes correctly. The urllib library did not crash; it is simply waiting for a response.

curl -H "User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36" http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm

worked just fine for me, but

curl http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm

did not.
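Since urllib is waiting rather than crashing, passing a timeout to urlopen turns the silent hang into an error you can handle. The following is a minimal sketch, not part of the original answer; the helper name fetch_with_timeout and the 10-second default are assumptions chosen for illustration.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_timeout(url, timeout=10.0):
    """Return the response body as bytes, or None if the request fails.

    fetch_with_timeout is a hypothetical helper name; the timeout value
    is an arbitrary choice, not something the original answer specifies.
    """
    try:
        # timeout applies to blocking socket operations (connect, read),
        # so a server that never answers raises instead of hanging forever.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (socket.timeout, urllib.error.URLError) as exc:
        print(f"request failed or timed out: {exc}")
        return None


# Example call (performs a real network request):
# body = fetch_with_timeout(
#     "http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm"
# )
```

A timeout does not make the server respond; it only keeps the script from blocking indefinitely while you diagnose the cause.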

As RPGillespie said above, add the User-Agent header to the request, using urllib.request.Request (urllib2 in Python 2) or requests (see "How do I set headers using python's urllib?" for more information about that).
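For the Python 3 code in the question, the header can be set by wrapping the URL in a urllib.request.Request. By default urllib identifies itself as Python-urllib/<version>, which is presumably what this server ignores. The sketch below reuses the User-Agent string from the curl command above; the names BROWSER_UA, build_request, and fetch are hypothetical, and the timeout is an added assumption.

```python
import urllib.request

URL = "http://www.planalto.gov.br/ccivil_03/_Ato2007-2010/2008/Lei/L11882.htm"

# Copied from the curl command in the answer; any common browser
# string would presumably work just as well (assumption).
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/57.0.2987.133 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Build a Request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, timeout=10.0):
    """Fetch url with the custom header; timeout is an illustrative default."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()


# Example call (performs a real network request):
# print(fetch(URL))
```

Keeping the request construction in its own function makes it easy to check the header without touching the network.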
