Urlopen [Errno -2] Python

Question

I have developed a piece of code that I use for web scraping:

import urllib2

# div is presumably a BeautifulSoup tag found earlier in the script (not shown)
link = 'http://www.cmegroup.com' + div.findAll('a')[3]['href']
user_agent = 'Mozilla/5.0'
headers = {'User-Agent': user_agent}
req = urllib2.Request(link, headers=headers)
page = urllib2.urlopen(req).read()

What I don't understand is that requesting a link sometimes raises an error and sometimes doesn't. For example, the error:

urllib2.URLError: <urlopen error [Errno -2] Name or service not known>

comes up for this link:

http://www.cmegroup.com/trading/energy/refined-products/mini-european-naphtha-platts-cif-nwe-swap-futures_product_calendar_futures.html

When I re-run the code, I no longer get the error for this link, but I get it for some other link instead. Could this be due to a wireless connection?

Recommended answer

This looks like a DNS or network problem. If you run the same code for the same URL several times and it sometimes works and sometimes doesn't, the problem is probably not in your code.
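
"[Errno -2] Name or service not known" is the resolver's error for a hostname that could not be looked up, which is why the failure points at DNS rather than at the scraping code. A minimal sketch for checking name resolution on its own, separate from the HTTP request, could look like the following. It assumes Python 3 and uses only the standard socket module; can_resolve is a hypothetical helper name, not part of the original code.

```python
import socket


def can_resolve(hostname, port=80):
    """Return True if hostname resolves right now, False on a DNS failure."""
    try:
        # getaddrinfo performs the same lookup that urlopen needs
        # before it can open a connection.
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # Raised for resolver errors such as "Name or service not known".
        return False
    return True
```

Calling can_resolve('www.cmegroup.com') repeatedly while the scraper runs would show whether lookups fail intermittently on the current connection.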

To debug the issue, you could wrap the statement in a try-except block and start pdb (or ipdb, if installed) from there:

try:
    response = urllib2.urlopen(req)
except urllib2.URLError as ex:
    import pdb; pdb.set_trace()  # Use ipdb if installed
else:
    page = response.read()

Then you can inspect the request, the exception and its reason attribute, the traceback, and so on. (When name resolution fails, there is no response or status code to look at.)
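
If debugging confirms that the failures are transient, one common workaround is to retry the request a few times before giving up. The sketch below is one possible way to do that, not something the original answer prescribes. It assumes Python 3, where urllib.request and urllib.error replace urllib2; fetch_with_retries and its attempts, delay, and opener parameters are hypothetical names chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retries(url, headers=None, attempts=3, delay=1.0,
                       opener=urllib.request.urlopen):
    """Fetch url, retrying on URLError (for example, a failed DNS lookup)."""
    req = urllib.request.Request(url, headers=headers or {})
    last_error = None
    for attempt in range(attempts):
        try:
            return opener(req).read()
        except urllib.error.HTTPError:
            # The server answered (404, 500, ...), so this is not a
            # name-resolution problem; do not retry it here.
            raise
        except urllib.error.URLError as ex:
            last_error = ex
            if attempt < attempts - 1:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * (attempt + 1))
    raise last_error
```

The opener parameter exists only so the function can be exercised without network access; in normal use it can be left at its default.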

(As a side note, if external dependencies are not a problem, I'd strongly recommend using the requests package instead of urllib2.)
