Handling exceptions from urllib2 and mechanize in Python
Problem description
I am a novice at exception handling. I am using the mechanize module to scrape several websites. My program fails frequently because the connection is slow and requests time out. I would like to retry a website (on a timeout, for instance) up to 5 times, with a 30-second delay between tries.
I looked at this Stack Overflow answer and can see how to handle the various exceptions. I can also see (although it looks very clumsy) how to put the try/except inside a while loop to control the 5 attempts, but I do not understand how to break out of the loop, or "continue", when the connection succeeds and no exception has been thrown.
from mechanize import Browser
import time

b = Browser()
tried=0
while tried < 5:
    try:
        r=b.open('http://www.google.com/foobar')
    except (mechanize.HTTPError,mechanize.URLError) as e:
        if isinstance(e,mechanize.HTTPError):
            print e.code
            tried += 1
            sleep(30)
            if tried > 4:
                exit()
        else:
            print e.reason.args
            tried += 1
            sleep(30)
            if tried > 4:
                exit()

print "How can I get to here after the first successful b.open() attempt????"
I would appreciate advice about (1) how to break out of the loop on a successful open, and (2) how to make the whole block less clumsy/more elegant.
You don't have to repeat the steps that are the same in both branches of the except block.
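import sys
import time

import mechanize  # imported as a module so mechanize.HTTPError resolves

b = mechanize.Browser()
tried = 0
while True:
    try:
        r = b.open('http://www.google.com/foobar')
    except (mechanize.HTTPError, mechanize.URLError) as e:
        tried += 1
        # HTTPError carries a status code; URLError carries a reason.
        if isinstance(e, mechanize.HTTPError):
            print(e.code)
        else:
            print(e.reason.args)
        if tried > 4:
            sys.exit(1)  # give up after five failed attempts
        time.sleep(30)
        continue  # go back to the top of the loop and retry
    break  # reached only when b.open() raised no exception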
Also, you may be able to use "while not r:" as the loop condition, depending on what Browser.open returns.
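A minimal sketch of that "while not r" variant follows. It assumes the open call returns a truthy object on success, which the answer itself leaves open. The names open_until_truthy, open_func, max_tries, delay and sleep are hypothetical, and IOError stands in for the mechanize exceptions so the sketch runs without mechanize installed.

```python
import time


def open_until_truthy(open_func, max_tries=5, delay=30, sleep=time.sleep):
    """Call open_func() until it returns something truthy.

    Hypothetical helper: open_func stands in for a call such as
    lambda: b.open(url), and IOError stands in for the mechanize errors.
    """
    r = None  # must start out falsy so the loop body runs at least once
    tried = 0
    while not r and tried < max_tries:
        tried += 1
        try:
            r = open_func()
        except IOError as e:
            print("attempt %d failed: %r" % (tried, e))
            if tried == max_tries:
                raise  # out of attempts: let the caller see the error
            sleep(delay)
    return r
```

The attempt counter is incremented on every call rather than only on failures, so the loop stays bounded even if open_func returns a falsy value without raising.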
Edit: roadierich showed a more elegant way:

try:
    doSomething()
    break
except:
    ...

This works because an error raised in doSomething() skips the break and jumps straight to the except block.
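The same idea can be wrapped in a small reusable function, where return plays the role of break. This is only a sketch under stated assumptions: call_with_retries and its parameters are hypothetical names, and the default errors tuple uses IOError as a stand-in; with mechanize one would presumably pass b.open as func and (mechanize.HTTPError, mechanize.URLError) as errors.

```python
import time


def call_with_retries(func, arg, attempts=5, delay=30,
                      errors=(IOError,), sleep=time.sleep):
    """Call func(arg), retrying when one of `errors` is raised.

    Hypothetical helper, not part of mechanize. Possible use:
        r = call_with_retries(b.open, 'http://www.google.com/foobar',
                              errors=(mechanize.HTTPError,
                                      mechanize.URLError))
    """
    for attempt in range(1, attempts + 1):
        try:
            return func(arg)  # success: leave the loop immediately
        except errors as e:
            print("attempt %d failed: %r" % (attempt, e))
            if attempt == attempts:
                raise  # last attempt failed: re-raise to the caller
            sleep(delay)  # wait before the next attempt
```

Re-raising on the final attempt, instead of calling exit(), leaves the decision about what to do after five failures to the calling code.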