Handling exceptions from urllib2 and mechanize in Python

Views: 203  Posted: 2017/10/1 15:50:07  Tags: python exception-handling mechanize-python

Problem description

I am a novice at exception handling. I am using the mechanize module to scrape several websites. My program fails frequently because the connection is slow and requests time out. I would like to retry a website (on a timeout, for instance) up to 5 times, with a 30-second delay between attempts.

I looked at this Stack Overflow answer and can see how to handle various exceptions. I can also see (although it looks very clumsy) how to put the try/except inside a while loop to control the 5 attempts, but I do not understand how to break out of the loop, or "continue", when the connection succeeds and no exception has been thrown.

import time

import mechanize

b = mechanize.Browser()
tried = 0
while tried < 5:
  try:
    r = b.open('http://www.google.com/foobar')
  except (mechanize.HTTPError, mechanize.URLError) as e:
    if isinstance(e, mechanize.HTTPError):
      print(e.code)
      tried += 1
      time.sleep(30)
      if tried > 4:
        exit()
    else:
      print(e.reason.args)
      tried += 1
      time.sleep(30)
      if tried > 4:
        exit()

print("How can I get to here after the first successful b.open() attempt????")

I would appreciate advice about (1) how to break out of the loop on a successful open, and (2) how to make the whole block less clumsy/more elegant.

Solution

You don't have to repeat, in each branch of the except block, the steps that are the same in either case.

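import sys
import time

import mechanize

b = mechanize.Browser()
tried = 0
while True:
  try:
    r = b.open('http://www.google.com/foobar')
  except (mechanize.HTTPError, mechanize.URLError) as e:
    tried += 1  # count the failure once, whichever error it was
    if isinstance(e, mechanize.HTTPError):
      print(e.code)
    else:
      print(e.reason.args)
    if tried > 4:
      sys.exit(1)  # give up after 5 failed attempts
    time.sleep(30)  # wait before retrying
    continue
  break  # reached only when b.open() succeeded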
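
The same loop could also be packaged as a small reusable function. The following is only a sketch, not part of the original answer. It is written for Python 3 and uses the standard library's urllib.error exceptions (the successors of urllib2's) so that it runs without mechanize installed. The names open_with_retry, opener, max_tries, delay, and the injectable sleep parameter are assumptions made for illustration.

```python
import time
import urllib.error


def open_with_retry(opener, url, max_tries=5, delay=30, sleep=time.sleep):
    """Call opener(url), retrying on URLError/HTTPError up to max_tries times.

    `opener` is any callable (hypothetical here). For mechanize it could be
    Browser().open, assuming the errors it raises derive from URLError.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return opener(url)  # success: leave the loop immediately
        except urllib.error.URLError as e:
            # In urllib.error, HTTPError is a subclass of URLError and
            # additionally carries an HTTP status code.
            if isinstance(e, urllib.error.HTTPError):
                detail = e.code
            else:
                detail = e.reason
            print("attempt %d failed: %s" % (attempt, detail))
            if attempt == max_tries:
                raise  # out of attempts: let the caller decide what to do
            sleep(delay)


# Usage with a fake opener that times out twice before succeeding.
calls = []


def flaky_opener(url):
    calls.append(url)
    if len(calls) < 3:
        raise urllib.error.URLError("timed out")
    return "response for " + url


# A no-op sleep keeps the demonstration from actually waiting.
result = open_with_retry(flaky_opener, "http://example.com/",
                         sleep=lambda seconds: None)
```

Re-raising the last error instead of exiting leaves the decision to the caller, which is a design choice of this sketch rather than something the answer prescribes.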

Also, you may be able to use while not r: depending on what Browser.open returns.
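
A sketch of the while not r: variant follows. It assumes that a successful open returns a truthy response object, which the answer leaves open for Browser.open. The fake_open function is a hypothetical stand-in for b.open, and fetch with its parameters is likewise made up for illustration. The code is written for Python 3 with the standard library's urllib.error in place of mechanize.

```python
import time
import urllib.error

attempts_seen = []


def fake_open(url):
    """Stand-in for b.open (hypothetical): fails once, then succeeds."""
    attempts_seen.append(url)
    if len(attempts_seen) == 1:
        raise urllib.error.URLError("timed out")
    return "<response>"


def fetch(url, max_tries=5, delay=30, sleep=time.sleep):
    r = None      # r must exist before the loop condition is first evaluated
    tried = 0
    while not r:  # the loop ends as soon as r holds a truthy response
        try:
            r = fake_open(url)
        except urllib.error.URLError as e:
            tried += 1
            print(e.reason)
            if tried >= max_tries:
                raise  # give up and surface the last error
            sleep(delay)
    return r


# A no-op sleep keeps the demonstration from actually waiting.
response = fetch("http://example.com/", sleep=lambda seconds: None)
```

If a successful call could return a falsy value, this loop would never terminate, so the explicit break shown above is the safer choice when the return value is not known.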

Edit: roadierich showed a more elegant way with

try:
  doSomething()
  break
except:
  ...

This works because an error raised by doSomething() jumps straight to the except block, so break is reached only when the call succeeds.
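
Placed inside a bounded loop, the pattern might look like the sketch below. The do_something function and its failure behavior are invented for illustration, as are run and its parameters. The else clause on the for loop, a standard Python feature, runs only when the loop finishes without reaching break, which makes it a natural place to handle running out of attempts.

```python
import time

# Scripted outcomes for the demonstration: two failures, then a success.
outcomes = iter([OSError("slow connection"), OSError("timed out"), "page body"])


def do_something():
    """Hypothetical operation that fails twice and then succeeds."""
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


def run(max_tries=5, delay=30, sleep=time.sleep):
    for attempt in range(max_tries):
        try:
            result = do_something()
            break  # reached only if do_something() did not raise
        except OSError as e:  # catch specific errors rather than a bare except
            print("attempt %d failed: %s" % (attempt + 1, e))
            if attempt < max_tries - 1:
                sleep(delay)  # wait only if another attempt will follow
    else:
        # The loop used up every attempt without ever reaching break.
        raise RuntimeError("giving up after %d attempts" % max_tries)
    return result


# A no-op sleep keeps the demonstration from actually waiting.
page = run(sleep=lambda seconds: None)
```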
