Timing out urllib2 urlopen operation in Python 2.4

Question

I've just inherited some Python code and need to fix a bug as soon as possible. I have very little Python knowledge so please excuse my ignorance. I am using urllib2 to extract data from web pages. Despite using socket.setdefaulttimeout(30) I am still coming across URLs that hang seemingly indefinitely.

I want to time out the extraction, and after much searching of the web I have got this far:

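import socket
import urllib2
from threading import Timer

socket.setdefaulttimeout(30)

# urltocollect is the URL string to fetch (defined elsewhere)
reqdata = urllib2.Request(urltocollect)

def handler(reqdata):
    # ????  reqdata.close() ????
    # Unknown: what to call here to abort the pending urlopen
    pass

t = Timer(5.0, handler, [reqdata])
t.start()
urldata = urllib2.urlopen(reqdata)
t.cancel()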

The handler function triggers after the time has passed, but I don't know how to get it to stop the urlopen operation.

Any guidance would be gratefully received. C

UPDATE ------------------------- In my experience, urllib2.urlopen hangs and waits indefinitely on certain URLs. These are URLs that never finish loading in a browser either: the browser just waits with the activity indicator moving but never fully connects. I suspect that these URLs may be stuck in some kind of infinite redirect loop. Neither the timeout argument to urlopen (available in later versions of Python) nor the global socket.setdefaulttimeout() setting detects this issue on my system.

I tried a number of solutions, but in the end I upgraded to Python 2.7 and used a variation of Werner's answer below. Thanks, Werner.

Recommended answer

You can achieve this using signals.

Here's an example of my signal decorator that you can use to set the timeout for individual functions.

P.S. I'm not sure whether this is syntactically correct for 2.4. I'm using 2.6, but 2.4 does support signals.

import signal
import time

class TimeOutException(Exception):
    pass

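def timeout(seconds, *args, **kwargs):
    def fn(f):
        def wrapped_fn(*args, **kwargs):
            # Arm SIGALRM to fire after `seconds` (Unix only, main thread only)
            signal.signal(signal.SIGALRM, handler)
            signal.alarm(seconds)
            try:
                # Pass the wrapped function's result back to the caller
                return f(*args, **kwargs)
            finally:
                # Cancel the pending alarm so it cannot fire after f returns
                signal.alarm(0)
        return wrapped_fn
    return fn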

def handler(signum, frame):
    raise TimeOutException("Timeout")

@timeout(5)
def my_function_that_takes_long(time_to_sleep):
    time.sleep(time_to_sleep)

if __name__ == '__main__':
    print 'Calling function that takes 2 seconds'
    try:
        my_function_that_takes_long(2)
    except TimeOutException:
        print 'Timed out'

    print 'Calling function that takes 10 seconds'
    try:
        my_function_that_takes_long(10)
    except TimeOutException:
        print 'Timed out'
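
The question mentions using a variation of this answer but does not show it. As a rough sketch only, the same SIGALRM approach could be applied to the urlopen call along the following lines. The helper names call_with_timeout, fetch and fetch_with_timeout are hypothetical and not from the original post, and the import fallback is only there so the sketch also runs on Python 3. It assumes a Unix system and that the call is made from the main thread, since signal.alarm is Unix-only and Python runs signal handlers in the main thread.

```python
import signal
import time

try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption, for portability)


class TimeOutException(Exception):
    pass


def _raise_timeout(signum, frame):
    # Runs in the main thread when SIGALRM is delivered
    raise TimeOutException("Timeout")


def call_with_timeout(seconds, func, *args, **kwargs):
    """Run func(*args, **kwargs); raise TimeOutException after `seconds`."""
    old_handler = signal.signal(signal.SIGALRM, _raise_timeout)
    signal.alarm(seconds)
    try:
        return func(*args, **kwargs)
    finally:
        signal.alarm(0)                               # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)    # restore the previous handler


def fetch(url):
    # Open and read inside the timed call, since the read can hang as well
    response = urllib2.urlopen(url)
    try:
        return response.read()
    finally:
        response.close()


def fetch_with_timeout(url, seconds=5):
    """Return the page body, or None if the fetch took longer than `seconds`."""
    try:
        return call_with_timeout(seconds, fetch, url)
    except TimeOutException:
        return None
```

A call such as fetch_with_timeout(urltocollect, 30) would then return None for a URL that hangs, instead of blocking forever. One caveat: Python only runs the signal handler once control returns to the interpreter, so a call that is blocked inside C code may delay the timeout.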
