Reusing httplib.HTTPConnection in Python 2.7


Question

I recently inherited a Python project that I now maintain. Part of the code makes a few hundred thousand requests to a website and saves the results to a database. The code reuses the same httplib.HTTPConnection object for each request and simply loops over a

conn.request("GET",someString,'',headers)

response = conn.getresponse()

section. A few days ago I saw in my logs that one of the requests raised this exception:

[Errno 104] Connection reset by peer  

followed by every subsequent conn.request() failing. My first inclination was to build a new connection for each request, but the performance impact of that was severe. So my question is: how do I fix this, especially since I'm not sure how I can even test it?

If I just call conn.connect() after an exception, will it reconnect correctly?

I'm looking for advice on how to fix it and, if possible, how to test the fix.

Thanks for your time.

Recommended answer

I think you first need to decide which failure mode you want to handle. For instance, did the connection reset because of a temporary resource problem on the server, so that a quick reconnect will fix it? Or is the server down or rebooting, in which case you should abort your process?

Presuming the first case, I think you are thinking along the right lines. Try something like this (note that this is not working code, just an example of the logic):

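import httplib
import socket

while True:
    try:
        conn.request("GET", someString, '', headers)
        response = conn.getresponse()
    except (httplib.HTTPException, socket.error):
        # A reset raises socket.error, which is not an HTTPException, so catch both.
        conn.close()    # drop the dead socket and reset the connection state
        conn.connect()  # open a fresh socket to the same host
        continue
    break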

You should probably add some logic to pause between repeated connection attempts and to give up after a certain number of tries (which is basically the second scenario above).
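The pause-and-give-up logic could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name request_with_retry and its max_attempts, base_delay and sleep parameters are made up for this example, and the delay schedule (doubling each time) is one common choice rather than anything the original code prescribes. The import fallback is there only so the same sketch also runs on Python 3, where httplib is called http.client.

```python
import socket
import time

try:
    import httplib                  # Python 2.7, as in the question
except ImportError:
    import http.client as httplib   # same module under its Python 3 name


def request_with_retry(conn, method, url, headers,
                       max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Send one request on a reused connection, reconnecting on failure.

    max_attempts, base_delay and sleep are hypothetical knobs for this sketch.
    Returns a (status, body) tuple.
    """
    for attempt in range(max_attempts):
        try:
            conn.request(method, url, None, headers)
            response = conn.getresponse()
            # Read the whole body so the connection can be reused afterwards.
            return response.status, response.read()
        except (httplib.HTTPException, socket.error):
            # Drop the dead socket; the next request() reopens it automatically.
            conn.close()
            if attempt == max_attempts - 1:
                raise  # out of attempts: treat the server as down
            sleep(base_delay * 2 ** attempt)  # wait 0.5s, 1s, 2s, ...
```

In the original loop, the two lines conn.request(...) and conn.getresponse() would be replaced by a single call such as request_with_retry(conn, "GET", someString, headers). Passing sleep as a parameter is purely a testing convenience, so a test can record the delays instead of actually waiting.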

To test this, try using tcpkill to force the TCP connection to reset:

http://www.gnutoolbox.com/tcpkill-command/
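If tcpkill is not available, a local stand-in server that resets every connection may be enough to exercise the reconnect path. The sketch below is an assumption-heavy illustration, not part of the original answer: start_reset_server is a made-up helper name, and the SO_LINGER trick (linger on, timeout 0, so that close() sends a TCP RST instead of a normal FIN) uses the struct layout found on Linux and most Unix systems.

```python
import socket
import struct
import threading

try:
    import httplib                  # Python 2.7
except ImportError:
    import http.client as httplib   # Python 3 name for the same module


def start_reset_server():
    """Listen on a free local port and reset each connection after one read.

    Hypothetical test helper; returns the listening socket so the caller
    can look up the port with getsockname().
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(5)

    def serve():
        while True:
            try:
                client, _ = listener.accept()
            except socket.error:
                return  # listener was closed
            client.recv(65536)  # wait for the request to arrive
            # Linger on with a zero timeout: close() then sends RST, not FIN.
            client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                              struct.pack("ii", 1, 0))
            client.close()

    worker = threading.Thread(target=serve)
    worker.daemon = True  # do not keep the process alive for this thread
    worker.start()
    return listener
```

Pointing an httplib.HTTPConnection at ("127.0.0.1", listener.getsockname()[1]) and issuing a request should then fail in getresponse() with a connection-reset style error, which is the situation the retry logic has to survive.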
