Thread memory usage keeps increasing


Question

I am trying to visit web pages and check whether each website's owner allows visitors to contact them.

Here is the code: http://pastebin.com/12rLXQaz

This is the function that each thread calls:

def getpage():
    try:
        curl = urls.pop(0)
        print "working on " +str(curl)
        thepage1 = requests.get(curl).text
        global ctot
        if "Contact Us" in thepage1:
            slist.write("\n" +curl)
            ctot = ctot + 1
    except:
        pass
    finally:
        if len(urls)>0 :
            getpage()  

But the memory used by the program (pythonw.exe) keeps increasing.

Each thread only calls the function again while the condition is true, so I would expect the program's memory to stay at approximately the same level.

For a list of about 100k URLs, the program uses well over 3 GB, and the figure keeps growing.

Recommended answer

Your program is recursive for no reason. With recursion, every page you fetch gets its own new set of local variables, including the downloaded page text. Because each call starts the next one before returning, none of the calls finishes until the URL list is exhausted, so all of those locals stay referenced. Garbage collection never gets a chance to reclaim them, and the program keeps eating memory.
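The effect described above can be reproduced without any network access. The following is a minimal sketch, assuming CPython's reference counting; `Page`, `live_pages`, `peak_recursive`, and `peak_iterative` are hypothetical names introduced only for illustration, with `Page` standing in for a downloaded page body. It measures how many page objects are alive at the same time under each approach.

```python
live_pages = 0  # number of Page objects currently alive


class Page:
    """Hypothetical stand-in for a downloaded page body."""

    def __init__(self):
        global live_pages
        live_pages += 1

    def __del__(self):
        global live_pages
        live_pages -= 1


def peak_recursive(n):
    """Fetch n pages recursively; return the peak number of live pages."""
    page = Page()  # stays referenced until this call returns
    peak = live_pages
    if n > 1:
        # The caller's `page` is still alive while the nested call runs.
        peak = max(peak, peak_recursive(n - 1))
    return peak


def peak_iterative(n):
    """Fetch n pages in a loop; return the peak number of live pages."""
    peak = 0
    for _ in range(n):
        page = Page()  # rebinding `page` releases the previous object
        peak = max(peak, live_pages)
    return peak
```

Under these assumptions, `peak_recursive(50)` reports 50 live pages while `peak_iterative(50)` reports 1, because each loop iteration drops its reference to the previous page before moving on.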

Read up on the while statement; it is what you want to use here instead of recursion.

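def getpage():
    # Declare the shared counter once, before it is used.
    global ctot
    # Loop until the shared URL list is empty instead of recursing.
    while len(urls) > 0:
        try:
            curl = urls.pop(0)
            thepage1 = requests.get(curl).text
            if "Contact Us" in thepage1:
                slist.write("\n" + curl)
                ctot = ctot + 1
        except Exception:
            # Skip pages that fail to download; this also covers the case
            # where another thread emptied the list before pop().
            pass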
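As a possible variation on the loop above, the shared list and counter could be replaced with a thread-safe queue. This is only a sketch, assuming Python 3 and its standard `queue` and `threading` modules; `worker`, `run`, `fetch`, `matches`, and `num_threads` are hypothetical names that do not appear in the original code. `fetch` stands in for something like `lambda url: requests.get(url).text` and is passed in so the sketch can run without network access.

```python
import queue
import threading


def worker(url_queue, fetch, matches, lock):
    """Process URLs in a loop until the queue is empty (no recursion)."""
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return  # nothing left to do, let the thread finish
        try:
            page = fetch(url)
            if "Contact Us" in page:
                with lock:  # protect the shared result list
                    matches.append(url)
        except Exception:
            pass  # skip pages that fail, as the original code does


def run(urls, fetch, num_threads=4):
    """Check all URLs with num_threads workers; return the matching URLs."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)
    matches = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(url_queue, fetch, matches, lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matches
```

Each worker keeps only one page in memory at a time, and `queue.Queue` handles the locking that a plain list shared between threads does not provide. The order of the returned URLs depends on thread scheduling.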
