How to debug Python memory fault?

Problem description

Edit: I'd really appreciate help finding the bug, but since it might prove hard to find or reproduce, any general debugging help would be greatly appreciated too! Help me help myself! =)

Edit 2: Narrowing it down, commenting out code.

Edit 3: Seems lxml might not be the culprit, thanks! The full script is here. I need to go over it looking for references. What do they look like?

Edit 4: Actually, the script stops (CPU goes to 100%) in this, the parse_og part of it. So edit 3 is wrong: it must be lxml somehow.

Edit 5 MAJOR EDIT: As suggested by David Robinson and TankorSmash below, I've found a type of data content that sends lxml.etree.HTML( data ) into a wild loop. (I carelessly disregarded it, but find my sins redeemed as I've paid a price to the tune of an extra two days of debugging! ;) A working crashing script is here. (Also opened a new question.)

Edit 6: Turns out this is a bug in libxml2 (the C library that lxml wraps) version 2.7.8 and below (at least). After updating to libxml2 2.9.0, the bug is gone. Thanks also to the fine folks over at this follow-up question.

I don't know how to debug this weird problem I'm having. The code below runs fine for about five minutes, then the RAM suddenly fills up completely (from 200 MB to 1700 MB while the CPU is at 100%; once memory is full, the process goes into a blue wait state).

It's due to the code below, specifically the first two lines. That's for sure. But what is going on? What could possibly explain this behaviour?

import sys
from lxml import etree

# Method of the scraper class (the rest of the class is not shown).
def parse_og(self, data):
    """ lxml parsing to the bone! """
    try:
        tree = etree.HTML( data ) # << break occurs on this line >>
        m = tree.xpath("//meta[@property]")

        #for i in m:
        #   y = i.attrib['property']
        #   x = i.attrib['content']
        #   # self.rj[y] = x  # commented out in this example because code fails anyway


        # Overwrite and delete every local in an attempt to free memory.
        tree = ''
        m = ''
        x = ''
        y = ''
        i = ''

        del tree
        del m
        del x
        del y
        del i

    except Exception:
        print('lxml error: ', sys.exc_info()[1:3])
        print(len(data))
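
One way to track down the kind of input described in Edit 5 is to run each parse in a separate interpreter with a time limit, so that a runaway parse is killed and the offending data can be set aside. The following is only a minimal sketch, assuming Python 3 and that lxml is importable in the child process. The names check_input and PARSE_SNIPPET and the default timeout are hypothetical and do not come from the original script.

```python
import subprocess
import sys

# Hypothetical child program: read the document from stdin and parse it
# the same way parse_og does.
PARSE_SNIPPET = (
    "import sys\n"
    "from lxml import etree\n"
    "etree.HTML(sys.stdin.buffer.read())\n"
)


def check_input(data, snippet=PARSE_SNIPPET, timeout_s=10.0):
    """Run snippet in a fresh interpreter with data (bytes) on stdin.

    Returns "ok", "error" (non-zero exit code) or "timeout" (child killed).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet],
            input=data,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout"
    return "ok" if result.returncode == 0 else "error"
```

Any input for which check_input returns "timeout" can be written to a file and used as a small reproduction when reporting the problem.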

Solution

You can try low-level Python debugging with GDB. There is probably a bug in the Python interpreter or in the lxml library, and it is hard to find without extra tools.

You can interrupt your script while it runs under gdb once CPU usage goes to 100% and look at the stack trace. That will probably help you understand what is going on inside the script.
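
As a lighter-weight complement to gdb (a different technique, not the one this answer describes), the standard-library faulthandler module in Python 3.3+ can dump the Python-level stack when a call takes too long. This is only a sketch under that assumption. run_with_watchdog and its parameters are hypothetical names, and the dump shows Python frames only, so the C frames inside lxml still need gdb.

```python
import faulthandler
import sys


def run_with_watchdog(func, arg, timeout_s=30.0, log_file=None):
    """Call func(arg); if it is still running after timeout_s seconds,
    write the Python stack of every thread to log_file (default: stderr).

    run_with_watchdog is a hypothetical helper, not part of any library.
    """
    target = log_file if log_file is not None else sys.stderr
    # faulthandler arms a watchdog thread that writes the dump when
    # the timeout expires; exit=False lets the program keep running.
    faulthandler.dump_traceback_later(timeout_s, file=target, exit=False)
    try:
        return func(arg)
    finally:
        # The call returned (or raised): disarm the watchdog.
        faulthandler.cancel_dump_traceback_later()
```

For example, run_with_watchdog(etree.HTML, data, timeout_s=60) would report which line the script is on if the parse is still running after a minute.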
