Memory leak in Google ndb library


Problem description



I think there is a memory leak in the ndb library, but I cannot find where.

Is there a way to avoid the problem described below?
Do you have a better idea of how to test this and figure out where the problem is?


This is how I reproduced the problem:

I created a minimalist Google App Engine application with 2 files.
app.yaml:

application: myapplicationid
version: demo
runtime: python27
api_version: 1
threadsafe: yes


handlers:
- url: /.*
  script: main.APP

libraries:
- name: webapp2
  version: latest

main.py:

# -*- coding: utf-8 -*-
"""Memory leak demo."""
from google.appengine.ext import ndb
import webapp2


class DummyModel(ndb.Model):

    content = ndb.TextProperty()


class CreatePage(webapp2.RequestHandler):

    def get(self):
        value = str(102**100000)
        entities = (DummyModel(content=value) for _ in xrange(100))
        ndb.put_multi(entities)


class MainPage(webapp2.RequestHandler):

    def get(self):
        """Use of `query().iter()` was suggested here:
            https://code.google.com/p/googleappengine/issues/detail?id=9610
        Same result can be reproduced without decorator and a "classic"
            `query().fetch()`.
        """
        for _ in range(10):
            for entity in DummyModel.query().iter():
                pass # Do whatever you want
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello, World!')


APP = webapp2.WSGIApplication([
    ('/', MainPage),
    ('/create', CreatePage),
])

I uploaded the application and called /create once.
After that, each call to / increases the memory used by the instance, until it stops with the error Exceeded soft private memory limit of 128 MB with 143 MB after servicing 5 requests total.

Example of the memory usage graph (you can see the memory growth and the crashes):

Note: the problem can also be reproduced with a framework other than webapp2, such as web.py.

Solution

After more investigation, and with the help of a Google engineer, I found two explanations for my memory consumption.

Context and thread

ndb.Context is a "thread local" object and is only cleared when a new request comes in on that thread, so a thread holds on to its context between requests. Many threads may exist in a GAE instance, and it may take hundreds of requests before a given thread is used a second time and its context is cleared.
This is not a memory leak, but the combined size of the contexts held in memory may exceed the available memory of a small GAE instance.

Workaround:
You cannot configure the number of threads used in a GAE instance, so it is best to keep each context as small as possible: avoid the in-context cache, and clear it after each request.
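
A minimal sketch of that workaround, reusing the DummyModel and MainPage handler from main.py above (it assumes the NDB context API calls ndb.get_context(), set_cache_policy() and clear_cache(); adjust to your own handlers):

# -*- coding: utf-8 -*-
"""Sketch: keep the NDB in-context cache empty (reuses DummyModel from above)."""
from google.appengine.ext import ndb
import webapp2


class MainPage(webapp2.RequestHandler):

    def get(self):
        ctx = ndb.get_context()
        # Do not store entities in the in-context cache for this request.
        ctx.set_cache_policy(False)

        for _ in range(10):
            for entity in DummyModel.query().iter():
                pass  # Do whatever you want

        # Drop anything that still ended up in the context cache before returning.
        ctx.clear_cache()
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello, World!')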

Event queue

It seems that NDB does not guarantee that the event queue is emptied after a request. Again, this is not a memory leak, but it leaves Futures in your thread context, and you are back to the first problem.

Workaround:
Wrap all your code that uses NDB with @ndb.toplevel.
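
For illustration, here is a sketch of that workaround applied to the handlers from main.py above. The @ndb.toplevel decorator can be put on a handler method, and the NDB documentation also allows wrapping the whole WSGI application; this assumes the MainPage and CreatePage classes from the question:

# Sketch: make every request wait for its NDB futures (reuses main.py above).
from google.appengine.ext import ndb
import webapp2


class MainPage(webapp2.RequestHandler):

    @ndb.toplevel
    def get(self):
        # toplevel waits for all async NDB operations started in this request,
        # so no Futures are left behind in the thread-local context.
        for entity in DummyModel.query().iter():
            pass
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.write('Hello, World!')


# Alternative: wrap the whole WSGI application so every handler gets the
# same guarantee without decorating each method.
APP = ndb.toplevel(webapp2.WSGIApplication([
    ('/', MainPage),
    ('/create', CreatePage),
]))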
