NDB not clearing memory during a long request

Problem description

I am currently offloading a long-running job to a TaskQueue to calculate connections between NDB entities in the Datastore.

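For context, a task of this kind would be enqueued with the Task Queue API, roughly as follows (a minimal sketch; the handler URL, queue name, and payload are illustrative assumptions, not taken from the question):

from google.appengine.api import taskqueue

# Hypothetical enqueue call: the URL, queue name and params are placeholders.
taskqueue.add(
    url='/tasks/connections',
    queue_name='connections',
    params={'query': 'HGNC:4839', 'node_ids': '1,2,3'})
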
Basically, this queue handles several lists of entity keys that are matched against another query key by the node_in_connected_nodes function of the GetConnectedNodes class:

from collections import deque

from google.appengine.ext import ndb

class GetConnectedNodes(object):
    """Class for getting the connected nodes from a list of nodes in a paged way"""
    def __init__(self, list, query):
        # super(GetConnectedNodes, self).__init__()
        self.nodes = [ndb.model.Key('Node', '%s' % x) for x in list]
        self.cursor = 0
        self.MAX_QUERY = 100
        # logging.info('Max query - %d' % self.MAX_QUERY)
        self.max_connections = len(list)
        self.connections = deque()
        self.query = query

    def node_in_connected_nodes(self):
        """Checks if a node exists in the connected nodes of the next node in the
        node list.
        Will return False if it doesn't, or the list of evidences for the connection
        if it does.
        """
        while self.cursor < self.max_connections:
            if len(self.connections) == 0:
                # Refill the buffer with the next page of async entity fetches.
                end = self.MAX_QUERY
                if self.max_connections - self.cursor < self.MAX_QUERY:
                    end = self.max_connections - self.cursor
                self.connections.clear()
                self.connections = deque(ndb.model.get_multi_async(self.nodes[self.cursor:self.cursor + end]))

            connection = self.connections.popleft()
            connection_nodes = connection.get_result().connections

            if self.query in connection_nodes:
                connection_sources = connection.get_result().sources
                # yields (current node index in the list, sources)
                yield (self.cursor, connection_sources[connection_nodes.index(self.query)])
            self.cursor += 1

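A driver along these lines would consume the generator (a sketch; node_ids, query_key and the sample values are hypothetical names, not from the question):

import logging

# Hypothetical inputs: a list of Node key ids and the query node's key id.
node_ids = ['1', '2', '3']
query_key = '42'

matcher = GetConnectedNodes(node_ids, query_key)
for index, sources in matcher.node_in_connected_nodes():
    # index is the position within node_ids; sources is the evidence list.
    logging.info('Match at %d: %s', index, sources)
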
Here a Node has a repeated property connections that contains an array of other Node key ids, and a matching sources array for each of those connections.
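
The model itself is not shown in the question; from that description it presumably looks something like this (the property types are an assumption):

from google.appengine.ext import ndb

class Node(ndb.Model):
    # Key ids of the connected nodes (assumed to be stored as strings).
    connections = ndb.StringProperty(repeated=True)
    # sources[i] is assumed to hold the evidence list for connections[i].
    sources = ndb.JsonProperty(repeated=True)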

The yielded results are stored in the Blobstore.
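
On the 2012-era Python runtime that write would typically go through the Files API (since deprecated in favour of Google Cloud Storage); a minimal sketch, reusing the hypothetical matcher from above and writing one line per yielded pair:

from google.appengine.api import files

# Create a writable blob, append each yielded (index, sources) pair, finalize.
file_name = files.blobstore.create(mime_type='text/plain')
with files.open(file_name, 'a') as f:
    for index, sources in matcher.node_in_connected_nodes():
        f.write('%d\t%s\n' % (index, sources))
files.finalize(file_name)
blob_key = files.blobstore.get_blob_key(file_name)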

Now the problem I'm getting is that the memory is somehow not cleared after each run of the connection function. The following log shows the memory used by App Engine just before creating a new GetConnectedNodes instance:

I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total

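The mem figures in such a log would typically come from the runtime API, along these lines (a sketch; the exact logging call and import path are assumptions):

import logging

from google.appengine.api.runtime import runtime

query_key = 'HGNC:4839'  # illustrative value only

# Log current memory usage (in MB) just before each GetConnectedNodes run.
logging.info('Prioritizing %s - mem %d', query_key, runtime.memory_usage().current())
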
Apart from some fluctuations, the memory just keeps increasing, even though none of the previous values are accessed. I found it quite hard to debug this or to figure out whether I have a memory leak somewhere, but I seem to have tracked it down to that class. Would appreciate any help.

Solution

We had similar issues (with long-running requests). We solved them by turning off the default NDB cache. You can read more about it here
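
In NDB the in-context cache can be switched off with a context policy, per model, or per call; a minimal sketch of the global variant, applied at the start of the task handler (where exactly you apply it is up to you):

from google.appengine.ext import ndb

# Disable NDB's in-context cache (and the memcache layer) for this request.
ctx = ndb.get_context()
ctx.set_cache_policy(False)
ctx.set_memcache_policy(False)

# Alternatively, flush the cache periodically inside the loop:
# ndb.get_context().clear_cache()

With the context cache disabled, entities fetched by get_multi_async are no longer retained by the context for the lifetime of the request, which is what makes memory grow steadily in a long-running task.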
