NDB not clearing memory during a long request

Question

I am currently offloading a long-running job to a TaskQueue to calculate connections between NDB entities in the Datastore.

Basically, this queue handles several lists of entity keys that are checked against another query node by the node_in_connected_nodes generator in the GetConnectedNodes class:

from collections import deque

from google.appengine.ext import ndb


class GetConnectedNodes(object):
    """Class for getting the connected nodes from a list of nodes in a paged way"""

    def __init__(self, node_list, query):
        # super(GetConnectedNodes, self).__init__()
        self.nodes = [ndb.Key('Node', '%s' % x) for x in node_list]
        self.cursor = 0
        self.MAX_QUERY = 100
        # logging.info('Max query - %d' % self.MAX_QUERY)
        self.max_connections = len(node_list)
        self.connections = deque()
        self.query = query

    def node_in_connected_nodes(self):
        """Checks whether the query node appears in the connected nodes of each
        node in the node list.

        Yields (current index in the node list, sources for the connection) for
        every node whose connections contain the query; nodes without a match
        are skipped.
        """
        while self.cursor < self.max_connections:
            if len(self.connections) == 0:
                # Fetch the next page of up to MAX_QUERY entities asynchronously.
                end = self.MAX_QUERY
                if self.max_connections - self.cursor < self.MAX_QUERY:
                    end = self.max_connections - self.cursor
                self.connections = deque(
                    ndb.get_multi_async(self.nodes[self.cursor:self.cursor + end]))

            connection = self.connections.popleft()
            connection_nodes = connection.get_result().connections

            if self.query in connection_nodes:
                connection_sources = connection.get_result().sources
                # yields (current node index in the list, sources)
                yield (self.cursor,
                       connection_sources[connection_nodes.index(self.query)])
            self.cursor += 1

Here a Node has a repeated property connections, which holds an array of other Node key ids, plus a sources array whose entries are index-aligned with those connections.
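
For illustration, the Node kind described here might look like the following sketch. The actual model definition isn't shown in the question, so the property types are assumptions:

from google.appengine.ext import ndb

class Node(ndb.Model):
    # key ids of the other Node entities this node is connected to
    connections = ndb.StringProperty(repeated=True)
    # evidence sources, index-aligned with the connections list
    sources = ndb.StringProperty(repeated=True)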

The yielded results are stored in the blobstore.
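
That step isn't shown in the question; a hedged sketch of consuming the generator and writing the yielded pairs with the Files API that was current on App Engine in 2012 (since deprecated in favor of Google Cloud Storage) might look like this, where write_results is an illustrative helper, not the asker's code:

from google.appengine.api import files

def write_results(result_gen):
    # Create a writable blob, append one line per (index, sources) pair,
    # then finalize it to obtain the blob key.
    file_name = files.blobstore.create(mime_type='text/plain')
    with files.open(file_name, 'a') as f:
        for cursor, sources in result_gen:
            f.write('%d\t%s\n' % (cursor, sources))
    files.finalize(file_name)
    return files.blobstore.get_blob_key(file_name)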

Now the problem I'm getting is that after an iteration of the connection function, the memory is somehow not cleared. The following log shows the memory used by App Engine just before each new GetConnectedNodes instance is created:

I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total

Apart from some fluctuations, the memory just keeps increasing, even though none of the previous values are accessed. I found it quite hard to debug this or to figure out whether I have a memory leak somewhere, but I seem to have traced it down to this class. I would appreciate any help.
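
The question doesn't show how the "mem" figures in the log were produced; one common way on App Engine, and only an assumption here, is the runtime API:

import logging
from google.appengine.api import runtime

def log_memory(label):
    # memory_usage().current() reports the instance's current memory in MB
    logging.info('%s - mem %d', label, runtime.memory_usage().current())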

Answer

We had similar issues (with long-running requests). We solved them by turning off the default ndb cache. You can read more about it here
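
For reference, here is a minimal sketch of turning that cache off, assuming the classic google.appengine.ext.ndb API; the answer's link isn't reproduced here, so the exact approach it recommends may differ:

from google.appengine.ext import ndb

# Disable the in-context cache for the whole request, so entities returned
# by get_multi_async() are no longer retained for the request's lifetime.
ctx = ndb.get_context()
ctx.set_cache_policy(False)
# Optionally skip memcache as well.
ctx.set_memcache_policy(False)

# Alternatively, disable caching per call instead of globally:
# futures = ndb.get_multi_async(keys, use_cache=False, use_memcache=False)

In the question's code this would mean either setting the policy once at the start of the task, or passing use_cache=False to the ndb.get_multi_async() call inside node_in_connected_nodes. Calling ndb.get_context().clear_cache() periodically is another option if caching is still wanted for other reads.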
