NDB not clearing memory during a long request
Problem description
I am currently offloading a long-running job to a TaskQueue to calculate connections between NDB entities in the Datastore.
Basically this queue handles several lists of entity keys that are to be related to another query by the node_in_connected_nodes function in the GetConnectedNodes class:
from collections import deque

from google.appengine.ext import ndb


class GetConnectedNodes(object):
    """Class for getting the connected nodes from a list of nodes in a paged way."""

    def __init__(self, node_ids, query):
        # super(GetConnectedNodes, self).__init__()
        self.nodes = [ndb.model.Key('Node', '%s' % x) for x in node_ids]
        self.cursor = 0
        self.MAX_QUERY = 100
        # logging.info('Max query - %d' % self.MAX_QUERY)
        self.max_connections = len(node_ids)
        self.connections = deque()
        self.query = query

    def node_in_connected_nodes(self):
        """Generator over the node list, fetched in pages of MAX_QUERY.

        Skips nodes whose connections do not contain self.query; for each
        match, yields (index in the node list, list of evidence sources).
        """
        while self.cursor < self.max_connections:
            if len(self.connections) == 0:
                end = min(self.MAX_QUERY, self.max_connections - self.cursor)
                # Fetch the next page of entities asynchronously.
                self.connections = deque(
                    ndb.model.get_multi_async(self.nodes[self.cursor:self.cursor + end]))
            result = self.connections.popleft().get_result()
            if self.query in result.connections:
                # yields (current node index in the list, sources)
                yield (self.cursor,
                       result.sources[result.connections.index(self.query)])
            self.cursor += 1
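Stripped of the App Engine specifics, the paging loop above fetches one batch of at most MAX_QUERY keys, drains the deque, and only then fetches the next batch. A minimal, framework-free sketch of that pattern (fetch_batch is a stub standing in for ndb.model.get_multi_async, and plain dicts stand in for entities; all names here are illustrative):

```python
from collections import deque

MAX_QUERY = 100


def fetch_batch(keys, store):
    # Stub for ndb.model.get_multi_async: returns the entities for a key slice.
    return [store[k] for k in keys]


def connected_sources(keys, query, store):
    """Yield (index, source) for every entity whose 'connections' contain query."""
    cursor = 0
    pending = deque()
    while cursor < len(keys):
        if not pending:
            # Fetch the next page of at most MAX_QUERY entities.
            end = min(MAX_QUERY, len(keys) - cursor)
            pending = deque(fetch_batch(keys[cursor:cursor + end], store))
        entity = pending.popleft()
        if query in entity['connections']:
            yield cursor, entity['sources'][entity['connections'].index(query)]
        cursor += 1
```

In this stripped-down form each batch is released as soon as the next one replaces it, which is the behavior the original class expects from the deque as well.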
Here a Node has a repeated property connections that contains the key ids of other Node entities, and a matching sources array for each of those connections.
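Based on that description, the Node model presumably looks something like the following (a hypothetical sketch; only the connections and sources property names come from the question, the property types are an assumption):

```python
from google.appengine.ext import ndb


class Node(ndb.Model):
    # Key ids of connected Node entities; connections[i] pairs with sources[i].
    connections = ndb.StringProperty(repeated=True)
    # Evidence sources for each connection, index-aligned with `connections`.
    sources = ndb.StringProperty(repeated=True)
```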
The yielded results are stored in a blobstore.
Now the problem I'm getting is that after each iteration of the connection function the memory is somehow not cleared. The following log shows the memory used by AppEngine just before creating a new GetConnectedNodes instance:
I 2012-08-23 16:58:01.643 Prioritizing HGNC:4839 - mem 32
I 2012-08-23 16:59:21.819 Prioritizing HGNC:3003 - mem 380
I 2012-08-23 17:00:00.918 Prioritizing HGNC:8932 - mem 468
I 2012-08-23 17:00:01.424 Prioritizing HGNC:24771 - mem 435
I 2012-08-23 17:00:20.334 Prioritizing HGNC:9300 - mem 417
I 2012-08-23 17:00:48.476 Prioritizing HGNC:10545 - mem 447
I 2012-08-23 17:01:01.489 Prioritizing HGNC:12775 - mem 485
I 2012-08-23 17:01:46.084 Prioritizing HGNC:2001 - mem 564
C 2012-08-23 17:02:18.028 Exceeded soft private memory limit with 628.609 MB after servicing 1 requests total
Apart from some fluctuations the memory just keeps increasing, even though none of the previous values are accessed. I found it quite hard to debug this or to figure out if I have a memory leak somewhere, but I seem to have traced it down to that class. Would appreciate any help.
We had similar issues (with long-running requests). We solved them by turning off the default NDB cache. You can read more about it here
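For reference, the NDB context caches can be disabled through the documented cache-policy calls on the request's context (a sketch assuming this runs inside the request handler, before the entities are fetched):

```python
from google.appengine.ext import ndb

ctx = ndb.get_context()
ctx.set_cache_policy(False)     # disable the in-context (per-request) cache
ctx.set_memcache_policy(False)  # optionally also bypass memcache
```

The in-context cache keeps every entity fetched during a request alive until the request ends, which is why a long-running request that touches many entities keeps growing. Alternatively, caching can be disabled per model by setting the `_use_cache = False` class attribute on the model.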