App Engine Deferred: Tracking Down Memory Leaks
Question
We have an App Engine application that writes many files of a relatively large size to Google Cloud Store. These files are CSVs that are dynamically created, so we use Python's StringIO.StringIO as a buffer and csv.writer as the interface for writing to that buffer.
In general, our process looks like this:
# imports as needed
# (gcs is the Google Cloud Store client)
import csv
import StringIO

file_buffer = StringIO.StringIO()
writer = csv.writer(file_buffer)
# ...
# write some rows
# ...
data = file_buffer.getvalue()  # StringIO exposes getvalue(), not getdata()
filename = 'someFilename.csv'
try:
    with gcs.open(filename, content_type='text/csv', mode='w') as file_stream:
        file_stream.write(data)
        # no explicit close() needed: the with block closes file_stream
except Exception, e:
    # handle exception
    pass
finally:
    file_buffer.close()
As we understand it, the csv.writer does not need to be closed itself. Rather, only the buffer above and the file_stream need to be closed.
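A minimal, self-contained sketch of this pattern (shown with Python 3's io.StringIO, the successor to Python 2's StringIO.StringIO used above) illustrates the point: the writer object has no close() method of its own, and releasing the buffer after copying its contents out is enough:

```python
import csv
import io

def build_csv(rows):
    # io.StringIO plays the role of StringIO.StringIO in the code above.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerows(rows)
    # csv.writer has no close(); it only holds a reference to buf.
    data = buf.getvalue()   # copy the contents out first
    buf.close()             # then free the buffer's memory
    return data

csv_text = build_csv([["a", 1], ["b", 2]])
# csv.writer terminates rows with '\r\n' by default
```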
We run the above process in a deferred task, invoked by App Engine's task queue. Ultimately, we get the following error after a few invocations of our task:
Exceeded soft private memory limit of 128 MB with 142 MB after servicing 11 requests total
Clearly, then, there is a memory leak in our application. However, if the above code is correct (which we admit may not be the case), then our only other idea is that some large amount of memory is being held through the servicing of our requests (as the error message suggests).
Thus, we are wondering if some entities are kept by App Engine during the execution of a deferred task. We should also note that our CSVs are ultimately written successfully, despite these error messages.
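One way to tell a genuine per-task leak from a merely high baseline is to measure the net allocations a task leaves behind. The sketch below uses the standard tracemalloc module (Python 3 only, so not available on the Python 2.7 runtime the question targets); run_with_alloc_report is a hypothetical helper, not part of any App Engine API:

```python
import tracemalloc

def run_with_alloc_report(task, *args):
    # Hypothetical helper: return the net traced memory (in bytes)
    # still allocated after the task returns. A clean task should
    # leave close to zero behind; a leaky one grows with each call.
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    task(*args)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

# A deliberately "leaky" task keeps its output alive in an outer list;
# a clean task allocates the same amount but lets it be freed.
retained = []
leak = run_with_alloc_report(lambda: retained.append(bytearray(10**6)))
clean = run_with_alloc_report(lambda: bytearray(10**6))
```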
The symptom described isn't necessarily an indication of an application memory leak. Potential alternate explanations include:
- the app's baseline memory footprint (which for scripting-language sandboxes like python can be bigger than the footprint at instance startup time, see Memory usage differs greatly (and strangely) between frontend and backend) may be too high for the instance class configured for the app/module. To fix - choose a higher-memory instance class (which, as a side effect, also means a faster instance class). Alternatively, if the rate of instance killing due to exceeding memory limits is tolerable, just let GAE recycle the instances :)
- peaks of activity, especially if multi-threaded request handling is enabled, mean higher memory consumption and potential overloading of the memory garbage collector. Limiting the number of requests handled in parallel, adding (longer) delays in lower-priority deferred task processing, and other similar measures that reduce the average request processing rate per instance can give the garbage collector a chance to clean up leftovers from requests. Scalability should not be harmed (with dynamic scaling), as other instances would be started to help with the activity peak.
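Both mitigations above can be expressed as configuration. The values and queue name below are illustrative assumptions, not recommendations:

```yaml
# app.yaml (illustrative values): raise the instance class and cap
# concurrent requests so each instance handles fewer tasks at once.
instance_class: F2            # F1, the frontend default, has 128 MB
automatic_scaling:
  max_concurrent_requests: 4  # default is 10

# queue.yaml (illustrative): throttle the queue running the deferred tasks
queue:
- name: csv-export            # hypothetical queue name
  rate: 1/s
  max_concurrent_requests: 2
```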
Related Q&As:
- How does app engine (python) manage memory across requests (Exceeded soft private memory limit)
- Google App Engine DB Query Memory Usage
- Memory leak in Google ndb library