App Engine Deferred: Tracking Down Memory Leaks


Problem Description


We have an App Engine application that writes many files of a relatively large size to Google Cloud Store. These files are CSVs that are dynamically created, so we use Python's StringIO.StringIO as a buffer and csv.writer as the interface for writing to that buffer.

In general, our process looks like this:

# imports as needed
# (gcs is the Google Cloud Store client, i.e. the cloudstorage library)
import csv
import StringIO

import cloudstorage as gcs

# buffer the CSV in memory and point a csv.writer at the buffer
file_buffer = StringIO.StringIO()
writer = csv.writer(file_buffer)

# ...
# write some rows
# ...

data = file_buffer.getvalue()
filename = 'someFilename.csv'

try:
    with gcs.open(filename, content_type='text/csv', mode='w') as file_stream:
        file_stream.write(data)
        file_stream.close()  # redundant: the with block already closes the stream

except Exception as e:
    # handle exception
    pass
finally:
    file_buffer.close()

As we understand it, the csv.writer does not need to be closed itself. Rather, only the buffer above and the file_stream need be closed.
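That understanding matches the csv module's API: a csv.writer has no close() method at all; it simply calls write() on whatever file-like object it was given, so only that underlying object needs closing. A minimal sketch illustrating this:

import csv
import StringIO

buf = StringIO.StringIO()
writer = csv.writer(buf)
writer.writerow(['a', 'b', 'c'])

# csv.writer exposes no close(); it only needs a write() method on its target
print hasattr(writer, 'close')  # False

# closing the underlying buffer is what actually releases the memory
buf.close()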


We run the above process in a deferred, invoked by App Engine's task queue. Ultimately, we get the following error after a few invocations of our task:

Exceeded soft private memory limit of 128 MB with 142 MB after servicing 11 requests total

Clearly, then, there is a memory leak in our application. However, if the above code is correct (which we admit may not be the case), then our only other idea is that some large amount of memory is being held through the servicing of our requests (as the error message suggests).

Thus, we are wondering if some entities are kept by App Engine during the execution of a deferred. We should also note that our CSVs are ultimately written successfully, despite these error messages.
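For reference, the deferred invocation mentioned above is typically set up with the deferred library's defer() call. A minimal sketch, where write_csv_to_gcs and its arguments are hypothetical stand-ins for the code shown earlier:

from google.appengine.ext import deferred

def write_csv_to_gcs(filename, rows):
    # hypothetical wrapper around the buffering/writing code shown above
    pass

rows = [['col1', 'col2'], ['1', '2']]
# Enqueue on a push queue; App Engine runs the function later in a
# separate request, subject to the instance's soft memory limit.
deferred.defer(write_csv_to_gcs, 'someFilename.csv', rows)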

Solution

The symptom described isn't necessarily an indication of an application memory leak. Potential alternate explanations include:

  • the app's baseline memory footprint (which for scripting-language sandboxes like python can be bigger than the footprint at instance startup time, see Memory usage differs greatly (and strangely) between frontend and backend) may be too high for the instance class configured for the app/module. To fix - choose a higher-memory instance class (which, as a side effect, also means a faster instance); see the app.yaml sketch after this list. Alternatively, if the rate at which instances are killed for exceeding the memory limit is tolerable, just let GAE recycle the instances :)
  • peaks of activity, especially if multi-threaded request handling is enabled, mean higher memory consumption and can also overload the memory garbage collector. Limiting the number of requests performed in parallel, adding (higher) delays in lower-priority deferred task processing, and other similar measures that reduce the average request processing rate per instance can give the garbage collector a chance to clean up leftovers from previous requests; see the queue.yaml sketch after this list. Scalability should not be harmed (with dynamic scaling), as other instances would be started to help with the activity peak.
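For the first point, the instance class is set in the module's configuration file. A minimal app.yaml sketch, assuming the python27 runtime on automatic scaling (F1 instances carry the 128 MB limit from the error above; F2 doubles it):

# app.yaml (sketch)
runtime: python27
api_version: 1
threadsafe: true

instance_class: F2  # 256 MB / 1.2 GHz, vs the default F1 at 128 MB / 600 MHz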
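For the second point, one way to throttle deferred task processing is a dedicated queue with a low dispatch rate and no concurrency. A minimal queue.yaml sketch; the queue name is hypothetical, and the tasks would need to be enqueued with deferred.defer(..., _queue='csv-export'):

# queue.yaml (sketch)
queue:
- name: csv-export
  rate: 1/s                   # dispatch at most one task per second
  bucket_size: 1
  max_concurrent_requests: 1  # never run these tasks in parallel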

