Python H2O Memory Management

Question

Similar to this question in R here, I run into out-of-memory issues when running loops with grid search in H2O. In R, calling gc() in each loop iteration did help. What is the suggested solution in Python?

Recommended answer

There appears to be no h2o.gc() function in the Python API. See "How can I debug memory issues?" in the FAQ. If you suspect the back end is holding on to memory it should have released, you could POST the corresponding back-end command (GarbageCollect) directly through the REST API. Studying the detailed logs might help confirm whether that is the case.
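As a rough sketch of the REST route, assuming the endpoint path is /3/GarbageCollect and that h2o.api accepts a "METHOD /path" string (both are my understanding of the Python client, so check them against your H2O version), the call can be wrapped like this. The helper name request_backend_gc is hypothetical, and the API function is passed in so the helper can be tried without a running cluster:

```python
def request_backend_gc(api_call):
    """Ask the H2O back end to run a garbage collection.

    `api_call` is expected to behave like `h2o.api`, i.e. to accept a
    string of the form "METHOD /path". The endpoint path below is an
    assumption; verify it against your H2O version's REST API reference.
    """
    return api_call("POST /3/GarbageCollect")
```

With a cluster running, this would be used as request_backend_gc(h2o.api) after import h2o and h2o.init(). Whether it actually frees memory depends on whether the back end still holds references to the objects, which is what the logs should tell you.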

Summarizing the suggestions from the comments:

  • Call h2o.remove() on H2O frames and models you no longer need, at the end of each loop iteration.
  • Use h2o.remove_all() (h2o.removeAll() in R) if you do not need to keep anything around and your loop re-loads all the data it needs.
  • Use H2OGridSearch rather than your own loops and your own grid code.

I'd also add: be aware that cbind, rbind, and any function that modifies an H2O frame will make a copy of the entire frame. Rethinking how you do your data-munging steps can sometimes reduce the memory requirements.
