MATLAB's Garbage Collector?


Question

What is your mental model of it? How is it implemented? What are its strengths and weaknesses? How does MATLAB's GC compare with Python's?

I sometimes see strange performance bottlenecks when using MATLAB nested functions in otherwise innocuous-looking code, and I am sure the GC is the cause. The garbage collector is an important part of a VM, yet MathWorks does not document it publicly.

My question is about MATLAB's own heap and GC, not about handling Java/COM objects, preventing "out of memory" errors, or allocating stack variables.

The first response is actually the meta-answer "Why should I care?". I do care, because the GC makes itself felt when implementing a linked list or the MVC pattern.

Recommended answer

This is the list of facts I have collected. In this context the term memory (de)allocation seems more appropriate than GC.

My principal sources of information are Loren's blog (especially its comments) and this article from MATLAB Digest.

Because it is oriented toward numeric computing with potentially large data sets, MATLAB does a really good job of optimizing the performance of stack objects, for example by operating on data in place and passing function arguments by reference. For the same reason, its memory model is fundamentally different from that of OO languages such as Java.

Officially, MATLAB had no user-defined heap memory until version 7 (version 6 had undocumented reference functionality in schema.m files). MATLAB 7 has a heap in the form of both nested functions (closures) and handle objects, whose implementations share the same underpinnings. As a side note, OO can be emulated with closures in MATLAB (interesting for pre-2008a releases).
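As a rough illustration of emulating an object with closures, the sketch below builds a counter whose state lives in the workspace of a factory function and stays alive as long as the returned handles do. The names make_counter, increment and value are hypothetical, chosen only for this example, and the code is assumed to be saved as make_counter.m.

```matlab
function obj = make_counter(start)
    % Hypothetical factory: 'count' plays the role of a private field.
    % It lives in this function's workspace, which the nested function
    % handles below keep alive after make_counter returns.
    count = start;

    % The "methods" are handles to nested functions sharing 'count'.
    obj.increment = @increment;
    obj.value     = @value;

    function increment()
        count = count + 1;   % modifies the shared variable in place
    end

    function v = value()
        v = count;           % reads the shared variable
    end
end

% Usage (from the command line or a script):
%   c = make_counter(0);
%   c.increment();
%   c.increment();
%   c.value()      % returns 2
```

Each call to make_counter creates a separate workspace, so two counters do not share state.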

Surprisingly, it is possible to examine the entire workspace of the enclosing function captured by a function handle (closure); see functions(fhandle) in the MATLAB Help. This means the enclosing workspace is kept frozen in memory, which is why cellfun/arrayfun are sometimes very slow when used inside nested functions.
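A minimal sketch of such an inspection follows. It assumes an anonymous function rather than a nested one, so that it can run as a plain script; an anonymous function captures only the variables it uses, whereas the text above concerns handles to nested functions. The variable names are made up for the example, and functions is meant for inspection and debugging rather than for program logic.

```matlab
% Assumption: an anonymous function stands in for a nested-function closure.
big = rand(500);                 % about 2 MB of doubles
fh  = @(x) x + big(1);           % the handle captures its own copy of 'big'
clear big                        % the captured copy survives inside the handle

info     = functions(fh);        % struct describing the handle
captured = info.workspace{1};    % struct holding the captured variables

disp(info.type)                  % kind of handle
disp(fieldnames(captured))       % names of the captured variables
disp(size(captured.big))         % the full array is still held in memory
```

Even though big was cleared from the base workspace, the data remains reachable through the handle until the handle itself is cleared.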

There are also interesting posts by Loren and Brad Phelan on object cleanup.

The most interesting fact about heap deallocation in MATLAB is, in my opinion, that MATLAB tries to perform it every time the stack is deallocated, i.e. on leaving every function. This has advantages, but it also carries a huge CPU penalty if heap deallocation is slow. And in some scenarios it really is very slow in MATLAB!
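The cleanup-on-function-exit behaviour can be observed with onCleanup, whose callback runs when the object is destroyed. This is only a small sketch; the function name demo_scope_exit is hypothetical and the code is assumed to be saved as demo_scope_exit.m.

```matlab
function demo_scope_exit()
    % 'guard' is destroyed when this function's workspace is released,
    % i.e. as the function returns, and its callback runs at that moment.
    guard = onCleanup(@() disp('cleanup ran')); %#ok<NASGU>
    disp('leaving function body');
end

% Usage:
%   demo_scope_exit
% prints 'leaving function body' first and 'cleanup ran' second.
```

The cleanup message appears without any explicit call, at a predictable point, which is what makes the approach deterministic.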

The performance problems that MATLAB memory deallocation can cause are pretty bad. I always notice when I have unintentionally introduced a cyclic reference in my code, because it suddenly runs 20 times slower and sometimes needs several seconds between leaving a function and returning to its caller (time spent on cleanup). This is a known problem; see Dave Foti and this older forum post, whose code was used to produce this picture visualizing the performance (the tests were run on different machines, so comparing absolute timings between MATLAB versions is meaningless):

A linear increase in the pool size of reference objects causes a polynomial (or exponential) degradation of MATLAB performance! For value objects the performance is, as expected, linear.
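One way to probe this scaling yourself is to time how long it takes to release a pool of reference-like objects of growing size. The sketch below is not the code from the forum post; it uses nested-function closures as a stand-in for reference objects, and the names time_closure_cleanup, make_closure and getValue are hypothetical. It is assumed to be saved as time_closure_cleanup.m, and the results will depend heavily on the MATLAB release.

```matlab
function t = time_closure_cleanup(n)
    % Build a pool of n closures, each keeping its own workspace alive.
    pool = cell(1, n);
    for k = 1:n
        pool{k} = make_closure(k);
    end

    % Time only the release of the pool.
    tic;
    pool = [];  %#ok<NASGU> dropping the last references triggers cleanup
    t = toc;
end

function fh = make_closure(x)
    % Returns a handle to a nested function that captures 'x'.
    fh = @getValue;
    function v = getValue()
        v = x;
    end
end

% Usage:
%   for n = [1000 2000 4000 8000]
%       fprintf('%6d closures: %.4f s\n', n, time_closure_cleanup(n));
%   end
```

If cleanup cost were linear in the pool size, doubling n would roughly double the reported time; faster growth would point to the behaviour described above.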

Considering these facts, I can only speculate that MATLAB uses a not yet very efficient form of reference counting for heap deallocation.
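To make the cyclic-reference scenario concrete, the sketch below creates two handle objects that refer to each other. Under plain reference counting, neither count would drop to zero merely because the variables are cleared, so extra work is needed to find and release the cycle. containers.Map is used here only because it is a built-in handle class that runs in a script; the key name 'peer' is made up, and breaking the link by hand is shown as a possible workaround, not as something MATLAB is known to require.

```matlab
% Two handle objects (containers.Map is a built-in handle class).
a = containers.Map('KeyType', 'char', 'ValueType', 'any');
b = containers.Map('KeyType', 'char', 'ValueType', 'any');

% Each object stores a reference to the other: a -> b -> a is a cycle.
a('peer') = b;
b('peer') = a;
hasCycle = isKey(a, 'peer') && isKey(b, 'peer');

% Possible workaround (an assumption, not a documented requirement):
% break one link before the variables go out of scope.
remove(b, 'peer');
cycleBroken = ~isKey(b, 'peer');

clear a b   % with the cycle broken, the objects form a simple chain
```

In a linked list or an MVC design the same pattern appears as parent/child or model/view back-references.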

EDIT: I have always run into performance problems with many small nested functions, but recently I noticed that, at least in 2006a, cleaning up a single nested scope holding a few megabytes of data is also terrible: it takes 1.5 seconds just to set a nested-scope variable to empty!

EDIT 2: Finally I got an answer, from Dave Foti himself. He acknowledges the flaws but says that MATLAB is going to retain its present deterministic cleanup approach.

Legend: shorter execution time is better.
