MATLAB's Garbage Collector?


Problem description

What is your mental model of it? How is it implemented? Which strengths and weaknesses does it have? MATLAB GC vs. Python GC?

I sometimes see strange performance bottlenecks when using MATLAB nested functions in otherwise innocuous-looking code, and I am sure the GC is the cause. The garbage collector is an important part of the VM, and MathWorks does not document it publicly.

My question is about MATLAB's own heap and GC, not about handling Java/COM objects, preventing "out of memory" errors, or allocating stack variables.

EDIT: the first response is actually the meta-answer "Why should I care?". I do care because the GC makes itself felt when implementing a linked list or the MVC pattern.

Solution

This is the list of facts I have collected. Instead of "GC", the term "memory (de)allocation" seems more appropriate in this context.

My principal information sources are Loren's blog (especially its comments) and this article from MATLAB Digest.

Because MATLAB is oriented toward numeric computing with potentially large data sets, it does a really good job of optimizing the performance of stack objects, for example by operating on data in place and by passing function arguments by reference (data is copied only if the callee modifies it). For the same reason, its memory model is fundamentally different from that of OO languages such as Java.

Officially, MATLAB had no user-defined heap memory until version 7 (version 6 had undocumented reference functionality in schema.m files). MATLAB 7 has a heap in the form of both nested functions (closures) and handle objects; their implementations share the same underpinnings. As a side note, OO can be emulated with closures in MATLAB (interesting for versions before 2008a).
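To make the side note about emulating OO with closures concrete, here is a minimal sketch. The names makeStack, push, pop and numItems are hypothetical and chosen only for illustration. The state lives in the workspace of makeStack and stays alive for as long as any of the returned function handles exists.

```matlab
function s = makeStack()
% makeStack  Closure-based "object" (hypothetical example, save as makeStack.m).
% The variable items acts as private state: it is shared by the nested
% functions below and is reachable only through the returned handles.
    items = {};
    s = struct('push', @push, 'pop', @pop, 'numItems', @numItems);

    function push(x)
        % Append one element to the shared state.
        items{end+1} = x;
    end

    function x = pop()
        % Remove and return the most recently pushed element.
        x = items{end};
        items(end) = [];
    end

    function n = numItems()
        % Report how many elements are currently stored.
        n = numel(items);
    end
end
```

Typical use would be st = makeStack(); st.push(1); st.push(2); x = st.pop(); where each call goes through a function handle stored in a struct field. Note that numItems is a nested function rather than an anonymous one on purpose: an anonymous function would capture a snapshot of items at creation time instead of sharing it.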

Surprisingly, it is possible to examine the entire workspace of the enclosing function captured by a function handle (closure); see the function functions(fhandle) in MATLAB Help. This means that the enclosing workspace is frozen in memory. This is why cellfun/arrayfun are sometimes very slow when used inside nested functions.
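A small sketch of such an inspection follows. The function names inspectClosure and makeAccumulator and the variables total and scratch are hypothetical. The MATLAB documentation describes the output of functions as intended for querying and debugging only and as subject to change, so the code is defensive about whether the workspace field is a struct or a cell array holding one.

```matlab
function names = inspectClosure()
% inspectClosure  List the variables kept alive by a nested-function handle
% (hypothetical example, save as inspectClosure.m).
    fh = makeAccumulator();
    info = functions(fh);          % struct describing the handle
    ws = info.workspace;           % captured workspace of the parent function
    if iscell(ws)                  % assumption: layout may be a cell holding a struct
        ws = ws{1};
    end
    names = fieldnames(ws);
    fprintf('handle type: %s\n', info.type);
    disp(names);
end

function fh = makeAccumulator()
    total = 0;                     % shared with the nested function
    scratch = rand(500);           %#ok<NASGU> not used by the nested function
    fh = @add;

    function out = add(x)
        total = total + x;
        out = total;
    end
end
```

Running inspectClosure() and checking whether scratch appears in the listing shows whether variables that the nested function never touches are kept in memory along with the handle.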

There are also interesting posts by Loren and Brad Phelan on object cleanup.

The most interesting fact about heap deallocation in MATLAB is, in my opinion, that MATLAB tries to do it each time the stack is deallocated, i.e. on leaving every function. This has advantages, but it also incurs a huge CPU penalty if heap deallocation is slow. And it is actually very slow in MATLAB in some scenarios!
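The cleanup-on-function-exit behaviour can be observed with the built-in onCleanup class (available since R2008a; it is not mentioned in the sources above and is used here only as an illustration). The function name demoScopeExitCleanup is hypothetical.

```matlab
function demoScopeExitCleanup()
% demoScopeExitCleanup  Show that object cleanup runs when the function
% workspace is destroyed (hypothetical example, save as demoScopeExitCleanup.m).
    fprintf('1: entering function\n');
    % The callback fires when guard is destroyed, which happens as soon
    % as this function's workspace is released on return.
    guard = onCleanup(@() fprintf('3: cleanup callback runs\n')); %#ok<NASGU>
    fprintf('2: last statement of the function body\n');
end
```

The third message is printed after the function body has finished and before control returns to the caller, which is the same point at which the cleanup cost described above is paid.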

The performance problems caused by MATLAB memory deallocation can be pretty bad. I always notice that I have unintentionally introduced a cyclic reference in my code when it suddenly runs 20x slower and sometimes needs several seconds between leaving a function and returning to its caller (time spent on cleanup). This is a known problem; see Dave Foti's post and this older forum post, whose code was used to produce this picture visualizing performance (the tests were run on different machines, so comparing absolute timings across MATLAB versions is meaningless):

A linear increase in the pool size of reference objects means a polynomial (or exponential) decrease in MATLAB performance! For value objects the performance is, as expected, linear.
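A rough way to check this scaling on a given MATLAB version is to time the release of a pool of heap-backed objects for several pool sizes. The sketch below is not the code from the forum post. It uses nested-function handles as the reference objects, on the assumption (stated above) that they share their underpinnings with handle objects. The names timeClosureCleanup and makeClosure are hypothetical.

```matlab
function t = timeClosureCleanup(n)
% timeClosureCleanup  Time the release of n closures, each holding its own
% workspace (hypothetical example, save as timeClosureCleanup.m).
    pool = cell(1, n);
    for k = 1:n
        pool{k} = makeClosure(k);  % one captured workspace per handle
    end
    tic;
    pool = [];                     %#ok<NASGU> drop all handles at once
    t = toc;                       % elapsed time in seconds for the release
end

function fh = makeClosure(value)
    fh = @getValue;

    function v = getValue()
        v = value;
    end
end
```

Calling it in a loop such as for n = [1000 2000 4000 8000], fprintf('%6d closures: %.4f s\n', n, timeClosureCleanup(n)); end shows whether doubling the pool size roughly doubles the cleanup time or makes it grow faster. Results depend heavily on the MATLAB release and the machine.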

Considering these facts, I can only speculate that MATLAB uses a not yet very efficient form of reference counting for heap deallocation.

EDIT: I have always encountered performance problems with many small nested functions, but recently I noticed that, at least with 2006a, the cleanup of a single nested scope holding a few megabytes of data is also terrible: it takes 1.5 seconds just to set a nested-scope variable to empty!

EDIT 2: I finally got an answer from Dave Foti himself. He acknowledges the flaws but says that MATLAB is going to retain its present deterministic cleanup approach.

Legend: Shorter execution time is better
