How can garbage collectors be faster than explicit memory deallocation?


Problem Description

I was reading this generated HTML (it may expire; here is the original PS file):

GC Myth 3: Garbage collectors are always slower than explicit memory deallocation.
GC Myth 4: Garbage collectors are always faster than explicit memory deallocation.

This was a big WTF for me. How could a GC be faster than explicit memory deallocation? Isn't it essentially calling an explicit memory deallocator when it frees the memory / makes it available again? So what does it actually mean?

Very small objects & large sparse heaps ==> GC is usually cheaper, especially with threads

I still don't understand it. It's like saying C++ is faster than machine code (if you don't understand the WTF in this sentence, please stop programming; let the -1s begin). After a quick Google search, one source suggested it is faster when you have a lot of memory. What I think it means is that it doesn't bother with the free at all. Sure, that can be fast, and I have written a custom allocator that does exactly that, never freeing at all (void free(void*p){}), in ONE application that doesn't free any objects (it only frees at the end, when it terminates) and that has the definition mostly for the sake of libs and things like the STL. So I am pretty sure that would be faster than a GC as well. If I still want freeing, I guess I can use an allocator backed by a deque, or its own implementation that is essentially

/* deferred free: record the pointer now, release everything in bulk later */
if (freeptr < someaddr) {      /* room left in the pending buffer? */
    *freeptr = ptr;            /* stash the block instead of freeing it now */
    ++freeptr;
}
else
{
    freestuff();               /* flush: actually free every recorded block */
    freeptr = freeptrroot;     /* rewind the cursor to the start of the buffer */
    *freeptr++ = ptr;          /* keep the current pointer so it is not leaked */
}

which I am sure would be really fast. I have sort of answered my own question already: the case where the GC collector is never called is the case where it would be faster, but I am sure that is not what the document means, since it mentions two collectors in its test. I am sure the very same application would be slower if the GC collector were called even once, no matter which GC is used. If it is known that freeing is never needed, then an empty free body can be used, like in that one app I had.

Anyway, I am posting this question for further insight.

Solution

How would a GC be faster than explicit memory deallocation?

  1. GCs can pointer-bump allocate into a thread-local generation and then rely upon copying collection to handle the (relatively) uncommon case of evacuating the survivors. Traditional allocators like malloc often compete for global locks and search trees.

  2. GCs can deallocate many dead blocks simultaneously by resetting the thread-local allocation buffer instead of calling free on each block in turn, i.e. O(1) instead of O(n). (A sketch of points 1 and 2 follows this list.)

  3. By compacting old blocks so more of them fit into each cache line. The improved locality increases cache efficiency.

  4. By taking advantage of extra static information such as immutable types.

  5. By taking advantage of extra dynamic information, such as the changing topology of the heap, via the data recorded by write barriers.

  6. By making more efficient techniques tractable, e.g. by removing the headache of manual memory management from wait-free algorithms.

  7. By deferring deallocation to a more appropriate time or off-loading it to another core. (Thanks to Andrew Hill for this idea! A sketch of this appears after the pointer-bump example below.)
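
To make points 1 and 2 concrete, here is a minimal, hypothetical sketch in C of a thread-local bump allocator (the names and the nursery size are made up, and survivor evacuation and collection triggering are ignored). Allocation is just a pointer increment, and reclaiming every dead block in the buffer is a single reset rather than one free call per block, which is where the O(1)-versus-O(n) comparison comes from.

#include <stddef.h>
#include <stdint.h>

enum { NURSERY_SIZE = 1 << 20 };           /* assumed 1 MiB per-thread nursery */

static _Thread_local uint8_t nursery[NURSERY_SIZE];
static _Thread_local size_t  bump = 0;     /* offset of the next free byte */

/* Allocate by bumping a pointer: no lock, no free-list search. */
void *nursery_alloc(size_t n)
{
    n = (n + 7) & ~(size_t)7;              /* keep 8-byte alignment */
    if (bump + n > NURSERY_SIZE)
        return NULL;                       /* a real GC would trigger a collection here */
    void *p = &nursery[bump];
    bump += n;
    return p;
}

/* Reclaim every dead block at once: O(1), versus one free() per block.
   (A real collector would first copy survivors out; this sketch assumes none.) */
void nursery_reset(void)
{
    bump = 0;
}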
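
Point 7 can be sketched the same way: instead of paying for free on the hot path, the application hands dead pointers to a queue and another core does the actual deallocation later. The sketch below is hypothetical (real implementations batch per thread or use lock-free queues to keep the producer side cheap), and shutdown and error handling are omitted.

#include <pthread.h>
#include <stdlib.h>

#define PENDING_MAX 4096

static void *pending[PENDING_MAX];
static int   pending_count = 0;
static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;

/* Called by the application thread: just enqueue the pointer. */
void deferred_free(void *p)
{
    pthread_mutex_lock(&lock);
    if (pending_count < PENDING_MAX)
        pending[pending_count++] = p;      /* defer the real work */
    else
        free(p);                           /* queue full: fall back to freeing inline */
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Runs on another core and pays the deallocation cost there. */
void *reclaimer_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (pending_count == 0)
            pthread_cond_wait(&nonempty, &lock);
        void *p = pending[--pending_count];
        pthread_mutex_unlock(&lock);
        free(p);                           /* the expensive call happens off the hot path */
    }
    return NULL;
}

A thread started with pthread_create(&tid, NULL, reclaimer_thread, NULL) would then drain the queue in the background while the application keeps allocating.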

