HotSpot使用的Mark-Compact算法是什么? [英] What's the Mark-Compact algorithm used by HotSpot?
问题描述
在
When reading the Mark-Compact chapter on The Garbage Collection Handbook, a sequence of alternatives were presented, but most of them looked old / theoretical (for instance, the 2-finger compaction and the Lisp2 3-pass approach requiring an extra header word per object).
有人知道HotSpot在运行Mark-Compact时会使用哪种算法(我想是在其前代产品中)?
Is anyone aware of what algorithm does HotSpot uses when running Mark-Compact (in its old-generation, I assume)?
谢谢
推荐答案
免责声明:我不是GC专家/作家;波纹管中写的所有东西都可能会发生变化,其中有些可能过于简单.请把它和一粒盐一起吃.
Big disclaimer: I am not a GC expert/writer; all the things that are written bellow are subject to changes and some of them might be way too simplistic. Please take this with a grain of salt.
我只会说Shenandoah
,因为我认为我理解它;这不是世代的GC.
I will only speak about Shenandoah
, as I think I understand it; which is not a generational GC.
这里实际上有两个阶段:Mark
和Compact
.在这里,我会特别强调两者都是 concurrent ,并且确实会在您的应用程序运行时发生(带有一些非常短的STW事件).
There are two phases here actually: Mark
and Compact
. I would strongly emphases here that both are concurrent and do happen while your application is running (with some very short STW events).
现在到细节.我已经在这里解释了一些事情,但是因为该答案与某种不同的问题有关;我会在这里解释更多.我假设遍历活动对象图对您来说不是什么新闻,毕竟您正在读了一本有关GC
的书.正如该答案所解释的,当应用程序完全停止(也称为安全点)时,识别活动对象很容易.没有人在脚下改变任何东西,地板坚硬,您可以控制一切.并行收集器可以做到这一点.
And now to the details. I have explained a bit of things here, but because that answer is related to somehow a different question; I'll explain more here. I assume that traversing the graph of live objects is no news for you, after all you are reading a book about GC
. As that answer explains, when the application is fully stopped (also called brought to safe-points), identifying live objects is easy. No one is changing anything under you feet, the floor is rigid and you control everything. Parallel collectors do this.
真正痛苦的方式是并发做事. Shenandoah使用一种称为Snapshot at the beginning
的算法(该书将其解释为AFAIK),简称为SATB
.基本上,此算法的实现方式如下:我将开始同时扫描对象的图(从GC根开始),如果在扫描过程中任何内容发生了变化,我都不会改变.堆,但将记录这些更改并在以后进行处理".
The really painful way is to do things concurrent. Shenandoah employs an algorithm called Snapshot at the beginning
(that book explains it AFAIK), will call it SATB
for short. Basically this algorithm is implemented like this: "I will start to scan concurrently the graph of objects (from GC roots), if anything changes while I scan, I will not alter the heap, but will record these changes and deal with them later".
您需要询问的第一部分是:我扫描时.如何实现的?好吧,在执行concurrent mark
之前,有一个名为Initial Mark
的STW event
.在该阶段要完成的事情之一是设置一个标志,指示并发标记已开始.稍后,在执行代码时,将检查该标志(Shenandoah
因此采用了解释器中的更改).用伪代码:
The very first part that you need to question is : while I scan. How is that achieved? Well, before doing the concurrent mark
, there is a STW event
called Initial Mark
. One of the things that gets done in that phase is to set a flag that concurrent marking has started. Later, while executing code, that flag is checked (Shenandoah
thus employs changes in the interpreter). In pseudo-code:
if(!concurrentMarkingActive) {
// do whatever you were doing and alter the heap
} else {
// shenandoah magic
}
在可能看起来像这样的机器代码中:
In machine code that might look like this:
test %r11, %r11 (test concurrentMarkingActive flag)
jne // concurrent marking is currently active
现在,GC知道何时进行并发标记.
Now GC knows when concurrent marking is happening.
但是并发标记是如何实现的.当堆本身发生突变(不稳定)时,如何扫描堆?脚下的地板增加了更多的孔,并且也将其移除.
But how is concurrent marking even implemented. How can you scan the heap while the heap itself is mutated (not stable)? The floor under your feet adds more holes and removes them also.
那是神仙多亚魔法".对堆的更改被拦截",而不是直接保留.因此,如果GC在这个时间点执行并发标记,并且应用程序代码尝试使堆发生变化,则这些更改将记录在每个线程SATB queues
中(开头是快照).当并发标记结束时,将排空那些队列(通过称为Final Mark
的STW event
),并再次分析那些排空的更改(现在记在STW event
下).
That is the "shenandoah magic". Changes to the heap are "intercepted" and not persisted directly. So if GC performs a concurrent mark at this point in time, and application code tries to mutate the heap, those changes are recorded in each threads SATB queues
(snapshot at the beginning). When concurrent mark is over, those queues are drained (via a STW event
called Final Mark
) and those changes that were drained are analyzed again (remember under a STW event
now).
完成此阶段最终标记时,GC知道存在的内容,因此也隐含了垃圾内容.
When this phase Final Mark is over GC knows what is alive and thus what is implicitly garbage.
紧致阶段是下一个.现在应该Shenandoah
将活动对象移动到不同的区域(以紧凑的方式),并将当前区域标记为可以再次分配的区域.当然,在简单的STW phase
中,这很容易:移动对象,更新指向该对象的引用.完毕.当您必须同时进行 ...
Compact phase is next. Shenandoah
is now supposed to move live objects to different regions (in a compact fashion) and mark the current region as one where we can allocate again. Of course, in a simple STW phase
, this would be easy : move the object, update references pointing to it. Done. When you have to do it concurrently...
您不能拿走对象并将其移动到另一个区域,然后然后一个接一个地更新您的引用.考虑一下,让我们假设这是我们拥有的第一个状态:
You can't take the object and simply move it to a different region and then update your references one by one. Think about it, let's suppose this is the first state we have:
refA, refB
|
---------
| i = 0 |
| j = 0 |
---------
对此实例有两个引用:refA
和refB
.我们创建该对象的副本:
There are two references to this instance: refA
and refB
. We create a copy of this object:
refA, refB
|
--------- ---------
| i = 0 | | i = 0 |
| j = 0 | | j = 0 |
--------- ---------
我们创建了一个副本,但尚未更新任何参考.现在,我们将单个引用指向副本:
We created a copy, but have not updated any references yet. We now move a single reference to point to the copy:
refA refB
| |
--------- ---------
| i = 0 | | i = 0 |
| j = 0 | | j = 0 |
--------- ---------
现在有趣的部分是:ThreadA
执行refA.i = 5
,而ThreadB
执行refB.j = 6
,因此您的状态变为:
And now the interesting part: ThreadA
does refA.i = 5
, while ThreadB
does refB.j = 6
so your state becomes:
refA refB
| |
--------- ---------
| i = 5 | | i = 0 |
| j = 0 | | j = 6 |
--------- ---------
您现在如何合并这些对象?老实说-我什至不知道那是否有可能,而且这也不是Shenandoah
采取的路线.
How do you merge these objects now? I'll be honest - I have no idea if that would even be possible and neither is this a route that Shenandoah
took.
相反,来自Shenandoah
的解决方案做了一件非常有趣的事情,恕我直言.在每个实例中添加了额外指针,也称为转发指针:
Instead, the solution from Shenandoah
does a very interesting thing IMHO. An extra pointer added to each instance, also called forwarding pointer:
refA, refB
|
fwdPointer1
|
---------
| i = 0 |
| j = 0 |
---------
refA
和refB
指向fwdPointer1
,而fwdPointer1
指向实际对象.现在创建副本:
refA
and refB
points to fwdPointer1
, while fwdPointer1
to the real Object. Let's create the copy now:
refA, refB
|
fwdPointer1 fwdPointer2
| |
--------- ---------
| i = 0 | | i = 0 |
| j = 0 | | j = 0 |
--------- ---------
现在,我们要切换所有引用(refA
和refB
)以指向副本.如果仔细观察,这只需要更改单个指针-fwdPointer1
.使fwdPointer1
指向fwdPointer2
,您已完成.这意味着一个更改,而不是refB
的两个.这样做的最大好处是,您无需扫描堆并找出指向您的实例的引用.
And now, we want to switch all references (refA
and refB
) to point to the copy. If you look closely, this requires only a single pointer change - fwdPointer1
. Make fwdPointer1
point to fwdPointer2
and you are done. This means one single change as opposed to two (in this set-up) of refA
and refB
. The bigger win here is that you don't need to scan the heap and find out references that point to your instance.
有没有办法自动更新引用?当然:AtomicReference
(至少在Java中).这里的想法几乎是相同的,我们通过CAS
(比较和交换)原子地更改fwdPointer1
,例如:
Is there a way to atomically update a reference? Of course : AtomicReference
(at least in java). The idea here is almost the same, we atomically change the fwdPointer1
via a CAS
(compare and swap), as such:
refA, refB
|
fwdPointer1 ---- fwdPointer2
|
--------- ---------
| i = 0 | | i = 0 |
| j = 0 | | j = 0 |
--------- ---------
因此,refA
和refB
指向fwdPointer1
,现在它指向我们创建的副本.通过一次CAS
操作,我们已同时 切换了所有对新创建副本的引用.
So, refA
and refB
point to fwdPointer1
, which now points to the copy we have created. Via a single CAS
operation, we have switched concurrently all references to the newly created copy.
但是,我们需要了解缺点,没有免费的午餐.
But, we need to understand the drawbacks, there is no free lunch.
-
首先,很明显:
Shenandoah
添加了一个计算机头,该头用于堆中的每个实例(进一步阅读,因为这是错误的;但是使理解更容易).
First, is obvious :
Shenandoah
adds a machine header that each instance in the heap (read further, as this is false; but makes understanding easier).
这些副本中的每个副本都会在新区域中生成一个额外的对象,因此在某个点上至少会有两个相同对象的副本(因此,Shenandoah
起作用需要额外的空间).
Each of these copies will generate an extra object in the new region, thus at some point there will be at least two copies of the same object (extra space required for Shenandoah
to function, as such).
当ThreadA
执行refA.i = 5
(来自上一示例)时,如何知道是否应尝试创建副本,写入该副本以及CAS
与forwarding pointer
相比,只需执行写对象?请记住,这是同时发生的.与concurrentMarkingActive
标志相同的解决方案.有一个标志isEvacuationToADifferentRegionActive
(不是实际名称).如果该标志是true
=> Shenandoah Magic,则只需照此进行写.
When ThreadA
does refA.i = 5
(from the previous example), how does it know if it should try to create a copy, write to that copy and CAS
that forwarding pointer
vs simply do a write to the object? Remember that this happens concurrently. Same solution as with concurrentMarkingActive
flag. There is a flag isEvacuationToADifferentRegionActive
(not the actual name). If that flag is true
=> Shenandoah Magic, else simply do the write as it.
如果您真的了解了这最后一点,您的自然问题应该是:
If you really understood this last point, your natural question should be :
等待秒!这是否意味着对于每个实例,Shenandoah对
isEvacuationToADifferentRegionActive
进行if/else
对每个实例的写操作-是该原语还是引用?这还意味着必须通过forwarding pointer
?"
"WAIT A SECOND! Does this mean that Shenandoah does an
if/else
againstisEvacuationToADifferentRegionActive
for EACH AND SINGLE write to an instance - be that primitive or reference? Also does that mean that EACH read must be accessed via theforwarding pointer
?"
答案曾经是 是;但事情发生了变化:通过此问题(尽管我听起来很多比实际情况更糟).现在他们对整个对象使用Load
障碍,更多详细信息
The answer used to be YES; but things have changed: via this issue (though I make it sound a lot worse than it really is). Now they use Load
barriers for the entire Object, more details here. Instead of having a barrier on each write (that if/else
against the flag) and a dereference via the forwarding pointer
for each read, they moved to a load barrier
. Basically do that if/else
only when you load the object. Since writing to it implies reading first, they thus preserve "to-space invariant". Apparently this is simpler, better and easier to optimize. Hooray!
还记得forwarding pointer
吗?好吧,它已经不存在了.我还不了解它的全部细节(但是),但是它必须做一些可能使用mark word
和from space
的事情,由于增加了负载屏障,因此不再使用了.很多此处有更多详细信息.一旦我了解了它在内部的实际工作原理,便会更新该帖子.
Remember that forwarding pointer
? Well, it does not exist anymore. I don't understand the details in its entire glory (yet), but it has to do something with the possibility to use the mark word
and the from space
that, since the addition of load barriers, is not used anymore. A lot more details here. Once I understand how that really works internally, will update the post.
G1
与Shenandoah
并没有太大区别,但细节之处在于魔鬼.例如,G1
中的Compact
阶段始终是STW
事件. G1
总是 代-即使您不希望这样做(Shenandoah
可以有点像-可以控制此设置),依此类推.
G1
is not VERY much different than what Shenandoah
is, but the devil is in the details. For example Compact
phase in G1
is a STW
event, always. G1
is always generational - even if you want that or not (Shenandoah
can be sort of like that - there is a setting to control this), etc.
这篇关于HotSpot使用的Mark-Compact算法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!