卡片表和作者屏障实际上如何工作? [英] How actually card table and writer barrier works?

查看:123
本文介绍了卡片表和作者屏障实际上如何工作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在阅读一些关于Java中垃圾收集的材料,以深入了解GC过程中真正发生的事情。



我遇到了名为卡表。我谷歌搜索了它,并没有找到全面的信息。大部分解释都很浅,并且描述得像一些魔术。



我的问题是:卡片表和写入屏障如何工作?卡表中标记了什么?那么垃圾收集器如何知道这个特定的对象被另一个对象所引用,并保留在老一代中。



我想对这个机制有一些实际的想象,就像我应该准备一些模拟。

解决方案

我不知道你是否发现了一些格外糟糕的描述,或者你是否期望太多细节,对于我见过的解释。如果描述简洁明了,那是因为它确实是一个相当简单的机制。



正如你显然已经知道的那样,一代垃圾收集器需要能够枚举指代年轻物体的旧物体。扫描所有旧物体是正确的,但这破坏了世代方法的优点,所以你必须缩小它的范围。无论你如何做,都需要一个写屏障 - 每当成员变量(引用类型的)被分配/写入时执行一段代码。如果新引用指向一个年轻对象并且它存储在一个旧对象中,则写入屏障会记录垃圾收集的事实。区别在于它如何记录。有确切的方案使用所谓的记忆集合,每个 旧对象的集合,它具有(在某个时刻)对年轻对象的引用。正如你可以想象的那样,这需要相当多的空间。



卡表是一种折衷:不是告诉你哪些对象正好 >包含年轻的指针(或者至少在某些时候),它将对象分组到固定大小的桶中,并跟踪哪些桶包含带有年轻指针的对象。这当然会减少空间使用量。为了保持正确性,只要你对这些对象保持一致,你如何挖掘这些对象并不重要。为了提高效率,您只需按照内存地址对它们进行分组(因为您可以免费获得),然后再除以2的较大幂数(以使该分区成为便宜的按位操作)。

另外,不是维护一个明确的桶列表,而是为每个可能的桶提前预留一些空间。具体来说,有一个N位或字节的数组,其中N是桶的数量,所以如果 i i th的值为0 th桶不包含年轻的指针,或者1包含年轻的指针。这是卡表正确的。通常,这个空间被分配并释放,同时还有一大堆作为堆的一部分的内存。如果不需要增长的话,它甚至可以嵌入到内存块的开始部分。除非整个地址空间被用作堆(这是非常罕见的),否则上面的公式给出的数字从 start_of_memory_region>> K >而不是0,所以为了得到卡表的索引,你必须减去堆的开始地址的开始。



总之,当写屏障发现语句 some_obj.field = other_obj; 将一个年轻指针存储在一个旧对象中时,它会这样做:

  card_table [(&old_obj  -  start_of_heap)>> K] = 1; 

其中& old_obj 是地址这个对象现在有一个年轻的指针(因为它刚刚决定引用一个旧对象,所以它已经在一个寄存器中)。
在次要GC期间,垃圾收集器查看卡表以确定哪些堆区要扫描年轻指针:

 对于i从0到(heap_size>> K):
如果card_table [i]:
扫描堆[i<< K ..(i + 1)< K]为年轻的指针


I am reading some materials about garbage collection in Java in order to get to know more deeply what really happens in GC process.

I came across on the mechanism called "card table". I've Googled for it and haven't found comprehensive information. Most of explanations are rather shallow and describes it like some magic.

My question is: How card table and write barrier works? What is marked in card tables? How then garbage collector knows that particular object is referenced by another object persisted in older generation.

I would like to have some practical imagination about that mechanism, like I was supposed to prepare some simulation.

解决方案

I don't know whether you found some exceptionally bad description or whether you expect too many details, I've been quite satisfied with the explanations I've seen. If the descriptions are brief and sound simplistic, that's because it really is a rather simple mechanism.

As you apparently already know, a generational garbage collector needs to be able to enumerate old objects that refer to young objects. It would be correct to scan all old objects, but that destroys the advantages of the generational approach, so you have to narrow it down. Regardless of how you do that, you need a write barrier - a piece of code executed whenever a member variable (of a reference type) is assigned/written to. If the new reference points to a young object and it's stored in an old object, the write barrier records that fact for the garbage collect. The difference lies in how it's recorded. There are exact schemes using so-called remembered sets, a collection of every old object that has (had at some point) a reference to a young object. As you can imagine, this takes quite a bit of space.

The card table is a trade-off: Instead of telling you which objects exactly contains young pointers (or at least did at some point), it groups objects into fixed-sized buckets and tracks which buckets contain objects with young pointers. This, of course, reduces space usage. For correctness, it doesn't really matter how you bucket the objects, as long as you're consistent about it. For efficiency, you just group them by their memory address (because you have that available for free), divided by some larger power of two (to make the division a cheap bitwise operation).

Also, instead of maintaining an explicit list of buckets, you reserve some space for each possible bucket up-front. Specifically, there is an array of N bits or bytes, where N is the number of buckets, so that the ith value is 0 if the ith bucket contains no young pointers, or 1 if it does contain young pointers. This is the card table proper. Typically this space is allocated and freed along with a large block of memory used as (part of) the heap. It may even be embedded in the start of the memory block, if it doesn't need to grow. Unless the entire address space is used as heap (which is very rare), the above formula gives numbers starting from start_of_memory_region >> K instead of 0, so to get an index into the card table you have to subtract the start of the start address of the heap.

In summary, when the write barrier finds that the statement some_obj.field = other_obj; stores a young pointer in an old object, it does this:

card_table[(&old_obj - start_of_heap) >> K] = 1;

Where &old_obj is the address of the object that now has a young pointer (which is already in a register because it was just determined to refer to an old object). During minor GC, the garbage collector looks at the card table to determine which heap regions to scan for young pointers:

for i from 0 to (heap_size >> K):
    if card_table[i]:
        scan heap[i << K .. (i + 1) << K] for young pointers

这篇关于卡片表和作者屏障实际上如何工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆