Why are there so many implementations of Object Pooling in Roslyn?


The [ObjectPool](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis/ObjectPool%25601.cs,20b9a041fb2d5b00) is a type used in the Roslyn C# compiler to reuse frequently used objects which would normally get new'ed up and garbage collected very often. This reduces the number and size of garbage collection operations that have to happen.

The Roslyn compiler seems to have a few separate pools of objects and each pool has a different size. I want to know why there are so many implementations, what the preferred implementation is and why they picked a pool size of 20, 100 or 128.

1 - [SharedPools](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis.Workspaces/Utilities/ObjectPools/SharedPools.cs,b2114905209e7df3) - Stores a pool of 20 objects, or 100 if BigDefault is used. This one is also strange in that it creates a new instance of PooledObject, which makes no sense when we are trying to pool objects rather than create and destroy new ones.

// Example 1 - In a using statement, so the object gets freed at the end.
using (PooledObject<Foo> pooledObject = SharedPools.Default<List<Foo>>().GetPooledObject())
{
    // Do something with pooledObject.Object
}

// Example 2 - No using statement, so you need to be sure no exceptions are thrown.
List<Foo> list = SharedPools.Default<List<Foo>>().AllocateAndClear();
// Do something with list
SharedPools.Default<List<Foo>>().Free(list);

// Example 3 - I have also seen this variation of the above pattern, which ends up the same as Example 1, except Example 1 seems to create a new instance of the IDisposable PooledObject<T> object. This is probably the preferred option if you want fewer GCs.
List<Foo> list = SharedPools.Default<List<Foo>>().AllocateAndClear();
try
{
    // Do something with list
}
finally
{
    SharedPools.Default<List<Foo>>().Free(list);
}

2 - [ListPool](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis.Workspaces/Formatting/ListPool.cs,1086fa28bcfcb8ca) and [StringBuilderPool](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis.Workspaces/Formatting/StringBuilderPool.cs,039ef0c630df07c3) - Not strictly separate implementations, but wrappers around the SharedPools implementation shown above, specifically for List and StringBuilder. So these re-use the pool of objects stored in SharedPools.

// Example 1 - No using statement so you need to be sure no exceptions are thrown.
StringBuilder stringBuilder = StringBuilderPool.Allocate();
// Do something with stringBuilder
StringBuilderPool.Free(stringBuilder);

// Example 2 - Safer version of Example 1.
StringBuilder stringBuilder = StringBuilderPool.Allocate();
try
{
    // Do something with stringBuilder
}
finally
{
    StringBuilderPool.Free(stringBuilder);
}

3 - [PooledDictionary](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis/PooledDictionary.cs,ebb1ac303c777646) and [PooledHashSet](http://source.roslyn.codeplex.com/#Microsoft.CodeAnalysis/PooledHashSet.cs,afe982be5207ab5e) - These use ObjectPool directly and have a totally separate pool of objects. Each stores a pool of 128 objects.

// Example 1
PooledHashSet<Foo> hashSet = PooledHashSet<Foo>.GetInstance();
// Do something with hashSet.
hashSet.Free();

// Example 2 - Safer version of Example 1.
PooledHashSet<Foo> hashSet = PooledHashSet<Foo>.GetInstance();
try
{
    // Do something with hashSet.
}
finally
{
    hashSet.Free();
}

Solution

I'm the lead for the Roslyn performance v-team. All object pools are designed to reduce the allocation rate and, therefore, the frequency of garbage collections. This comes at the expense of adding long-lived (gen 2) objects. This helps compiler throughput slightly, but the major effect is on Visual Studio responsiveness when using VB or C# IntelliSense.

"why there are so many implementations"

There's no quick answer, but I can think of three reasons:

  1. Each implementation serves a slightly different purpose and they are tuned for that purpose.
  2. "Layering" - All the pools are internal and internal details from the Compiler layer may not be referenced from the Workspace layer or vice versa. We do have some code sharing via linked files, but we try to keep it to a minimum.
  3. No great effort has gone into unifying the implementations you see today.

"what the preferred implementation is"

ObjectPool<T> is the preferred implementation and what the majority of code uses. Note that ObjectPool<T> is used by ArrayBuilder<T>.GetInstance() and that's probably the largest user of pooled objects in Roslyn. Because ObjectPool<T> is so heavily used, this is one of the cases where we duplicated code across the layers via linked files. ObjectPool<T> is tuned for maximum throughput.
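
The throughput-oriented fast path can be sketched as follows. This is a minimal illustration under an assumed name (SimpleObjectPool is hypothetical), not the actual Roslyn source, which adds leak tracking and further tuning:

```csharp
using System;
using System.Threading;

// Minimal sketch of an ObjectPool<T>-style pool (hypothetical name, not the
// actual Roslyn source). A fixed array of slots plus a factory delegate:
// Allocate atomically takes a pooled instance if one is available, otherwise
// falls back to the factory; Free puts the instance back into a free slot.
public sealed class SimpleObjectPool<T> where T : class
{
    private readonly T[] _items;
    private readonly Func<T> _factory;

    public SimpleObjectPool(Func<T> factory, int size = 20)
    {
        _factory = factory;
        _items = new T[size];
    }

    public T Allocate()
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Interlocked.Exchange makes the take atomic under concurrency.
            T item = Interlocked.Exchange(ref items[i], null);
            if (item != null)
                return item;
        }
        return _factory(); // pool "miss": pay for a fresh allocation
    }

    public void Free(T item)
    {
        T[] items = _items;
        for (int i = 0; i < items.Length; i++)
        {
            // Store into the first empty slot; if the pool is full the
            // object is simply dropped and left to the GC.
            if (Interlocked.CompareExchange(ref items[i], item, null) == null)
                return;
        }
    }
}
```

Callers pair Allocate with Free in a try/finally, exactly like the ArrayBuilder<T>.GetInstance() pattern shown in the question.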

At the workspace layer, you'll see that SharedPool<T> tries to share pooled instances across disjoint components to reduce overall memory usage. We were trying to avoid having each component create its own pool dedicated to a specific purpose and, instead, share based on the type of element. A good example of this is the StringBuilderPool.
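
The type-based sharing can be sketched like this (TypeSharedPool is a hypothetical name, not the Roslyn API). Because the CLR creates one set of statics per closed generic type, every component that pools the same element type reaches the same pool instance instead of creating a private one:

```csharp
using System.Collections.Generic;

// Sketch of the sharing idea behind SharedPool<T> (hypothetical name, not the
// Roslyn API). The static state is per closed generic type, so
// TypeSharedPool<List<int>> is one shared pool for every caller in the process.
public static class TypeSharedPool<T> where T : class, new()
{
    private const int MaxSize = 20; // mirrors SharedPools' small default
    private static readonly Stack<T> s_items = new Stack<T>();

    public static T Allocate()
    {
        lock (s_items)
        {
            return s_items.Count > 0 ? s_items.Pop() : new T();
        }
    }

    public static void Free(T item)
    {
        lock (s_items)
        {
            if (s_items.Count < MaxSize)
                s_items.Push(item);
        }
    }
}
```

A wrapper like StringBuilderPool then only needs to add type-specific reset logic (clearing the builder) on top of a pool shared this way.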

"why they picked a pool size of 20, 100 or 128"

Usually, this is the result of profiling and instrumentation under typical workloads. We usually have to strike a balance between allocation rate ("misses" in the pool) and the total live bytes in the pool. The two factors at play are:

  1. The maximum degree of parallelism (concurrent threads accessing the pool)
  2. The access pattern, including overlapped allocations and nested allocations.

In the grand scheme of things, the memory held by objects in the pool is very small compared to the total live memory (size of the Gen 2 heap) for a compilation. However, we also take care not to return giant objects (typically large collections) back to the pool - we'll just drop them on the floor with a call to ForgetTrackedObject.
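
The "drop it on the floor" policy can be sketched as follows (ListPoolSketch and its thresholds are hypothetical, not Roslyn's ForgetTrackedObject itself; thread-safety is omitted for brevity):

```csharp
using System.Collections.Generic;

// Sketch of the policy described above: returning a list that has grown huge
// would pin a large gen-2 object in the pool forever, so oversized instances
// are abandoned to the GC instead of being put back.
public static class ListPoolSketch<T>
{
    private const int MaxRetainedCapacity = 1024; // illustrative threshold
    private const int MaxPoolSize = 128;
    private static readonly Stack<List<T>> s_pool = new Stack<List<T>>();

    public static List<T> Allocate() =>
        s_pool.Count > 0 ? s_pool.Pop() : new List<T>();

    public static void Free(List<T> list)
    {
        if (list.Capacity > MaxRetainedCapacity)
            return; // "forget" the giant object: never put it back

        list.Clear();
        if (s_pool.Count < MaxPoolSize)
            s_pool.Push(list);
    }
}
```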

For the future, I think one area we can improve is to have pools of byte arrays (buffers) with constrained lengths. This will help, in particular, the MemoryStream implementation in the emit phase (PEWriter) of the compiler. These MemoryStreams require contiguous byte arrays for fast writing but they are dynamically sized. That means they occasionally need to resize - usually doubling in size each time. Each resize is a new allocation, but it would be nice to be able to grab a resized buffer from a dedicated pool and return the smaller buffer back to a different pool. So, for example, you would have a pool for 64-byte buffers, another for 128-byte buffers and so on. The total pool memory would be constrained, but you avoid "churning" the GC heap as buffers grow.
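
This is essentially what System.Buffers.ArrayPool<T> later provided in .NET: buffers are bucketed by size, Rent takes one from the matching bucket, and Return puts it back. The resize pattern for a MemoryStream-like writer looks roughly like this (Grow is a hypothetical helper, not part of any Roslyn API):

```csharp
using System;
using System.Buffers;

static class BufferGrower
{
    // Grow a buffer using the bucketed shared pool: rent a larger array,
    // copy the contents across, and return the smaller buffer to its own
    // bucket instead of letting it become garbage.
    public static byte[] Grow(byte[] buffer, int needed)
    {
        if (buffer.Length >= needed)
            return buffer;

        // Rent returns an array of at least the requested length,
        // typically the next power-of-two bucket size.
        byte[] bigger = ArrayPool<byte>.Shared.Rent(Math.Max(needed, buffer.Length * 2));
        Buffer.BlockCopy(buffer, 0, bigger, 0, buffer.Length);

        // The smaller buffer goes back to the pool, so repeated doubling
        // does not churn the GC heap.
        ArrayPool<byte>.Shared.Return(buffer);
        return bigger;
    }
}
```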

Thanks again for the question.

Paul Harrington.
