如何在C#/。NET 3.5字典实施? [英] How is the c#/.net 3.5 dictionary implemented?

查看:195
本文介绍了如何在C#/。NET 3.5字典实施?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用的是它采用了一些大词典(可达10 ^ 6元),它的大小是事先未知的,(虽然我可以在某些情况下猜)的应用程序。我不知道该词典是如何实现的,即效果是多么糟糕,如果我不给字典大小的初步估计。它在内部使用(自增长)数组列表的方式做?在这种情况下让字典增长可能会留下很多的LOH大未引用阵列。

I'm using an application which uses a number of large dictionaries ( up to 10^6 elements), the size of which is unknown in advance, (though I can guess in some cases). I'm wondering how the dictionary is implemented, i.e. how bad the effect is if I don't give an initial estimate of the dictionary size. Does it internally use a (self-growing) array in the way List does? in which case letting the dictionaries grow might leave a lot of large un-referenced arrays on the LOH.

推荐答案

使用反射时,我发现以下内容:词典保持在一个结构数组中的数据。它使多少空的地方留下了数组中的计数。当您添加一个项目,没有空的地方留下,它增加了内部数组从旧的阵列到新数组的大小(见下文),并将数据复制。

Using Reflector, I found the following: The Dictionary keeps the data in a struct array. It keeps a count on how many empty places are left in that array. When you add an item and no empty place is left, it increases the size of the internal array (see below) and copies the data from the old array to the new array.

所以我建议,如果你知道会有很多条目,您应该使用在其中设置初始大小的构造。

So I would suggest you should use the constructor in which you set the initial size if you know there will be many entries.

编辑:其实逻辑非常有趣,有一个名为 HashHelpers 找素数的内部类。为了加速此时,它也存储在来自3的静态数组高达7199369一些素数(部分丢失;对于理由,见下文)。当您提供的能力,找到下一个黄金(值相同或更大)从阵列,并将其用作初始容量。如果你给它一个较大数量比阵列中,它开始手动检查。

The logic is actually quite interesting: There is an internal class called HashHelpers to find primes. To speed this up, it also has stored some primes in a static array from 3 up to 7199369 (some are missing; for the reason, see below). When you supply a capacity, it finds the next prime (same value or larger) from the array, and uses that as initial capacity. If you give it a larger number than in its array, it starts checking manually.

所以,如果没有什么作为容量词典过去了,起始容量为三种。

So if nothing is passed as capacity to the Dictionary, the starting capacity is three.

一旦超过容量,它乘以两个电流容量,然后查找使用辅助类下一个较大的素数。这就是为什么在数组中需要并不是每一个素数,因为不是真正需要的素数靠得太近。

Once the capacity is exceeded, it multiplies the current capacity by two and then finds the next larger prime using the helper class. That is why in the array not every prime is needed, since primes "too close together" aren't really needed.

因此​​,如果我们没有传递初始值,我们会得到(我查了内部数组):

So if we pass no initial value, we would get (I checked the internal array):


  1. 3

  2. 7

  3. 17

  4. 37

  5. 71

  6. 163

  7. 353

  8. 761

  9. 1597

  10. 3371

  11. 7013

  12. 14591

  13. 30293

  14. 62851

  15. 130363

  16. 270371

  17. 560689

  18. 1162687

  19. 2411033

  20. 4999559

一旦我们通过这种规模,下一步低于内部阵列之外,它将手动搜索的质数。这将是相当缓慢的。你可以用7199369初始化(数组中的最大值),或考虑是否具有比在词典约5万个条目更可能意味着你应该重新考虑你的设计。

Once we pass this size, the next step falls outside the internal array, and it will manually search for larger primes. This will be quite slow. You could initialize with 7199369 (the largest value in the array), or consider if having more than about 5 million entries in a Dictionary might mean that you should reconsider your design.

这篇关于如何在C#/。NET 3.5字典实施?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆