Should a .NET generic dictionary be initialised with a capacity equal to the number of items it will contain?

Question

If I have, say, 100 items that'll be stored in a dictionary, should I initialise it thus?

var myDictionary = new Dictionary<Key, Value>(100);

My understanding is that the .NET dictionary internally resizes itself when it reaches a given loading, and that the loading threshold is defined as a ratio of the capacity.

That would suggest that if 100 items were added to the above dictionary, then it would resize itself when one of the items was added. Resizing a dictionary is something I'd like to avoid as it has a performance hit and is wasteful of memory.

The probability of hashing collisions is proportional to the loading in a dictionary. Therefore, even if the dictionary does not resize itself (and uses all of its slots) then the performance must degrade due to these collisions.

How should one best decide what capacity to initialise the dictionary to, assuming you know how many items will be inside the dictionary?

Recommended answer

The capacity you should initialize the dictionary to depends on two factors: (1) the distribution of the GetHashCode function, and (2) how many items you have to insert.

Your hash function should either be randomly distributed or be specially formulated for your set of input. Let's assume the first; if you are interested in the second, look up perfect hash functions.

If you have 100 items to insert into the dictionary, a randomly distributed hash function, and a capacity of 100, then when you insert the i-th item there is at most an (i-1) / 100 probability that it will collide with an item already in the table (exactly (i-1) / 100 if every earlier item landed in its own slot). If you want to lower this probability of collision, increase the capacity. Doubling the capacity roughly halves the chance of collision.
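To put a rough number on that, here is a back-of-the-envelope sum under the same simplified model. The symbols n (items inserted), m (capacity) and C (number of insertions that collide) are introduced here for illustration, and the result is an upper estimate because earlier items may already share slots.

```latex
\[
  \mathbb{E}[C] \;\lesssim\; \sum_{i=1}^{n} \frac{i-1}{m} \;=\; \frac{n(n-1)}{2m}
\]
\[
  n = 100,\; m = 100:\ \frac{100 \cdot 99}{200} = 49.5
  \qquad
  n = 100,\; m = 200:\ \frac{100 \cdot 99}{400} = 24.75
\]
```

So under this model, doubling the capacity from 100 to 200 cuts the estimated number of colliding insertions from about 50 to about 25, which is the "halving" described above.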

Furthermore, if you know how frequently you are going to be accessing each item in the dictionary you may want to insert the items in order of decreasing frequency since the items that you insert first will be on average faster to access.
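As a minimal sketch of the capacity advice above: the helper below is hypothetical (PresizeSketch and Build are not part of .NET), and it assumes a runtime where Dictionary<TKey, TValue>.EnsureCapacity is available (.NET Core 2.1 or later; it does not exist on .NET Framework). Calling EnsureCapacity(0) is used here only to read back the current capacity, which lets you check whether filling the dictionary caused it to grow. As far as I know, Dictionary<TKey, TValue> grows only when its entry storage is full rather than at a fractional load threshold, and it may round the requested capacity up, so treat the printed numbers as something to observe rather than rely on.

```csharp
using System;
using System.Collections.Generic;

public static class PresizeSketch
{
    // Hypothetical helper: creates a dictionary sized for expectedCount items,
    // optionally multiplied by headroomFactor to spread items over more buckets.
    public static Dictionary<int, string> Build(int expectedCount, int headroomFactor = 1)
    {
        var dictionary = new Dictionary<int, string>(expectedCount * headroomFactor);
        for (int i = 0; i < expectedCount; i++)
        {
            dictionary.Add(i, "item " + i);
        }
        return dictionary;
    }

    public static void Main()
    {
        // Capacity requested up front, read back before and after filling.
        var exact = new Dictionary<int, string>(100);
        int capacityBefore = exact.EnsureCapacity(0); // reports current capacity, never shrinks
        for (int i = 0; i < 100; i++)
        {
            exact.Add(i, "item " + i);
        }
        int capacityAfter = exact.EnsureCapacity(0);
        Console.WriteLine($"exact: before={capacityBefore}, after={capacityAfter}");

        // Same item count with twice the requested capacity.
        var roomy = Build(100, headroomFactor: 2);
        Console.WriteLine($"roomy: {roomy.Count} items, capacity {roomy.EnsureCapacity(0)}");
    }
}
```

If the two capacities printed for "exact" match, no resize happened while the 100 items were added. Whether the extra headroom in "roomy" is worth the memory depends on how well your key type's GetHashCode spreads values, so it is worth measuring lookups on your own data.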
