HashSet 与列表性能 [英] HashSet vs. List performance
问题描述
很明显,通用HashSet
类的搜索性能高于通用List
类.只需将基于散列的密钥与 List
类中的线性方法进行比较.
It's clear that a search performance of the generic HashSet<T>
class is higher than of the generic List<T>
class. Just compare the hash-based key with the linear approach in the List<T>
class.
然而,计算散列键本身可能需要一些 CPU 周期,因此对于少量项目,线性搜索可以真正替代 HashSet
.
However calculating a hash key may itself take some CPU cycles, so for a small amount of items the linear search can be a real alternative to the HashSet<T>
.
我的问题:盈亏平衡点在哪里?
My question: where is the break-even?
为了简化场景(公平起见),我们假设 List
类使用元素的 Equals()
方法来标识一个项目.>
To simplify the scenario (and to be fair) let's assume that the List<T>
class uses the element's Equals()
method to identify an item.
推荐答案
很多人都说,一旦达到速度实际上是一个问题的规模,HashSet
将永远击败List
,但这取决于你在做什么.
A lot of people are saying that once you get to the size where speed is actually a concern that HashSet<T>
will always beat List<T>
, but that depends on what you are doing.
假设您有一个 List
,其中平均只有 5 个项目.在大量循环中,如果每个循环添加或删除单个项目,则最好使用 List
.
Let's say you have a List<T>
that will only ever have on average 5 items in it. Over a large number of cycles, if a single item is added or removed each cycle, you may well be better off using a List<T>
.
我在我的机器上为此做了一个测试,而且,它必须非常非常小才能从 List
中获得优势.对于短字符串列表,在大小为 5 之后,对于大小为 20 之后的对象,优势就消失了.
I did a test for this on my machine, and, well, it has to be very very small to get an advantage from List<T>
. For a list of short strings, the advantage went away after size 5, for objects after size 20.
1 item LIST strs time: 617ms
1 item HASHSET strs time: 1332ms
2 item LIST strs time: 781ms
2 item HASHSET strs time: 1354ms
3 item LIST strs time: 950ms
3 item HASHSET strs time: 1405ms
4 item LIST strs time: 1126ms
4 item HASHSET strs time: 1441ms
5 item LIST strs time: 1370ms
5 item HASHSET strs time: 1452ms
6 item LIST strs time: 1481ms
6 item HASHSET strs time: 1418ms
7 item LIST strs time: 1581ms
7 item HASHSET strs time: 1464ms
8 item LIST strs time: 1726ms
8 item HASHSET strs time: 1398ms
9 item LIST strs time: 1901ms
9 item HASHSET strs time: 1433ms
1 item LIST objs time: 614ms
1 item HASHSET objs time: 1993ms
4 item LIST objs time: 837ms
4 item HASHSET objs time: 1914ms
7 item LIST objs time: 1070ms
7 item HASHSET objs time: 1900ms
10 item LIST objs time: 1267ms
10 item HASHSET objs time: 1904ms
13 item LIST objs time: 1494ms
13 item HASHSET objs time: 1893ms
16 item LIST objs time: 1695ms
16 item HASHSET objs time: 1879ms
19 item LIST objs time: 1902ms
19 item HASHSET objs time: 1950ms
22 item LIST objs time: 2136ms
22 item HASHSET objs time: 1893ms
25 item LIST objs time: 2357ms
25 item HASHSET objs time: 1826ms
28 item LIST objs time: 2555ms
28 item HASHSET objs time: 1865ms
31 item LIST objs time: 2755ms
31 item HASHSET objs time: 1963ms
34 item LIST objs time: 3025ms
34 item HASHSET objs time: 1874ms
37 item LIST objs time: 3195ms
37 item HASHSET objs time: 1958ms
40 item LIST objs time: 3401ms
40 item HASHSET objs time: 1855ms
43 item LIST objs time: 3618ms
43 item HASHSET objs time: 1869ms
46 item LIST objs time: 3883ms
46 item HASHSET objs time: 2046ms
49 item LIST objs time: 4218ms
49 item HASHSET objs time: 1873ms
这是显示为图表的数据:
Here is that data displayed as a graph:
代码如下:
static void Main(string[] args)
{
int times = 10000000;
for (int listSize = 1; listSize < 10; listSize++)
{
List<string> list = new List<string>();
HashSet<string> hashset = new HashSet<string>();
for (int i = 0; i < listSize; i++)
{
list.Add("string" + i.ToString());
hashset.Add("string" + i.ToString());
}
Stopwatch timer = new Stopwatch();
timer.Start();
for (int i = 0; i < times; i++)
{
list.Remove("string0");
list.Add("string0");
}
timer.Stop();
Console.WriteLine(listSize.ToString() + " item LIST strs time: " + timer.ElapsedMilliseconds.ToString() + "ms");
timer = new Stopwatch();
timer.Start();
for (int i = 0; i < times; i++)
{
hashset.Remove("string0");
hashset.Add("string0");
}
timer.Stop();
Console.WriteLine(listSize.ToString() + " item HASHSET strs time: " + timer.ElapsedMilliseconds.ToString() + "ms");
Console.WriteLine();
}
for (int listSize = 1; listSize < 50; listSize+=3)
{
List<object> list = new List<object>();
HashSet<object> hashset = new HashSet<object>();
for (int i = 0; i < listSize; i++)
{
list.Add(new object());
hashset.Add(new object());
}
object objToAddRem = list[0];
Stopwatch timer = new Stopwatch();
timer.Start();
for (int i = 0; i < times; i++)
{
list.Remove(objToAddRem);
list.Add(objToAddRem);
}
timer.Stop();
Console.WriteLine(listSize.ToString() + " item LIST objs time: " + timer.ElapsedMilliseconds.ToString() + "ms");
timer = new Stopwatch();
timer.Start();
for (int i = 0; i < times; i++)
{
hashset.Remove(objToAddRem);
hashset.Add(objToAddRem);
}
timer.Stop();
Console.WriteLine(listSize.ToString() + " item HASHSET objs time: " + timer.ElapsedMilliseconds.ToString() + "ms");
Console.WriteLine();
}
Console.ReadLine();
}
这篇关于HashSet 与列表性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!