为什么HashSet< Point>比HashSet< string>慢多少？ [英] Why is HashSet<Point> so much slower than HashSet<string>?

查看：82 发布时间：2020/5/30 19:41:11 c# .net performance collections hashset

本文介绍了为什么HashSet< Point>比HashSet< string>慢多少？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我想存储一些像素位置而不允许重复，所以首先想到的是 HashSet< Point> 或类似的类。但是，与 HashSet< string> 之类的东西相比，这似乎非常慢。

例如，以下代码：

  HashSet< Point> points = new HashSet< Point>（）; 
使用（位图img =新位图（1000，1000））
 {
 for（int x = 0; x< img.Width; x ++）
 {
表示（int y = 0; y< img.Height; y ++）
 {
 points.Add（new Point（x，y））; 
} 
} 
}

大约需要22.5秒。 / p>

以下代码（出于明显的原因，这不是一个好选择）仅需1.6秒：

  HashSet< string> points = new HashSet< string>（）; 
使用（位图img =新位图（1000，1000））
 {
 for（int x = 0; x< img.Width; x ++）
 {
表示（int y = 0; y< img.Height; y ++）
 {
 points.Add（x +， + y）; 
} 
} 
}

所以，我的问题是：

是否有原因？我检查了此答案，但是22.5秒远比该答案中显示的数字还要长。

是否有更好的方法来存储点而不重复？

解决方案

有Point结构引起的两个性能问题。将 Console.WriteLine（GC.CollectionCount（0））; 添加到测试代码中时，您可以看到一些内容。您会看到Point测试需要大约3720个集合，而字符串测试只需要大约18个集合。不是免费的。当您看到一个值类型引发了如此多的集合时，您需要得出结论呃，哦，太多拳击。

问题在于 HashSet< T> 需要 IEqualityComparer< T> 才能完成工作。由于您没有提供，因此需要回退到 EqualityComparer.Default< T>（）返回的值。该方法可以很好地处理字符串，它实现了IEquatable。但是不是Point的一种，它是从.NET 1.0衍生出来的一种类型，从未受到泛型的热爱。它所能做的就是使用Object方法。

另一个问题是Point.GetHashCode（）在此测试中做得不好，有太多的冲突，所以它非常重锤Object.Equals（）。 String具有出色的GetHashCode实现。

您可以通过为HashSet提供良好的比较器来解决这两个问题。像这样：

  class PointComparer：IEqualityComparer< Point> {
 public bool Equals（Point x，y点）{
 return x.X == y.X&& x.Y == y.Y; 
} 
 
 public int GetHashCode（Point obj）{
 //适用于实际位图的完美哈希，其宽度/高度从不> = 65536 
 return（obj .Y<< 16）^ obj.X; 
} 
}

并使用它：

  HashSet< Point> list = new HashSet< Point>（new PointComparer（））;

现在它的速度快了150倍左右，很容易击败了字符串测试。

I wanted to store some pixels locations without allowing duplicates, so the first thing comes to mind is HashSet<Point> or similar classes. However this seems to be very slow compared to something like HashSet<string>.

For example, this code:

HashSet<Point> points = new HashSet<Point>();
using (Bitmap img = new Bitmap(1000, 1000))
{
    for (int x = 0; x < img.Width; x++)
    {
        for (int y = 0; y < img.Height; y++)
        {
            points.Add(new Point(x, y));
        }
    }
}

takes about 22.5 seconds.

While the following code (which is not a good choice for obvious reasons) takes only 1.6 seconds:

HashSet<string> points = new HashSet<string>();
using (Bitmap img = new Bitmap(1000, 1000))
{
    for (int x = 0; x < img.Width; x++)
    {
        for (int y = 0; y < img.Height; y++)
        {
            points.Add(x + "," + y);
        }
    }
}

So, my questions are:

Is there a reason for that? I checked this answer, but 22.5 sec is way more than the numbers shown in that answer.
Is there a better way to store points without duplicates?

解决方案

There are two perf problems induced by the Point struct. Something you can see when you add Console.WriteLine(GC.CollectionCount(0)); to the test code. You'll see that the Point test requires ~3720 collections but the string test only needs ~18 collections. Not for free. When you see a value type induce so many collections then you need to conclude "uh-oh, too much boxing".

At issue is that HashSet<T> needs an IEqualityComparer<T> to get its job done. Since you did not provide one, it needs to fall back to one returned by EqualityComparer.Default<T>(). That method can do a good job for string, it implements IEquatable. But not for Point, it is a type that harks from .NET 1.0 and never got the generics love. All it can do is use the Object methods.

The other issue is that Point.GetHashCode() does not do a stellar job in this test, too many collisions, so it hammers Object.Equals() pretty heavily. String has an excellent GetHashCode implementation.

You can solve both problems by providing the HashSet with a good comparer. Like this one:

class PointComparer : IEqualityComparer<Point> {
    public bool Equals(Point x, Point y) {
        return x.X == y.X && x.Y == y.Y;
    }

    public int GetHashCode(Point obj) {
        // Perfect hash for practical bitmaps, their width/height is never >= 65536
        return (obj.Y << 16) ^ obj.X;
    }
}

And use it:

HashSet<Point> list = new HashSet<Point>(new PointComparer());

And it is now about 150 times faster, easily beating the string test.

这篇关于为什么HashSet< Point>比HashSet< string>慢多少？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

为什么HashSet< Point>比HashSet< string>慢多少？ [英] Why is HashSet<Point> so much slower than HashSet<string>?

问题描述

相关文章

C#/.NET最新文章

热门教程

热门工具

登录关闭

为什么HashSet&lt; Point&gt;比HashSet&lt; string&gt;慢多少？ [英] Why is HashSet&lt;Point&gt; so much slower than HashSet&lt;string&gt;?

问题描述

相关文章

C#/.NET最新文章

热门教程

热门工具

登录 关闭

为什么HashSet< Point>比HashSet< string>慢多少？ [英] Why is HashSet<Point> so much slower than HashSet<string>?

登录关闭