C# performant alternatives to HashSet and Dictionary that do not use GetHashCode


Question

I'm looking for built-in alternatives to HashSet and Dictionary that perform better than lists but do not rely on GetHashCode internally. I need this because, for the class I have written, there is no way to write a GetHashCode method that fulfills the usual contract with Equals other than

public override int GetHashCode() { return 0; } // or return any other constant value

which would make HashSet and Dictionary perform like ordinary lists.
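To make that cost concrete, here is a minimal sketch that counts how often Equals is called when every key returns the same hash code. The type ConstantHashKey and the helper CountEqualsCalls are hypothetical names made up for this illustration, and the key uses exact equality rather than a tolerance, so only the degenerate hash is being measured.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: equality is exact, only the hash code is degenerate.
public sealed class ConstantHashKey
{
    public static long EqualsCalls;   // counts Equals invocations across all instances
    private readonly int _id;

    public ConstantHashKey(int id) { _id = id; }

    public override bool Equals(object obj)
    {
        EqualsCalls++;
        return obj is ConstantHashKey other && other._id == _id;
    }

    // Every instance reports the same hash code, so all keys share one bucket.
    public override int GetHashCode() { return 0; }
}

public static class ConstantHashDemo
{
    // Adds n distinct keys to a HashSet and returns how many Equals calls that took.
    public static long CountEqualsCalls(int n)
    {
        ConstantHashKey.EqualsCalls = 0;
        var set = new HashSet<ConstantHashKey>();
        for (int i = 0; i < n; i++)
            set.Add(new ConstantHashKey(i));
        return ConstantHashKey.EqualsCalls;
    }

    public static void Main()
    {
        foreach (int n in new[] { 100, 200, 400 })
            Console.WriteLine($"n = {n}: {CountEqualsCalls(n)} Equals calls");
    }
}
```

Doubling n should roughly quadruple the number of Equals calls, which is the behavior of a linear scan per insertion rather than of a hash lookup.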

So what I need is a set implementation and a map implementation. Any suggestions?

My class is a tolerance-based 3-dimensional vector class:

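using System;

public class Vector
{
    // A const member is implicitly static; "static const" does not compile in C#.
    public const double TOL = 1E-10;
    private readonly double x, y, z;

    public Vector(double x, double y, double z)
    {
        this.x = x; this.y = y; this.z = z;
    }

    // Read-only accessors for the components
    public double X => x;
    public double Y => y;
    public double Z => z;

    // Tolerance-based equality: every component must agree to within TOL
    public override bool Equals(object o)
    {
        Vector other = o as Vector;

        if (other == null)
            return false;

        return ((Math.Abs(x - other.x) <= TOL) &&
                (Math.Abs(y - other.y) <= TOL) &&
                (Math.Abs(z - other.z) <= TOL));
    }
}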

Note that my Equals method is not transitive. In my use case, however, I can make it "locally" transitive: at some point I will know all the vectors that I need to put into my set or map key set, and I also know that they come in clusters. Once I have collected all vectors, I will choose one representative per cluster and replace every original vector with its representative. Equals will then be transitive among the elements of my set or map key set.
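The following sketch illustrates both points: three vectors for which the tolerance comparison is not transitive, and one possible way to pick a representative per cluster. Vec3, IsNear and PickRepresentatives are hypothetical names introduced only for this example. The greedy pass is an assumption about how the representatives might be chosen, not the asker's actual procedure, and it relies on the clusters being tight and well separated.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Vector class, with the same tolerance rule.
public readonly struct Vec3
{
    public const double Tol = 1E-10;
    public readonly double X, Y, Z;

    public Vec3(double x, double y, double z) { X = x; Y = y; Z = z; }

    // True when every component differs by at most Tol.
    public bool IsNear(Vec3 o) =>
        Math.Abs(X - o.X) <= Tol &&
        Math.Abs(Y - o.Y) <= Tol &&
        Math.Abs(Z - o.Z) <= Tol;
}

public static class Clustering
{
    // Greedy pass: the first vector seen in each cluster becomes its representative.
    // Assumes tight, well-separated clusters; costs O(n * k) for n vectors and k clusters.
    public static List<Vec3> PickRepresentatives(IEnumerable<Vec3> vectors)
    {
        var reps = new List<Vec3>();
        foreach (var v in vectors)
        {
            if (!reps.Exists(r => r.IsNear(v)))
                reps.Add(v);
        }
        return reps;
    }

    public static void Main()
    {
        var a = new Vec3(0, 0, 0);
        var b = new Vec3(0.8E-10, 0, 0);
        var c = new Vec3(1.6E-10, 0, 0);
        // a is near b and b is near c, yet a is not near c.
        Console.WriteLine($"{a.IsNear(b)} {b.IsNear(c)} {a.IsNear(c)}");

        var input = new[]
        {
            new Vec3(0, 0, 0), new Vec3(0.3E-10, 0, 0),   // cluster around the origin
            new Vec3(1, 0, 0), new Vec3(1, 0.2E-10, 0),   // cluster around (1, 0, 0)
        };
        Console.WriteLine($"representatives: {PickRepresentatives(input).Count}");
    }
}
```

Once every vector has been replaced by its representative, no two stored keys are within tolerance of each other, so equality among the stored keys behaves transitively.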

Once I have my set or map, I will collect vectors from another source (for the sake of this question, assume I ask a user to type one in). These can be any possible vectors. They will never be added to the set or map, but I need to know whether they are contained in the set or in the map's key set (within tolerance), and I need to look up their values in the map.

Recommended answer

You need a data structure that supports sorting, binary search and fast insertion. Unfortunately, there is no such collection in the .NET Framework. SortedDictionary doesn't support binary search, while SortedList suffers from O(n) insertion for unsorted data. So you have to look for a third-party tool. A good candidate seems to be the TreeDictionary of the C5 library. It is a red-black tree implementation that offers the important RangeFromTo method. Here is an incomplete implementation of a dictionary that has Vectors as keys and is backed internally by a C5.TreeDictionary:

using System.Linq;

public class VectorDictionary<TValue>
{
    // Entries are indexed by the coordinate sum X + Y + Z, a cheap 1-D projection.
    // Caveat: two non-equal vectors with exactly the same sum would share a tree key,
    // which this incomplete implementation does not handle.
    private C5.TreeDictionary<double, (Vector, TValue)> _tree =
        new C5.TreeDictionary<double, (Vector, TValue)>();

    private static double Sum(Vector v) => v.X + v.Y + v.Z;

    public bool TryGetKeyValue(Vector key, out (Vector, TValue) pair)
    {
        double xyz = Sum(key);
        // Equal vectors differ by at most TOL per coordinate, so by at most 3 * TOL in the sum.
        // Hoping that not too many vectors share nearly the same coordinate sum
        var range = _tree.RangeFromTo(xyz - Vector.TOL * 3, xyz + Vector.TOL * 3);
        var equalPairs = range.Where(e => e.Value.Item1.Equals(key));
        // Selecting a vector from many "equal" vectors is tricky.
        // Some may be more equal than others. :-) Let's return the first for now.
        var selectedPair = equalPairs.FirstOrDefault().Value;
        pair = selectedPair;
        return selectedPair.Item1 != null;
    }

    public Vector GetExisting(Vector key)
    {
        return TryGetKeyValue(key, out var pair) ? pair.Item1 : default;
    }

    public bool Contains(Vector key) => TryGetKeyValue(key, out var _);

    public bool Add(Vector key, TValue value)
    {
        if (Contains(key)) return false;
        _tree.Add(Sum(key), (key, value));
        return true;
    }

    public TValue this[Vector key]
    {
        get => TryGetKeyValue(key, out var pair) ? pair.Item2 : default;
        set
        {
            // Overwrite the entry of an existing equal key; otherwise add a new entry.
            if (TryGetKeyValue(key, out var pair))
                _tree[Sum(pair.Item1)] = (pair.Item1, value);
            else
                _tree.Add(Sum(key), (key, value));
        }
    }

    public int Count => _tree.Count;
}

Usage example:

var dictionary = new VectorDictionary<int>();
Console.WriteLine($"Added: {dictionary.Add(new Vector(0.5 * 1E-10, 0, 0), 1)}");
Console.WriteLine($"Added: {dictionary.Add(new Vector(0.6 * 1E-10, 0, 0), 2)}");
Console.WriteLine($"Added: {dictionary.Add(new Vector(1.6 * 1E-10, 0, 0), 3)}");
Console.WriteLine($"dictionary.Count: {dictionary.Count}");
Console.WriteLine($"dictionary.Contains: {dictionary.Contains(new Vector(2.5 * 1E-10, 0, 0))}");
Console.WriteLine($"dictionary.GetValue: {dictionary[new Vector(2.5 * 1E-10, 0, 0)]}");

Output:

Added: True
Added: False
Added: True
dictionary.Count: 2
dictionary.Contains: True
dictionary.GetValue: 3
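Because the question states that all keys are known before any lookup happens, the same sum-projection idea can also be sketched with built-in types only: sort the entries once, then binary-search the candidate window. This is a variation under that assumption, not part of the answer above. Vec3 and FrozenVectorMap are hypothetical names, the map is read-only after construction, and the ±3 * Tol window mirrors the answer's choice without extra slack for floating-point rounding.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Vector class from the question.
public sealed class Vec3
{
    public const double Tol = 1E-10;
    public double X { get; }
    public double Y { get; }
    public double Z { get; }
    public double Sum => X + Y + Z;

    public Vec3(double x, double y, double z) { X = x; Y = y; Z = z; }

    // True when every component differs by at most Tol.
    public bool IsNear(Vec3 o) =>
        Math.Abs(X - o.X) <= Tol &&
        Math.Abs(Y - o.Y) <= Tol &&
        Math.Abs(Z - o.Z) <= Tol;
}

// Read-only map built once from all known keys, then queried with tolerance.
public sealed class FrozenVectorMap<TValue>
{
    private readonly (Vec3 Key, TValue Value)[] _entries; // sorted by coordinate sum
    private readonly double[] _sums;                      // _sums[i] == _entries[i].Key.Sum

    public FrozenVectorMap(IEnumerable<(Vec3 Key, TValue Value)> entries)
    {
        _entries = entries.OrderBy(e => e.Key.Sum).ToArray();
        _sums = _entries.Select(e => e.Key.Sum).ToArray();
    }

    public bool TryGetValue(Vec3 query, out TValue value)
    {
        double low = query.Sum - 3 * Vec3.Tol;
        double high = query.Sum + 3 * Vec3.Tol;

        // Find the first index whose sum is >= low.
        int i = Array.BinarySearch(_sums, low);
        if (i < 0) i = ~i;                                   // not found: insertion point
        else while (i > 0 && _sums[i - 1] == low) i--;       // found: step back over duplicates

        // Scan only the candidates inside the window.
        for (; i < _sums.Length && _sums[i] <= high; i++)
        {
            if (_entries[i].Key.IsNear(query))
            {
                value = _entries[i].Value;
                return true;
            }
        }
        value = default;
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        var map = new FrozenVectorMap<string>(new[]
        {
            (new Vec3(0, 0, 0), "origin"),
            (new Vec3(1, 0, 0), "unit x"),
        });
        bool found = map.TryGetValue(new Vec3(1 + 0.5E-10, 0, 0), out var name);
        Console.WriteLine($"{found} {name}");
    }
}
```

Construction costs one O(n log n) sort, and each lookup costs O(log n) plus the number of keys whose sum falls inside the window. Unlike the tree-backed dictionary, this variant tolerates keys with identical sums, but it cannot accept insertions after it has been built.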
