具有元组键的字典比嵌套字典慢.为什么? [英] Dictionary with tuple key slower than nested dictionary. Why?

查看:56
本文介绍了具有元组键的字典比嵌套字典慢.为什么?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经测试了使用(int,int,字符串)元组作为键与使用嵌套词典:Dictionary >>进行相同操作时在字典中检索,更新和删除值的速度.

I've tested the speed of retrieving, updating and removing values in a dictionary using a (int, int, string) tuple as key versus the same thing with a nested Dictionary: Dictionary>>.

我的测试表明,元组字典要慢得多(检索58%,更新69%和删除200%).我没想到的是.嵌套字典需要进行更多查找,那么为什么元组字典要慢得多?

My tests show the tuple dictionary to be a lot slower (58% for retrieving, 69% for updating and 200% for removing). I did not expect that. The nested dictionary needs to do more lookups, so why is the tuple dictionary that much slower?

我的测试代码:

    public static object TupleDic_RemoveValue(object[] param)
    {
        var dic = param[0] as Dictionary<(int did, int eid, string name), string>;
        var keysToRetrieve = param[2] as List<(int did, int eid, string name)>;

        foreach (var key in keysToRetrieve)
        {
            dic.Remove(key);
        }

        return dic;

    }


    public static object NestedDic_RemoveValue(object[] param)
    {
        var dic = param[1] as Dictionary<int, Dictionary<int, Dictionary<string, string>>>;
        var keysToRetrieve = param[2] as List<(int did, int eid, string name)>;


        foreach (var key in keysToRetrieve)
        {
            if (dic.TryGetValue(key.did, out var elementMap) && elementMap.TryGetValue(key.eid, out var propertyMap))
                propertyMap.Remove(key.name);
        }

        return dic;

    }

有关测试的其他信息:该词典共包含10000个条目.键递增:([0-100],[0-100],属性[0-100]").在单个测试中,检索了100个键(字典中不存在10%的键),更新了100个值(对于其中10%是新的值)或删除了100个键(对于其中的10%不在字典中的键)和).检索,更新和删除是3个独立的测试.每个测试执行了1000次.我比较了平均执行时间和中值执行时间.

Extra info on the test: The dictionary contains a total of 10 000 entries. The keys are incrementing: ([0-100],[0-100],"Property[0-100]"). In a single test 100 keys are retrieved (for which 10% was not present in the dictionary), 100 values are updated (for which 10% are new) or 100 keys are removed (for which 10% were not in the dictionary to begin with). Retrieval, updating and removing were 3 separate tests. Each test was executed 1000 times. I compared both the mean and median execution time.

推荐答案

Dictionary 中的查找依赖于两点.第一个是项目的哈希码,用于将项目分离为存储桶.两个不同键可以具有相同哈希码,因此一旦找到潜在的匹配项,就会针对每个项目调用 Equals (具有该哈希码)),直到找到完全匹配的内容为止.

Lookups in a Dictionary rely on two things. The first is an item's hash code which is used to separate the items into buckets. Two different keys can have the same hash code, so once a potential match is found, Equals is called against each item (with that hash code) until an exact match is found.

ValueTuple 的哈希码实现(适用于arity-2 + *)将元组中每个项目的 Equal Comparer.Default< T> .GetHashCode 的结果传递给元组到内部方法 ValueTuple.CombineHashCodes ,该方法依次调用 System.Numerics.Hashing.HashHelpers.Combine .元组中的项目越多,对两个 Combine 方法的嵌套调用就越多.将其与普通的 int GetHashCode 进行比较,后者直接返回该值.

ValueTuple's hash code implementation (for arity-2+ *) passes the result of Equality Comparer.Default<T>.GetHashCode for each item In the tuple to an internal method ValueTuple.CombineHashCodes, which in turn calls System.Numerics.Hashing.HashHelpers.Combine. The more items in the tuple, the more nested calls to both of the Combine methods. Compare this to a normal int's GetHashCode which just returns the value directly.

我觉得您后面的示例会更快.正如评论中指出的那样,您还将剪切必要的数据以搜索较小的分区.每次查找都必须调用 GetHashCode ,并在找到可能的匹配项后,即 Equals .在我看来,在第一种情况下散列冲突的可能性更高,这意味着对 Equals 的调用更多(在这种情况下,这只是对 EqualityComparer< T>的调用.等于元组中每个项目的代码.

It makes sense to me that your latter example would be faster. As pointed out in the comments, you are also cutting the necessary data to search into smaller partitions. Each lookup has to call GetHashCode and upon finding a potential match, Equals. It seems to me that there's a higher chance for hash collision in the first scenario, which would mean more calls to Equals (which in this case is just a call to EqualityComparer<T>.Default.Equals for each item in the tuple).

最后,它取决于性能分析(更确切地说,是性能分析(正确地分析 –释放模式,调整调用,足够的迭代次数等))以及您的特定用例.

In the end it comes down to profiling (and rather, profiling properly--Release Mode, jitting the calls, enough iterations, etc.) as well as your particular use case.

如果性能确实在您的用例中很重要(例如,紧密循环中的查找),也许最好使用您自己的类型和哈希码/等于实现,而不是 ValueTuple .但同样,它归结为分析.

If performance really matters in your use case (lookups in a tight loop, for example), perhaps it would be better to use your own type and hash code/equals implementations rather than ValueTuples. But again, it comes down to profiling.

*请注意,一元数元组有一种特殊情况.

* Note that there is a special case for a 1-arity tuple.

HashHelpers.Combine

ValueTuple

Int32.GetHashCode

这篇关于具有元组键的字典比嵌套字典慢.为什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆