为什么我的字典不好用C#组合键执行? [英] Why is my dictionary performing poorly with composite keys in C#?

查看:124
本文介绍了为什么我的字典不好用C#组合键执行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个使用递归遍历树和更新项目的方法。



目前该方法需要很长时间来处理所有的项目,所以我就开始优化的东西。在这些事情是使用字典而不是执行的每个项目一个数据库查询。



这本字典的定义是

  System.Collections.Generic.Dictionary< EffectivePermissionKey,迈德特> 



键类型被定义为



 私人结构EffectivePermissionKey 
{
// http://blog.martindoms.com/2011/01/03/c-tip-override-equals-on-value -types换更好的性能/
公众覆盖布尔等于(对象aObject)
{
如果(aObject == NULL)
返回FALSE;
,否则
返回aObject是EffectivePermissionKey&放大器;&安培;等于((EffectivePermissionKey)aObject);
}

公共布尔等于(EffectivePermissionKey aObject)
{
返回this.ID == aObject.ID&放大器;&安培; this.OrchardUserID == aObject.OrchardUserID;
}

公共覆盖INT的GetHashCode()
{
// http://stackoverflow.com/a/32502294/3936440
返回选中( ID.GetHashCode()* 23 * 23±OrchardUserID.GetHashCode()* 23);
}

公众诠释身份证;
公众诠释OrchardUserID;
}

在该方法运行大约需要5000递归更新所有项目。



最初花了周围的没有字典。



与DB查询第一种方法通过使用字典替换 INT 键把22秒



现在,随着数据库查询换成使用上述和适当的 TryGetValue()调用它需要的97秒< - WAT



这是怎么回事?什么会导致这种大规模的性能下降?



修改



起初,似乎是一个散列冲突问题我的,所以我说在 EffectivePermissionKey.Equals断点()来验证此方法被调用,但它不叫,因此也没有哈希冲突我猜。



EDIT2



现在我很困惑。我以为等于()只获取调用的时候,哈希代码做的不可以匹配。打印出我的钥匙的哈希码和)在 TryGetValue(使用的密钥后我看到这些代码相匹配。然后我看了看源代码词典<> 键,有一个在 A线FindEntry(),看起来像这样

 如果(项[I] .hashCode ==&的hashCode功放;&安培; comparer.Equals(项[I] .KEY ,键))回报我; 

这意味着,在字典中的 GetHashCode()方法的每个项目键 等于()被调用,因为我处理字典中的所有项目作为项目数据库查询的结果而这些结果在那里反正字典方法之前处理。


解决方案

算了乡亲,对不起,把你的时间,我的做法是完全错误的。让我知道为什么



分解为简单的问题:

  A  - >递归1,DB查询与ID = 1 $ B $条b B节点A的许可 - >递归2,DB查询与ID = 2 
C节点B的许可 - >递归3,DB查询与ID = 3
D节点C的权限 - >递归4,数据库查询与ID = 4



正如你所看到的,一是DB节点D的许可每个树节点查询。



现在为优化这种有缺陷的方法:

 词典< INT ,PermissionData> MYMAP 

...

DB的所有权限查询并插入到MYMAP

...

A - >递归1,myMap.TryGetValue(1,出...)
B - >递归2,myMap.TryGetValue(2,出...)
C - >递归3,myMap.TryGetValue(3,出...)
D - >递归4,myMap.TryGetValue(4,出...)

您现在查询是看做一次,但每个节点上的 TryGetValue()调用。



在我的特定情况下,这实际上是慢作为执行单查询,因为




  • 字典包含尽可能多的按键作为存在的节点,因为每个节点都有一个DB权限项






  • 每个 TryGetValue()要求

    /结果


    1. 创建密钥实例(通过ID和用户ID)

    2. 调用 TryGetValue()

    3. 计算的关键实例的哈希值

    4. 调用等于()




这4个步骤是围绕5000次与执行5000简单的实体框架查询的执行( SELECT * FROM表WHERE ID = ... )。我不知道为什么,但查询速度较快这里,也许是编译器优化了一些东西。



无论如何,我重新设计了整个事情,现在我有一个外环通过用户ID和内部递归遍历女巫使用的字典用一个简单的INT键(节点ID)。它给我照明快速的结果。整个执行现在只需约16秒,有一些更多的调整和线程我得到它下降到1秒以内。任务完成了。


I have a method that uses recursion to traverse a tree and update the items.

Currently the method takes pretty long to process all the items, so i started optimizing things. Among those things is the use of a dictionary instead of executing a DB query for each item.

The dictionary is defined as

System.Collections.Generic.Dictionary<EffectivePermissionKey, MyData>

The key type is defined as

private struct EffectivePermissionKey
{
  // http://blog.martindoms.com/2011/01/03/c-tip-override-equals-on-value-types-for-better-performance/
  public override bool Equals(object aObject)
  {
    if (aObject == null)
      return false;
    else
      return aObject is EffectivePermissionKey && Equals((EffectivePermissionKey)aObject);
  }

  public bool Equals(EffectivePermissionKey aObject)
  {
    return this.ID == aObject.ID && this.OrchardUserID == aObject.OrchardUserID;
  }

  public override int GetHashCode()
  { 
    // http://stackoverflow.com/a/32502294/3936440
    return unchecked(ID.GetHashCode() * 23 * 23 + OrchardUserID.GetHashCode() * 23);
  }

  public int ID;
  public int OrchardUserID;
}

When the method runs it takes around 5000 recursions to update all items.

Initially it took around 100 seconds without the dictionary.

A first approach with DB queries replaced by the use of a dictionary with int keys took 22 seconds.

Now, with DB queries replaced by the use of the dictionary defined above and proper TryGetValue() calls it takes 97 seconds <- WAT.

What is going on here? What could cause this massive performance drop?

Edit

At first, it seemed like a hash collision issue to me, so i added a breakpoint in EffectivePermissionKey.Equals() to verify that this method is called but it's not called, therefore no hash collision i guess.

Edit2

Now i'm confused. I thought Equals() only gets called when the hash code does not match. After printing out the hash codes of my keys and the keys used in TryGetValue() i see that these codes match. Then i looked at the source code of Dictionary<> and there's a line in FindEntry() that looks like this:

if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;

This means that for each item key in the dictionary the GetHashCode() and Equals() gets called because i process all items in the dictionary as the items are the result of the DB query whereas these results where processed before the dictionary approach anyway.

解决方案

Nevermind folks, sorry for taking your time, my approach was completely wrong. Let me tell why.

The issue broken down for simplicity:

A -> recursion 1, DB query for permission of node A with ID = 1
  B -> recursion 2, DB query for permission of node B with ID = 2
  C -> recursion 3, DB query for permission of node C with ID = 3
    D -> recursion 4, DB query for permission of node D with ID = 4

As you can see, one DB query per tree node.

Now the flawed approach for optimizing this:

Dictionary<int, PermissionData> myMap

...

DB query of all permissions and insert into myMap

...

A -> recursion 1, myMap.TryGetValue(1, out ...)
  B -> recursion 2, myMap.TryGetValue(2, out ...)
  C -> recursion 3, myMap.TryGetValue(3, out ...)
    D -> recursion 4, myMap.TryGetValue(4, out ...)

You see now that the query is done once but on each node aTryGetValue() call is made.

In my specific case this is actually slower as executing single queries because

  • the dictionary contains as much keys as nodes exist because each node has a DB permission entry

and

  • each TryGetValue() requires / results in

    1. creating a key instance (with ID and user ID)
    2. calling TryGetValue()
    3. calculating the hash of the key instance
    4. calling Equals()

These 4 steps are executed around 5000 times versus executing 5000 simple entity framework queries (SELECT * FROM table WHERE ID = ...). I don't know why, but the queries are faster here, perhaps the compiler optimizes something away.

Anyway, i reworked the whole thing and now i have an outer loop over User IDs and an inner recursive traversal witch uses dictionaries with a simple int key (node ID). It gives me lighting fast results. The whole execution now takes around 16 seconds and with a few more tweaks and threading i got it down to under 1 second. Mission accomplished.

这篇关于为什么我的字典不好用C#组合键执行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆