GUID的高效数据结构 [英] Efficient data structure for GUIDs

查看:144
本文介绍了GUID的高效数据结构的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在搜索一个数据结构,使我能够快速(首先是O(1) - 快速)确定给定的GUID是否是GUID集合的成员。



我目前的方法是使用0作为值的TDictionary。



尽管这样做很快,但使用Hashmap似乎是一种浪费重新创建一个GUID,这是通过被认为是唯一的定义,并且具有不需要的Dictionary字典句柄值。



必须有一个更好的解决方案,但我找不到一个你可以吗?

解决方案

我想你是99%的方式。


$ b $哈希听起来像是正确的解决方案。利用GUID的特殊性质的明显方法是提供您自己的散列函数,该函数组合成一个32位整数,构成一个GUID的4 32位整数。我只是XOR 4个整数。



我假定你使用的是Genericics.Collections.TDictionary。您可以通过将自定义比较器传递给构造函数来提供自己的散列函数。我不会担心存储备用值,我不认为这会以可辨别的方式影响性能。



我相信你将GUID存储为128位整数,而不是字符串。



最后,对我来说,对于GUID的默认比较器可能确实已经这样做了哈希码生成。在进行任何更改之前,请先检查一下。



编辑



代码使用Bob Jenkins哈希应用于二进制数据。 XOR会更快,但是默认的哈希码看起来不像是性能瓶颈。



换句话说,我认为 TDictionary< TGUID,Integer> 将充分满足您的需求。


I am in search of a data structure which enables me to quickly (prefarably O(1)-quickly) determine if a given GUID is a member of a Collection of GUIDs or not.

My current approach is to use a TDictionary with 0 as values.

While this works quickly, it seems to be a waste to use a Hashmap to rehash a GUID, which is by defintion considered to be unique, and to have the Dictionary handle values which are unneeded.

There must be a better solution for this, but I can't find one. Can you?

解决方案

I think you are 99% of the way there.

Hashing sounds like the right solution. The obvious way to take advantage of the special nature of the GUID is to supply your own hash function which combines into a single 32 bit integer the 4 32 bit integers that make up a GUID. I'd just XOR the 4 integers.

I presume you are using Generics.Collections.TDictionary. You can supply your own hash function by passing a custom comparer to the constructor. I wouldn't worry about storing spare values, I don't think it will affect performance in a discernible way.

I trust that you are storing your GUIDs as 128 bit integers and not as strings.

Finally, it has occurred to me that the default comparer for a GUID might indeed already do the hash code generation this way. It's worth checking that out before making any changes.

EDIT

Default hash code uses Bob Jenkins hash applied to the binary data. An XOR would be faster, but the default hash code doesn't seem like it would be a performance bottleneck.

In other words, I think that TDictionary<TGUID,Integer> will serve your needs perfectly adequately.

这篇关于GUID的高效数据结构的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆