为 TEqualityComparer.Construct 编写哈希函数的规范方法是什么? [英] What is the canonical way to write a hasher function for TEqualityComparer.Construct?
问题描述
考虑以下记录:
TMyRecord = record
b: Boolean;
// 3 bytes of padding in here with default record alignment settings
i: Integer;
end;
我希望实现 IEqualityComparer
.为此,我想调用 TEqualityComparer
.这需要与 TEqualityComparison
一起提供,这对我来说没有问题.
I wish to implement IEqualityComparer<TMyRecord>
. In order to do so I want to call TEqualityComparer<TMyRecord>.Construct
. This needs to be supplied with a TEqualityComparison<TMyRecord>
which presents no problems to me.
然而,Construct
还需要一个 THasher
,我想知道实现它的规范方法.该函数需要具有以下形式:
However, Construct
also requires a THasher<TMyRecord>
and I would like to know the canonical method for implementing that. The function needs to have the following form:
function MyRecordHasher(const Value: TMyRecord): Integer;
begin
Result := ???
end;
我希望我需要在记录值的两个字段上调用 BobJenkinsHash
,然后将它们组合起来.这是正确的方法吗?我应该如何将它们结合起来?
I expect that I need to call BobJenkinsHash
on both fields of the record value and then combine them some how. Is this the right approach, and how should I combine them?
我不使用 TEqualityComparison<TMyRecord>.Default
的原因是它使用了 CompareMem
,因此由于记录的填充会不正确.
The reason I don't use TEqualityComparison<TMyRecord>.Default
is that it uses CompareMem
and so will be incorrect due to the record's padding.
推荐答案
有效Java (by Joshua Bloch) 关于覆盖 hashCode 的部分可能很有用.它展示了如何组合对象(或记录)的各个部分以有效地构建哈希码.
The Effective Java (by Joshua Bloch) section about overriding hashCode could be useful. It shows how the individual parts of the object (or record) can be combined to efficiently construct a hashCode.
一个好的散列函数往往会为不相等的情况产生不相等的散列码对象.这正是第三条规定的意思.哈希码合约.理想情况下,散列函数应该分发任何不等的实例的合理集合在所有可能的哈希值.实现这一理想可能极其困难.幸运的是,实现公平的近似并不难.这里是一个简单的食谱:
A good hash function tends to produce unequal hash codes for unequal objects. This is exactly what is meant by the third provision of the hashCode contract. Ideally, a hash function should distribute any reasonable collection of unequal instances uniformly across all possible hash values. Achieving this ideal can be extremely difficult. Luckily it is not too difficult to achieve a fair approximation. Here is a simple recipe:
- 在名为
result
的int
变量中存储一些常量非零值,比如 17. 对于对象中的每个重要字段
f
(即equals 方法考虑的每个字段),执行以下操作:
- Store some constant nonzero value, say 17, in an
int
variable calledresult
. For each significant field
f
in your object (each field taken into account by the equals method, that is), do the following:
一个.计算字段的 int
哈希码 c
:..... details省略 ....
a. Compute an int
hash code c
for the field: ..... details omitted ....
B.将步骤 a 中计算出的哈希码 c 合并为结果如下:result = 37*result + c;
b. Combine the hash code c computed in step a into
result as follows: result = 37*result + c;
返回result
.
写完 hashCode
方法后,问问自己相等的实例是否具有相等的哈希码.如果没有,请找出原因并解决问题.
When you are done writing the hashCode
method, ask yourself whether equal instances have equal hash codes. If not, figure out why
and fix the problem.
这可以翻译成Delphi代码如下:
This can be translated into Delphi code as follows:
{$IFOPT Q+}
{$DEFINE OverflowChecksEnabled}
{$Q-}
{$ENDIF}
function CombinedHash(const Values: array of Integer): Integer;
var
Value: Integer;
begin
Result := 17;
for Value in Values do begin
Result := Result*37 + Value;
end;
end;
{$IFDEF OverflowChecksEnabled}
{$Q+}
{$ENDIF}
这将允许 MyRecordHasher
的实现:
function MyRecordHasher(const Value: TMyRecord): Integer;
begin
Result := CombinedHash([IfThen(Value.b, 0, 1), Value.i]);
end;
这篇关于为 TEqualityComparer.Construct 编写哈希函数的规范方法是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!