Why does `sxhash` return a constant for all structs?


Problem description

When I call the Common Lisp sxhash function on structs, I get the same value for every struct (in SBCL, for every struct of the same type). For instance, the following code prints two lists of integers, and within each list every integer has the same value.

(progn 
  (defstruct foo 
    data)
  (print (mapcar #'sxhash (loop for i below 10 collect (make-foo :data i))))
  (defstruct bar 
    data)
  (print (mapcar #'sxhash (loop for i below 10 collect (make-bar :data i)))))

;;; Allegro
(319 319 319 319 319 319 319 319 319 319)
(319 319 319 319 319 319 319 319 319 319)
;;; SBCL
(22591133455133788 22591133455133788 22591133455133788 22591133455133788
 22591133455133788 22591133455133788 22591133455133788 22591133455133788
 22591133455133788 22591133455133788)
(21321591953876048 21321591953876048 21321591953876048 21321591953876048
 21321591953876048 21321591953876048 21321591953876048 21321591953876048
 21321591953876048 21321591953876048)

I've tried this in both Allegro Lisp and SBCL, and both return a constant for all structs (one constant per struct type in SBCL), although the constants differ between the two implementations. The Hyperspec page for sxhash contains the following statements:

  1. For any two objects, x and y, both of which are bit vectors, characters, conses, numbers, pathnames, strings, or symbols, and which are similar, (sxhash x) and (sxhash y) yield the same mathematical value even if x and y exist in different Lisp images of the same implementation. See Section 3.2.4 (Literal Objects in Compiled Files).

The hash-code for an object is always the same within a single session provided that the object is not visibly modified with regard to the equivalence test equal. See Section 18.1.2 (Modifying Hash Table Keys).

The latter statement does not specify, but seems to imply, that it would be sensible that two structs which are not equal will have differing hash codes (modulo collision). However, structs are suspiciously absent from the list in the first paragraph. At first I chalked this up to a bug in Allegro Lisp but now that I see it in two different implementations I think there must be something about the spec I don't understand.
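The same Hyperspec page also requires that (equal x y) implies (= (sxhash x) (sxhash y)). A minimal sketch of why a constant hash for structs does not violate this: equal compares structures by identity, so two separately constructed structs are never equal and no constraint ties their hash codes together. The struct name item is hypothetical and used only for illustration.

```lisp
;; Hypothetical struct, used only for this illustration.
(defstruct item data)

(list
 ;; EQUAL strings must hash to the same value.
 (= (sxhash "abc") (sxhash (copy-seq "abc")))
 ;; EQUAL lists must hash to the same value.
 (= (sxhash (list 1 2)) (sxhash (list 1 2)))
 ;; EQUAL falls back to EQ for structures, so two fresh instances
 ;; with identical slot values are not EQUAL.
 (equal (make-item :data 1) (make-item :data 1)))
;; => (T T NIL)
```

Nothing in the specification says that objects which are not equal must hash differently, so a constant is a conforming, if slow, answer.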

Recommended answer

I've queried Franz support and this was their response. Presumably SBCL is doing something similar for similar reasons.

The function cl:sxhash always returns the same value for structure objects. The reason for this is because it has no extra space to store a unique hash code within it. As a result, using structures as keys is very inefficient. The excl::hash-table-stats function demonstrates this when given a hash-table with structs used as keys; the histogram becomes the worst case, because every key wants the same index.

The decision was made to keep the same behavior for structure objects, because the automatic inclusion of a hashing slot in all structure objects would have made all structs an average of one word longer. For small structs this is unacceptable for many of our users.

Instead, a user may define a struct with an extra slot, and the constructor for that struct type could store a unique value into that slot (either a random value or a value gotten by incrementing a counter each time the constructor is run). Also, create a hash generating function which accesses this hash-slot to generate its value. If the structs to be hashed are buried inside a list, then this hash function would need to know how to traverse these keys to obtain a unique value. Finally, then, build your hash-table using the documented :hash-function argument to make-hash-table (still using the equal test argument), to create a hash-table which will be well-distributed.
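The approach Franz describes might look roughly like the sketch below. All names here (*node-counter*, node, node-hash, *table*) are hypothetical. The :hash-function keyword is an implementation extension rather than standard Common Lisp; the sketch assumes Allegro CL and SBCL accept it with a symbol naming a one-argument function, and it guards the keyword with a reader conditional so the code still loads elsewhere (falling back to the default hash). It does not handle structs buried inside lists, which would need the traversal mentioned in the response.

```lisp
;; Hypothetical names throughout; a sketch, not Franz's implementation.
(defvar *node-counter* 0
  "Source of per-instance ids. Not thread-safe as written.")

(defstruct node
  ;; The initform is evaluated on every MAKE-NODE call,
  ;; so each instance receives a fresh id.
  (id (incf *node-counter*) :read-only t)
  data)

(defun node-hash (key)
  "Hash a NODE by its id; fall back to SXHASH for any other key."
  (if (node-p key)
      (sxhash (node-id key))
      (sxhash key)))

;; :hash-function is an extension (assumed here for Allegro CL and SBCL),
;; so it is only passed on those implementations.
(defvar *table*
  (make-hash-table :test 'equal
                   #+(or allegro sbcl) :hash-function
                   #+(or allegro sbcl) 'node-hash))

(let ((a (make-node :data 1))
      (b (make-node :data 1)))
  (setf (gethash a *table*) :first
        (gethash b *table*) :second)
  ;; The two keys are distinct under EQUAL, so both entries survive.
  (list (gethash a *table*) (gethash b *table*)))
;; => (:FIRST :SECOND)
```

The lookup results are the same with or without the custom hash function; what changes is how evenly the keys spread across the table.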

Alternatively, and if you can guarantee that none of the slots in your structures will be changed after they are used as keys in the hash-table, you can use the equalp test function in your make-hash-table call, rather than equal. If you do, however, make sure that these struct objects don't change, because then they may not be found in the hash-table.
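A minimal sketch of the equalp alternative, using a hypothetical point struct. Because equalp compares structures slot by slot, a freshly built struct with the same slot values finds the existing entry. How well such keys are distributed across the table is implementation-dependent and not shown here.

```lisp
;; Hypothetical struct and table names, for illustration only.
(defstruct point x y)

(defvar *points* (make-hash-table :test 'equalp))

(setf (gethash (make-point :x 1 :y 2) *points*) :found)

;; A distinct instance with EQUALP slot values reaches the same entry.
(gethash (make-point :x 1 :y 2) *points*)
```

Note the semantic change from the equal table: keys are now matched by content rather than identity, which is why mutating a key after insertion can make its entry unreachable.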
