Will similar Strings in HashMap lead to increased chance of collisions?
Consider the following:
HashMap<String, Object> hm = new HashMap<>();
final String prefix = "My objects ";
int counter = 0;
void put(Object value) {
    hm.put(prefix + (counter++), value);
}
Given that the key of each entry starts with the same string and only differs by a number appended to it, is this likely to lead to more collisions? I'm trying to decide whether this way of creating unique keys is a good idea from a performance perspective.
No, it will not. That is not necessarily because of String#hashCode, but because a HashMap re-hashes whatever your hash code is by XOR-ing its upper 16 bits into its lower 16 bits.
// this is re-hashing that is done internally
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
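To see how keys of the form prefix + counter actually spread out, the re-hash can be reproduced outside the map. The following is a minimal sketch, not HashMap's own code: the class name SpreadDemo and the helpers spread, bucketIndex and occupancy are hypothetical names introduced for illustration, and the table size of 2048 is an assumption about what a default HashMap would have grown to after 1000 insertions. The index computation (hash & (tableSize - 1)) mirrors how HashMap selects a bucket in a power-of-two table.

```java
import java.util.Arrays;

class SpreadDemo {
    // Same bit-spreading step as the hash() method quoted above.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index in a table whose length is a power of two.
    static int bucketIndex(Object key, int tableSize) {
        return spread(key) & (tableSize - 1);
    }

    // Number of keys per bucket for prefix + 0 .. prefix + (n - 1).
    static int[] occupancy(String prefix, int n, int tableSize) {
        int[] counts = new int[tableSize];
        for (int i = 0; i < n; i++) {
            counts[bucketIndex(prefix + i, tableSize)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed sizes: 1000 keys in a 2048-bucket table.
        int[] counts = occupancy("My objects ", 1000, 2048);
        int longest = Arrays.stream(counts).max().getAsInt();
        long used = Arrays.stream(counts).filter(c -> c > 0).count();
        System.out.println("buckets used: " + used + ", longest chain: " + longest);
    }
}
```

Keys that differ only in their last character have String hash codes that differ by exactly the difference between those characters, so "My objects 0" through "My objects 9" produce ten consecutive hash codes and, in a 16-bucket table, end up in ten different buckets. Running main with larger counts is a quick way to check the distribution for your own prefix.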
But even if it did increase collisions, you would probably never notice.
For small buckets (bins), where entries are placed one after another in a linked list, equals is called on each candidate to find the actual entry you care about.
If a bin grows past a certain threshold (in OpenJDK 8 and later, 8 entries in one bin once the table has at least 64 buckets), it is transformed into a balanced tree of nodes (a red-black tree). The search time in such a tree is O(log n).
Even if different entries report the same hash code after the re-hash, the map still has to decide which entry is bigger in order to place it in the tree. It first tries to invoke Comparable#compareTo, if your keys implement Comparable. If they do not implement Comparable, or if compareTo also reports a tie, System.identityHashCode is called to break the tie.
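The worst case described above can be exercised directly. The sketch below uses a hypothetical key class, BadKey, whose hashCode always returns the same value, so every entry collides; the class names and the entry count of 10,000 are assumptions made for illustration only. Because BadKey implements Comparable, the map can order the colliding entries inside the treeified bin, and lookups still return the right values.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    // Hypothetical key type: every instance reports the same hash code,
    // so all entries end up in a single bucket.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // deliberately constant to force collisions
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Fills a map with n keys that all collide.
    static Map<BadKey, Integer> fill(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = fill(10_000);
        System.out.println(map.size());                // prints 10000
        System.out.println(map.get(new BadKey(1234))); // prints 1234
        System.out.println(map.get(new BadKey(-1)));   // prints null
    }
}
```

Correctness never depends on the hash distribution; only lookup cost does. String keys, as in the question, also implement Comparable, so they get the same ordered-tree treatment if a bin ever grows large.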
So, from a performance perspective, because of all these internal mechanisms, your average search time in the Map will be O(1).