Will similar Strings in HashMap lead to increased chance of collisions?


Problem description


Consider the following:

HashMap<String, Object> hm = new HashMap<>();
final String prefix = "My objects ";
int counter = 0;

void put(Object value) {
    hm.put(prefix+(counter++), value);
}

Given that the key of each entry starts with the same string and only differs by a number appended to it, is this likely to lead to more collisions? I'm trying to decide whether this way of creating unique keys is a good idea from a performance perspective.

Solution

No, it will not. That is not necessarily because of String#hashCode, but because a HashMap re-hashes whatever your hash code is by XOR-ing its upper 16 bits into its lower 16 bits.

// this is re-hashing that is done internally
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
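
To see why this mixing step matters, here is a minimal sketch. The class and method names (SpreadDemo, spread, indexFor) are made up for illustration; the index formula (tableLength - 1) & hash is the one the OpenJDK HashMap uses for its power-of-two tables. The two sample hash codes differ only above bit 15, so without the XOR they would land in the same bucket of a 16-slot table.

```java
final class SpreadDemo {
    // Same bit-mixing step as the hash(Object) method quoted above.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a power-of-two table length (16 is HashMap's default).
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        int a = 0x10000; // differs from b only in the upper 16 bits
        int b = 0x20000;

        // Without mixing, only the low 4 bits are used: prints "0 0".
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16));

        // With mixing, the upper bits take part: prints "1 2".
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16));
    }
}
```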

But even if it did increase collisions, you might never notice. In small bins (buckets), where entries are stored one after another in a linked list, equals is called to find the actual entry you care about.
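
As a small illustration of that point, the strings "Aa" and "BB" have the same String#hashCode (both 2112), so they always share a bin, yet the map still tells them apart through equals. The class name CollidingStringsDemo and the stored values are placeholders for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingStringsDemo {
    // Builds a map whose two keys have identical hash codes.
    static Map<String, Object> build() {
        Map<String, Object> hm = new HashMap<>();
        hm.put("Aa", "first");
        hm.put("BB", "second");
        return hm;
    }

    public static void main(String[] args) {
        // 'A'*31 + 'a' == 'B'*31 + 'B' == 2112
        System.out.println("Aa".hashCode() == "BB".hashCode()); // prints "true"

        Map<String, Object> hm = build();
        // Both entries live in the same bin; equals picks the right one.
        System.out.println(hm.get("Aa")); // prints "first"
        System.out.println(hm.get("BB")); // prints "second"
    }
}
```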

Once a bin grows past a certain threshold (8 entries in the OpenJDK implementation since Java 8, provided the table has at least 64 buckets), it is converted into a balanced tree (a red-black tree) of tree nodes. The search time in such a tree is O(log n).

Even if different keys report the same hash code after the re-hash, the tree still has to decide which entry is bigger in case of a tie.

It will then try to invoke Comparable#compareTo if your keys implement Comparable (String keys do). If they do not implement Comparable, System.identityHashCode is called to break the tie.
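
The worst case can be sketched with a deliberately bad key type. BadKey below is hypothetical: every instance returns the same hash code, so all entries end up in one bin, and it implements Comparable so that a treeified bin can be ordered with compareTo. Lookups still return the correct values.

```java
import java.util.HashMap;
import java.util.Map;

final class TreeBinDemo {
    // Hypothetical key type: every instance reports the same hash code.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Used to order entries once the bin has been turned into a tree.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Puts n keys that all collide into a single map.
    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = build(10_000);
        System.out.println(map.size());                // prints "10000"
        System.out.println(map.get(new BadKey(1234))); // prints "1234"
    }
}
```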

So from a performance perspective, because of all these internal mechanisms, your average search time in the map will still be O(1).
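
If you would rather measure than take this on trust, a rough sketch like the following counts how the question's keys spread over the buckets. The class name PrefixKeyBuckets, the key count of 10,000 and the table length of 16,384 are assumptions for illustration (16,384 is the table size a default HashMap should have grown to for that many entries at load factor 0.75); the numbers it prints depend on your keys, so none are shown here.

```java
final class PrefixKeyBuckets {
    // Same bit-mixing step as HashMap's internal hash(Object).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Counts how many of the keys prefix+0 .. prefix+(n-1) fall into each
    // bucket of a table with the given power-of-two length.
    static int[] occupancy(String prefix, int n, int tableLength) {
        int[] counts = new int[tableLength];
        for (int i = 0; i < n; i++) {
            int h = spread((prefix + i).hashCode());
            counts[(tableLength - 1) & h]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = occupancy("My objects ", 10_000, 16_384);
        int max = 0;
        int used = 0;
        for (int c : counts) {
            max = Math.max(max, c);
            if (c > 0) used++;
        }
        System.out.println("buckets used: " + used);
        System.out.println("largest bucket: " + max);
    }
}
```

A largest bucket of only a few entries would confirm that the shared prefix does not hurt in practice.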
