更快的哈希函数 [英] A faster hash function

查看:223
本文介绍了更快的哈希函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试实现自己的哈希函数,我使用java将每个字符串的ASCII编号相加。我通过查找哈希表大小和总和来找到哈希码。大小%之和。我想知道在搜索字符串时是否有办法使用相同的流程但减少冲突?

I'm trying to implement my own hash function, i add up the ASCII numbers of each string, using java. I find the hash code by finding the mod of the size of the hash table and the sum. size%sum. I was wondering if there was a way to use the same process but reduce collisions, when searching for the string?

提前致谢。

推荐答案

我会查看String和HashMap的代码,因为它们具有较低的冲突率并且不使用并处理负数。

I would look at the code for String and HashMap as these have a low collision rate and don't use % and handle negative numbers.

来自String的来源

From the source for String

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

来自HashMap的来源

From the source for HashMap

/**
 * Retrieve object hash code and applies a supplemental hash function to the
 * result hash, which defends against poor quality hash functions.  This is
 * critical because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
final int hash(Object k) {
    int h = 0;
    if (useAltHashing) {
        if (k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
        h = hashSeed;
    }

    h ^= k.hashCode();

    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

由于HashMap总是2的幂,你可以使用

As the HashMap is always a power of 2 in size you can use

        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);

/**
 * Returns index for hash code h.
 */
static int indexFor(int h, int length) {
    return h & (length-1);
}

使用& 快得多,只有当长度为正时才返回正数。

Using & is much faster than % and only return positive numbers as length is positive.

这篇关于更快的哈希函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆