预期的哈希冲突次数 [英] Expected number of hash collisions

查看:21
本文介绍了预期的哈希冲突次数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我觉得我对这个问题想得太多了,但无论如何……

I feel like I'm way overthinking this problem, but here goes anyway...

我有一个哈希表,它的内部数组中有 M 个槽.我需要在哈希表中插入 N 个元素.假设我有一个散列函数,该函数将 am 元素随机插入一个插槽中,每个插槽的概率相等,那么散列冲突总数的预期值是多少?

I have a hash table with M slots in its internal array. I need to insert N elements into the hash table. Assuming that I have a hash function that randomly inserts am element into a slot with equal probability for each slot, what's the expected value of the total number of hash collisions?

(抱歉,这与其说是编程问题,不如说是数学问题).

(Sorry that this is more of a math question than a programming question).

这是我必须使用 Python 模拟它的一些代码.我得到了数字答案,但无法将其概括为公式并进行解释.

Here's some code I have to simulate it using Python. I'm getting numerical answers, but having trouble generalizing it to a formula and explaining it.

import random
import pdb

N = 5
M = 8

NUM_ITER = 100000

def get_collisions(table):
    col = 0
    for item in table:
        if item > 1:
            col += (item-1)
    return col

def run():
    table = [0 for x in range(M)]

    for i in range(N):
        table[int(random.random() * M)] += 1

    #print table
    return get_collisions(table)

# Main

total = 0
for i in range(NUM_ITER):
    total += run()

print float(total)/NUM_ITER

推荐答案

你会在这里找到答案:Quora.com.m 个桶和 n 个插入的预期碰撞次数是

You'll find the answer here: Quora.com. The expected number of collisions for m buckets and n inserts is

n - m * (1 - ((m-1)/m)^n).

这篇关于预期的哈希冲突次数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆