从链接哈希表中有效地挑选一个随机元素? [英] Efficiently picking a random element from a chained hash table?

查看:252
本文介绍了从链接哈希表中有效地挑选一个随机元素?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

只是为了练习(而不是作业作业),我一直在努力解决这个问题(CLRS,第3版,练习11.2-6):

Just for practice (and not as a homework assignment) I have been trying to solve this problem (CLRS, 3rd edition, exercise 11.2-6):


假设我们已经将n个密钥存储在大小为m的哈希表中,通过链接解决
冲突,并且我们知道每个
链的长度,包括长度L最长链。描述一个
程序,从散列表中的
中随机均匀地选择一个密钥,并在预期时间O(L *(1 + m / n))中返回它。

Suppose we have stored n keys in a hash table of size m, with collisions resolved by chaining, and that we know the length of each chain, including the length L of the longest chain. Describe a procedure that selects a key uniformly at random from among the keys in the hash table and returns it in expected time O(L * (1 + m/n)).

到目前为止我所想到的是每个键的返回概率是1 / n。如果我们尝试获取1到n之间的随机值x,并尝试按顺序首先按桶排序,然后沿着桶中的链排列找到第x个密钥,那么将通过进行O(m)找到正确的桶通过桶逐个和O(L)时间获得链中的正确键。

What I thought so far is that the probability of each key being returned is 1/n. If we try to get a random value x between 1 to n, and try to find the xth key in sequence first sorted by bucket then along the chain in the bucket, then it will take O(m) to find the right bucket by going through buckets one by one and O(L) time to get the right key in chain.

推荐答案

重复以下步骤,直到元素被返回:

Repeat the following steps until an element is returned:


  • 随机选择一个存储桶。让 k 是桶中元素的数量。

  • 选择 p 均匀地从 1 ... L 。如果 p <= k 然后返回桶中的 p 元素。

  • Randomly select a bucket. Let k be the number of elements in the bucket.
  • Select p uniformly at random from 1 ... L. If p <= k then return the pth element in the bucket.

应该清楚的是,该过程会随机返回一个元素。我们对排列在2D数组中的元素应用拒绝抽样。

It should be clear that the procedure returns an element uniformly at random. We are sort of applying rejection sampling to the elements placed in a 2D array.

每个桶的预期元素数是 n / m 。抽样尝试成功的概率是(n / m)/ L 。因此,找到桶的尝试次数是 L * m / n 。与从桶中检索元素的 O(L)成本一起,预期运行时间为 O(L *(1 + m / n ))

The expected number of elements per bucket is n / m. The probability that the sampling attempt succeeds is (n / m) / L. The expected number of attempts needed to find a bucket is therefore L * m / n. Together with the O(L) cost of retrieving the element from the bucket, the expected running time is O(L * (1 + m / n)).

这篇关于从链接哈希表中有效地挑选一个随机元素?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆