Data structure to achieve random delete and insert where elements are weighted in [a,b]

Problem Description

I would like to design a data structure and algorithm such that, given an array of elements where each element has a weight in the interval [a, b], I can achieve constant-time insertion and deletion. The deletion is performed randomly, where the probability of an element being deleted is proportional to its weight.

I do not believe there is a deterministic algorithm that can achieve both operations in constant time, but I think there are randomized algorithms that can accomplish this?

Answer

I don't know if O(1) worst-case time is impossible; I don't see any particular reason it should be. But it's definitely possible to have a simple data structure which achieves O(1) expected time.

The idea is to store a dynamic array of pairs (or two parallel arrays), where each item is paired with its weight; insertion is done by appending in O(1) amortised time, and an element can be removed by index by swapping it with the last element so that it can be removed from the end of the array in O(1) time.

To sample a random element from the weighted distribution, choose a random index and generate a random number in the half-open interval [0, 2); if it is less than the element's weight, select the element at that index, otherwise repeat this process until an element is selected. The idea is that each index is equally likely to be chosen, and the probability it gets kept rather than rejected is proportional to its weight.
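Here is a minimal Python sketch of this structure (the class and method names are my own; it assumes, as the [0, 2) interval above implies, that every weight lies in [1, 2]):

```python
import random

class WeightedSampler:
    # Dynamic array of (element, weight) pairs supporting O(1) amortised
    # insertion, O(1) removal by index, and O(1) expected-time weighted
    # random deletion by rejection sampling, assuming weights in [1, 2].

    def __init__(self):
        self.items = []  # list of (element, weight) pairs

    def insert(self, element, weight):
        # Append to the end of the array: O(1) amortised.
        self.items.append((element, weight))

    def remove_at(self, index):
        # Swap with the last element, then pop from the end: O(1).
        self.items[index] = self.items[-1]
        self.items.pop()

    def delete_random(self):
        # Rejection sampling: pick an index uniformly, accept it with
        # probability weight / 2, then swap-remove the chosen element.
        while True:
            i = random.randrange(len(self.items))
            element, weight = self.items[i]
            if random.uniform(0.0, 2.0) < weight:
                self.remove_at(i)
                return element
```

Swapping with the last element is what makes removal O(1): nothing needs to be shifted, at the cost of not preserving insertion order.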

This is a Las Vegas algorithm, meaning it is expected to complete in a finite time, but with very low probability it can take arbitrarily long to complete. The number of iterations required to sample an element will be highest when every weight is exactly 1, in which case it follows a geometric distribution with parameter p = 1/2, so its expected value is 2, a constant which is independent of the number of elements in the data structure.
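As a quick sanity check (a hypothetical snippet, not part of the original answer), simulating the rejection loop with every weight equal to 1 shows the mean iteration count converging to 2:

```python
import random

def iterations_until_accept(weight, bound=2.0):
    # Count trials until a uniform draw from [0, bound) falls below weight.
    n = 1
    while random.uniform(0.0, bound) >= weight:
        n += 1
    return n

# Acceptance probability is 1/2 per trial, so the count is geometric
# with p = 1/2 and expected value 1/p = 2.
trials = [iterations_until_accept(1.0) for _ in range(100_000)]
print(sum(trials) / len(trials))  # close to 2.0
```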

In general, if all weights are in an interval [a, b] for real numbers 0 < a <= b, then the expected number of iterations is at most b/a. This is always a constant, but it is potentially a large constant (i.e. it takes many iterations to select a single sample) if the lower bound a is small relative to b.
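To generalise the sketch above from weights in [1, 2] to weights in [a, b], it suffices to propose over [0, b) instead of [0, 2); a sketch, assuming b is a known upper bound on the weights and items is a list of (element, weight) pairs as before:

```python
import random

def sample_index(items, b):
    # items: list of (element, weight) pairs with weights in [a, b].
    # Each trial accepts with probability weight / b, so the expected
    # number of iterations is b / (mean weight), which is at most b / a.
    while True:
        i = random.randrange(len(items))
        if random.uniform(0.0, b) < items[i][1]:
            return i
```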
