惰性洗牌算法 [英] Lazy Shuffle Algorithms

查看:32
本文介绍了惰性洗牌算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一大堆要以随机顺序迭代的元素.但是,我无法修改列表,也不想创建它的副本,因为 1) 它很大 2) 可以预期迭代会提前取消.

I have a large list of elements that I want to iterate in random order. However, I cannot modify the list and I don't want to create a copy of it either, because 1) it is large and 2) it can be expected that the iteration is cancelled early.

List<T> data = ...;
Iterator<T> shuffled = shuffle(data);
while (shuffled.hasNext()) {
    T t = shuffled.next();
    if (System.console().readLine("Do you want %s?", t).startsWith("y")) {
        return t;
    }
}
System.out.println("That's all");
return t;

我正在寻找一种算法,上面的代码可以在 O(n) 中运行(最好只需要 O(log n) 空间),所以缓存较早生成的元素不是一种选择.我不在乎算法是否有偏见(只要不明显).

I am looking for an algorithm were the code above would run in O(n) (and preferably require only O(log n)space), so caching the elements that were produced earlier is not an option. I don't care if the algorithm is biased (as long as it's not obvious).

(我在我的问题中使用了伪 Java,但如果您愿意,您也可以使用其他语言)

(I uses pseudo-Java in my question, but you can use other languages if you wish)

这是我目前得到的最好的.

Here is the best I got so far.

Iterator<T> shuffle(final List<T> data) {
    int p = data.size();
    while ((data.size() % p) == 0) p = randomPrime();
    return new Iterator<T>() {
        final int prime = p;
        int n = 0, i = 0;
        public boolean hasNext() { return i < data.size(); }
        public T next() {
            i++; n += prime;
            return data.get(n);
        }
    }
}

迭代O(n)中的所有元素,恒定空间,但显然有偏差,因为它只能产生data.size()排列.

Iterating all elements in O(n), constant space, but obviously biased as it can produce only data.size() permutations.

推荐答案

我所知道的处理索引的最简单的改组方法.如果 List 不是 ArrayList,如果您尝试使用以下之一(LinkedList确实有一个 get 通过 ID,但它是 O(n),所以你最终会得到 O(n^2) 时间.

The easiest shuffling approaches I know of work with indices. If the List is not an ArrayList, you may end up with a very inefficient algorithm if you try to use one of the below (a LinkedList does have a get by ID, but it's O(n), so you'll end up with O(n^2) time).

如果 O(n) 空间很好,我假设它不是,我推荐 Fisher-Yates/Knuth shuffle,时间复杂度为 O(n) 且易于实现.您可以对其进行优化,以便在获取第一个元素之前只需执行一次操作,但您需要随时跟踪修改后的列表的其余部分.

If O(n) space is fine, which I'm assuming it's not, I'd recommend the Fisher-Yates / Knuth shuffle, it's O(n) time and is easy to implement. You can optimise it so you only need to perform a single operation before being able to get the first element, but you'll need to keep track of the rest of the modified list as you go.

我的解决方案:

好的,所以这根本不是随机的,但如果你想要小于 O(n) 的空间,我看不出更好的方法.

Ok, so this is not very random at all, but I can't see a better way if you want less than O(n) space.

需要 O(1) 空间和 O(n) 时间.

It takes O(1) space and O(n) time.

可能有一种方法可以稍微提高空间使用率并获得更多随机结果,但我还没有弄清楚.

There may be a way to push it up the space usage a little and get more random results, but I haven't figured that out yet.

它与相对素数有关.这个想法是,给定 2 个相对素数 a(生成器)和 b,当你遍历 a % b, 2a% b, 3a % b, 4a % b, ..., 你会看到每一个整数 0, 1, 2, ..., b-2,b-1,这也会在看到任何整数两次之前发生.不幸的是,我没有证明的链接(维基百科链接可能会提到或暗示它,我没有检查太多细节).

It has to do with relative primes. The idea is that, given 2 relative primes a (the generator) and b, when you loop through a % b, 2a % b, 3a % b, 4a % b, ..., you will see every integer 0, 1, 2, ..., b-2, b-1, and this will also happen before seeing any integer twice. Unfortunately I don't have a link to a proof (the wikipedia link may mention or imply it, I didn't check in too much detail).

我开始增加长度直到我们得到一个质数,因为这意味着任何其他数字都将是一个相对质数,这是一个更少的限制(并且跳过任何大于原始长度的数字),然后生成一个随机数,并将其用作生成器.

I start off by increasing the length until we get a prime, since this implies that any other number will be a relative prime, which is a whole lot less limiting (and just skip any number greater than the original length), then generate a random number, and use this as the generator.

我正在迭代并打印出所有值,但是在给定当前值的情况下,修改以生成下一个值应该很容易.

I'm iterating through and printing out all the values, but it should be easy enough to modify to generate the next one given the current one.

注意我用我的 nextInt 跳过了 1len-1,因为它们会产生 1,2,3,......,3,2,1 分别,但您可以包含这些,但如果长度低于某个阈值,则可能不包含.

Note I'm skipping 1 and len-1 with my nextInt, since these will produce 1,2,3,... and ...,3,2,1 respectively, but you can include these, but probably not if the length is below a certain threshold.

您可能还想生成一个随机数,以将生成器乘以 (mod the length) 开始.

You may also want to generate a random number to multiply the generator by (mod the length) to start from.

Java 代码:

static Random gen = new Random();

static void printShuffle(int len)
{
  // get first prime >= len
  int newLen = len-1;
  boolean prime;
  do
  {
     newLen++;
     // prime check
     prime = true;
     for (int i = 2; prime && i < len; i++)
        prime &= (newLen % i != 0);
  }
  while (!prime);

  long val = gen.nextInt(len-3) + 2;
  long oldVal = val;
  do
  {
     if (val < len)
        System.out.println(val);
     val = (val + oldVal) % newLen;
  }
  while (oldVal != val);
}

这篇关于惰性洗牌算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆