生成没有相邻相等元素的列表的所有排列 [英] Generate all permutations of a list without adjacent equal elements

查看:28
本文介绍了生成没有相邻相等元素的列表的所有排列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当我们对一个列表进行排序时,比如

When we sort a list, like

a = [1,2,3,3,2,2,1]
sorted(a) => [1, 1, 2, 2, 2, 3, 3]

相等的元素在结果列表中总是相邻的.

equal elements are always adjacent in the resulting list.

我怎样才能实现相反的任务 - 对列表进行洗牌,以便相等的元素永远不会(或尽可能少)相邻?

How can I achieve the opposite task - shuffle the list so that equal elements are never (or as seldom as possible) adjacent?

例如,对于上面的列表,可能的解决方案之一是

For example, for the above list one of the possible solutions is

p = [1,3,2,3,2,1,2]

更正式地,给定一个列表a,生成它的排列p,以最小化对数p[i]==p[i+1].

More formally, given a list a, generate a permutation p of it that minimizes the number of pairs p[i]==p[i+1].

由于列表很大,因此无法生成和过滤所有排列.

Since the lists are large, generating and filtering all permutations is not an option.

额外问题:如何有效地生成所有这些排列?

Bonus question: how to generate all such permutations efficiently?

这是我用来测试解决方案的代码:https://gist.github.com/gebrkn/9f550094b3d24a35aebd

This is the code I'm using to test the solutions: https://gist.github.com/gebrkn/9f550094b3d24a35aebd

UPD:在这里选择获胜者是一个艰难的选择,因为很多人都发布了很好的答案.@VincentvanderWeele@David Eisenstat@Coady@enrico.bacis@srgerg 提供了可以完美生成最佳排列的函数.@tobias_k 和 David 也回答了奖金问题(生成所有排列).对大卫的正确性证明的补充.

UPD: Choosing a winner here was a tough choice, because many people posted excellent answers. @VincentvanderWeele, @David Eisenstat, @Coady, @enrico.bacis and @srgerg provided functions that generate the best possible permutation flawlessly. @tobias_k and David also answered the bonus question (generate all permutations). Additional points to David for the correctness proof.

@VincentvanderWeele 的代码似乎是最快的.

The code from @VincentvanderWeele appears to be the fastest.

推荐答案

这与 Thijser 当前不完整的伪代码一致.这个想法是采用其余项目类型中最常见的,除非它刚刚被采用.(另请参阅Coady 对该算法的实现.)

This is along the lines of Thijser's currently incomplete pseudocode. The idea is to take the most frequent of the remaining item types unless it was just taken. (See also Coady's implementation of this algorithm.)

import collections
import heapq


class Sentinel:
    pass


def david_eisenstat(lst):
    counts = collections.Counter(lst)
    heap = [(-count, key) for key, count in counts.items()]
    heapq.heapify(heap)
    output = []
    last = Sentinel()
    while heap:
        minuscount1, key1 = heapq.heappop(heap)
        if key1 != last or not heap:
            last = key1
            minuscount1 += 1
        else:
            minuscount2, key2 = heapq.heappop(heap)
            last = key2
            minuscount2 += 1
            if minuscount2 != 0:
                heapq.heappush(heap, (minuscount2, key2))
        output.append(last)
        if minuscount1 != 0:
            heapq.heappush(heap, (minuscount1, key1))
    return output

正确性证明

对于两个项目类型,计数为 k1 和 k2,如果 k1 k2,则为 k1 - k2 - 1 个缺陷.= 的情况很明显.其他是对称的;少数元素的每个实例最多防止总共 k1 + k2 - 1 个可能的缺陷中的两个.

Proof of correctness

For two item types, with counts k1 and k2, the optimal solution has k2 - k1 - 1 defects if k1 < k2, 0 defects if k1 = k2, and k1 - k2 - 1 defects if k1 > k2. The = case is obvious. The others are symmetric; each instance of the minority element prevents at most two defects out of a total of k1 + k2 - 1 possible.

这个贪心算法通过以下逻辑返回最优解.如果前缀(部分解决方案)扩展为最佳解决方案,我们称它为安全.显然,空前缀是安全的,如果安全前缀是一个完整的解决方案,那么该解决方案是最佳的.只需归纳表明,每一个贪婪的步骤都保持安全.

This greedy algorithm returns optimal solutions, by the following logic. We call a prefix (partial solution) safe if it extends to an optimal solution. Clearly the empty prefix is safe, and if a safe prefix is a whole solution then that solution is optimal. It suffices to show inductively that each greedy step maintains safety.

贪婪步骤引入缺陷的唯一方法是如果只剩下一种项目类型,在这种情况下只有一种方法可以继续,并且这种方法是安全的.否则,让 P 成为所考虑步骤之前的(安全)前缀,让 P' 成为紧随其后的前缀,让 S 成为扩展 P 的最佳解决方案.如果 S 也扩展了 P',那么我们就完成了.否则,令 P' = Px and S = PQ and Q = yQ',其中 x 和 y 是项,Q 和 Q' 是序列.

The only way that a greedy step introduces a defect is if only one item type remains, in which case there is only one way to continue, and that way is safe. Otherwise, let P be the (safe) prefix just before the step under consideration, let P' be the prefix just after, and let S be an optimal solution extending P. If S extends P' also, then we're done. Otherwise, let P' = Px and S = PQ and Q = yQ', where x and y are items and Q and Q' are sequences.

首先假设P不以y结尾.根据算法的选择,x 在 Q 中的频率至少与 y 一样.考虑仅包含 x 和 y 的 Q 的最大子串.如果第一个子串的 x 至少与 y 一样多,则可以重写它而不会引入以 x 开头的额外缺陷.如果第一个子串的 y 比 x 多,那么其他一些子串的 x 比 y 多,我们可以重写这些子串而没有额外的缺陷,以便 x 先行.在这两种情况下,我们都会根据需要找到扩展 P' 的最优解 T.

Suppose first that P does not end with y. By the algorithm's choice, x is at least as frequent in Q as y. Consider the maximal substrings of Q containing only x and y. If the first substring has at least as many x's as y's, then it can be rewritten without introducing additional defects to begin with x. If the first substring has more y's than x's, then some other substring has more x's than y's, and we can rewrite these substrings without additional defects so that x goes first. In both cases, we find an optimal solution T that extends P', as needed.

现在假设 P 确实以 y 结尾.通过将第一个出现的 x 移到前面来修改 Q.这样做时,我们最多引入一个缺陷(x 过去所在的位置)并消除一个缺陷(yy).

Suppose now that P does end with y. Modify Q by moving the first occurrence of x to the front. In doing so, we introduce at most one defect (where x used to be) and eliminate one defect (the yy).

这是tobias_k 的答案加上有效的测试,以检测当前正在考虑的选择何时以某种方式全局受限.渐近运行时间是最优的,因为生成的开销在输出长度的数量级上.不幸的是,最坏情况的延迟是二次的;可以使用更好的数据结构将其简化为线性(最佳).

This is tobias_k's answer plus efficient tests to detect when the choice currently under consideration is globally constrained in some way. The asymptotic running time is optimal, since the overhead of generation is on the order of the length of the output. The worst-case delay unfortunately is quadratic; it could be reduced to linear (optimal) with better data structures.

from collections import Counter
from itertools import permutations
from operator import itemgetter
from random import randrange


def get_mode(count):
    return max(count.items(), key=itemgetter(1))[0]


def enum2(prefix, x, count, total, mode):
    prefix.append(x)
    count_x = count[x]
    if count_x == 1:
        del count[x]
    else:
        count[x] = count_x - 1
    yield from enum1(prefix, count, total - 1, mode)
    count[x] = count_x
    del prefix[-1]


def enum1(prefix, count, total, mode):
    if total == 0:
        yield tuple(prefix)
        return
    if count[mode] * 2 - 1 >= total and [mode] != prefix[-1:]:
        yield from enum2(prefix, mode, count, total, mode)
    else:
        defect_okay = not prefix or count[prefix[-1]] * 2 > total
        mode = get_mode(count)
        for x in list(count.keys()):
            if defect_okay or [x] != prefix[-1:]:
                yield from enum2(prefix, x, count, total, mode)


def enum(seq):
    count = Counter(seq)
    if count:
        yield from enum1([], count, sum(count.values()), get_mode(count))
    else:
        yield ()


def defects(lst):
    return sum(lst[i - 1] == lst[i] for i in range(1, len(lst)))


def test(lst):
    perms = set(permutations(lst))
    opt = min(map(defects, perms))
    slow = {perm for perm in perms if defects(perm) == opt}
    fast = set(enum(lst))
    print(lst, fast, slow)
    assert slow == fast


for r in range(10000):
    test([randrange(3) for i in range(randrange(6))])

这篇关于生成没有相邻相等元素的列表的所有排列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆