Python:对依赖项列表进行排序 [英] Python: sorting a dependency list

查看:130
本文介绍了Python:对依赖项列表进行排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果使用内置的sorted()函数可以解决我的问题,或者我需要自己解决问题,我正在尝试解决-使用cmp的老派相对容易.

I'm trying to work out if my problem is solvable using the builtin sorted() function or if I need to do myself - old school using cmp would have been relatively easy.

我的数据集如下:


x = [
('business', Set('fleet','address'))
('device', Set('business','model','status','pack'))
('txn', Set('device','business','operator'))
....

排序规则基本上应适用于所有N& Y,其中Y> N,x [N] [0]不在x [Y] [1]

The sort rule should be basically for all value of N & Y where Y > N, x[N][0] not in x[Y][1]

尽管我使用的python 2.6的cmp参数仍然可用,但我仍在尝试使此Python 3安全.

Although I'm using Python 2.6 where the cmp argument is still available I'm trying to make this Python 3 safe.

那么,可以使用一些lambda魔术和key参数来完成此操作吗?

So, can this be done using some lambda magic and the key argument?

-==更新==-

感谢Eli&温斯顿!我真的不认为使用钥匙会行得通,或者我是否怀疑这不是理想的鞋拔解决方案.

Thanks Eli & Winston! I didn't really think using key would work, or if it could I suspected it would be a shoe horn solution which isn't ideal.

因为我的问题是数据库表的依赖关系,所以我不得不对Eli的代码进行少量补充,以从依赖关系列表中删除一项(在一个设计良好的数据库中,这不会发生,但是谁住在那个神奇的完美世界中?)

Because my problem was for database table dependencies I had to make a minor addition to Eli's code to remove an item from its list of dependencies (in a well designed database this wouldn't happen, but who lives in that magical perfect world?)

我的解决方案:

def topological_sort(source):
    """perform topo sort on elements.

    :arg source: list of ``(name, set(names of dependancies))`` pairs
    :returns: list of names, with dependancies listed first
    """
    pending = [(name, set(deps)) for name, deps in source]        
    emitted = []
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(set((name,)), emitted) # <-- pop self from dep, req Py2.6
            if deps:
                next_pending.append(entry)
            else:
                yield name
                emitted.append(name) # <-- not required, but preserves original order
                next_emitted.append(name)
        if not next_emitted:
            raise ValueError("cyclic dependancy detected: %s %r" % (name, (next_pending,)))
        pending = next_pending
        emitted = next_emitted

推荐答案

您想要的东西称为拓扑排序.虽然可以使用内置的sort()来实现,但比较麻烦,最好直接在python中实现拓扑排序.

What you want is called a topological sort. While it's possible to implement using the builtin sort(), it's rather awkward, and it's better to implement a topological sort directly in python.

为什么会很尴尬?如果您在Wiki页面上研究这两种算法,它们都依赖于一组运行的标记节点",很难理解为sort()形式的概念,因为key=xxx(甚至是cmp=xxx)最适合与无状态比较功能一起使用,尤其是因为timsort不能保证检查元素的顺序.我很确定确实使用sort()的所有解决方案都将最终解决为了避免无状态问题,每次调用key/cmp函数时都要冗余计算一些信息.

Why is it going to be awkward? If you study the two algorithms on the wiki page, they both rely on a running set of "marked nodes", a concept that's hard to contort into a form sort() can use, since key=xxx (or even cmp=xxx) works best with stateless comparison functions, particularly because timsort doesn't guarantee the order the elements will be examined in. I'm (pretty) sure any solution which does use sort() is going to end up redundantly calculating some information for each call to the key/cmp function, in order to get around the statelessness issue.

以下是我一直在使用的算法(对一些javascript库相关性进行排序):

The following is the alg I've been using (to sort some javascript library dependancies):

基于Winston Ewert的解决方案对此做了很大的修改

def topological_sort(source):
    """perform topo sort on elements.

    :arg source: list of ``(name, [list of dependancies])`` pairs
    :returns: list of names, with dependancies listed first
    """
    pending = [(name, set(deps)) for name, deps in source] # copy deps so we can modify set in-place       
    emitted = []        
    while pending:
        next_pending = []
        next_emitted = []
        for entry in pending:
            name, deps = entry
            deps.difference_update(emitted) # remove deps we emitted last pass
            if deps: # still has deps? recheck during next pass
                next_pending.append(entry) 
            else: # no more deps? time to emit
                yield name 
                emitted.append(name) # <-- not required, but helps preserve original ordering
                next_emitted.append(name) # remember what we emitted for difference_update() in next pass
        if not next_emitted: # all entries have unmet deps, one of two things is wrong...
            raise ValueError("cyclic or missing dependancy detected: %r" % (next_pending,))
        pending = next_pending
        emitted = next_emitted


旁注:可以cmp()函数插入到key=xxx中,如该python bug跟踪程序


Sidenote: it is possible to shoe-horn a cmp() function into key=xxx, as outlined in this python bug tracker message.

这篇关于Python:对依赖项列表进行排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆