Why is my MergeSort so slow in Python?


Problem description


I'm having some troubles understanding this behaviour. I'm measuring the execution time with the timeit-module and get the following results for 10000 cycles:

  • Merge sort: 1.22722930395
  • Bubble sort: 0.810706578175
  • Selection sort: 0.469924766812

This is my code for the merge sort:

def mergeSort(array):
    if len(array) <= 1:
        return array
    else:
        left = array[:len(array) // 2]   # // keeps the slice index an int (works on Python 2 and 3)
        right = array[len(array) // 2:]
        return merge(mergeSort(left), mergeSort(right))

def merge(array1,array2):
    merged_array=[]
    while len(array1) > 0 or len(array2) > 0:

        if array2 and not array1:
            merged_array.append(array2.pop(0))

        elif (array1 and not array2) or array1[0] < array2[0]:
            merged_array.append(array1.pop(0))

        else:
            merged_array.append(array2.pop(0))
    return merged_array

Edit


I've changed the list operations to use pointers and my tests now work with a list of 1000 random numbers from 0-1000. (btw: I changed to only 10 cycles here)

Results:

  • Merge sort: 0.0574434420723
  • Bubble sort: 1.74780097558
  • Selection sort: 0.362952293025

This is my rewritten definition of merge:

def merge(array1, array2):
    merged_array = []
    pointer1, pointer2 = 0, 0
    while pointer1 < len(array1) and pointer2 < len(array2):
        if array1[pointer1] < array2[pointer2]:
            merged_array.append(array1[pointer1])
            pointer1 += 1
        else:
            merged_array.append(array2[pointer2])
            pointer2 += 1
    while pointer1 < len(array1):
        merged_array.append(array1[pointer1])
        pointer1 += 1

    while pointer2 < len(array2):
        merged_array.append(array2[pointer2])
        pointer2 += 1

    return merged_array


seems to work pretty well now :)
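As a quick sanity check (my addition, with the helpers restated compactly so the snippet runs on its own), the pointer-based merge can be plugged into a recursive mergesort and compared against Python's built-in sorted():

```python
import random

# Compact restatement of the pointer-based merge above: the tail of
# whichever list remains is copied with extend() instead of a while loop.
def merge(array1, array2):
    merged, p1, p2 = [], 0, 0
    while p1 < len(array1) and p2 < len(array2):
        if array1[p1] < array2[p2]:
            merged.append(array1[p1])
            p1 += 1
        else:
            merged.append(array2[p2])
            p2 += 1
    merged.extend(array1[p1:])  # at most one of these two tails is non-empty
    merged.extend(array2[p2:])
    return merged

def merge_sort(array):
    if len(array) <= 1:
        return array
    mid = len(array) // 2
    return merge(merge_sort(array[:mid]), merge_sort(array[mid:]))

data = [random.randrange(1000) for _ in range(1000)]
print(merge_sort(data) == sorted(data))  # True
```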

Answer


For starters: I cannot reproduce your timing results, on 100 cycles and lists of size 10000. The exhaustive benchmark with timeit of all implementations discussed in this answer (including bubblesort and your original snippet) is posted as a gist here. I find the following results for the average duration of a single run:

  • Python's native (Tim)sort: 0.0144600081444
  • Bubblesort: 26.9620819092
  • (Your) original Mergesort: 0.224888720512


Now, to make your function faster, you can do a few things.


  • Edit: Well, apparently, I was wrong on that one (thanks cwillu). Length computation is O(1) in Python. But removing the useless computation everywhere still improves things a bit (Original Mergesort: 0.224888720512, no-length Mergesort: 0.195795390606):

def nolenmerge(array1, array2):
    merged_array = []
    while array1 or array2:
        if not array1:
            merged_array.append(array2.pop(0))
        elif (not array2) or array1[0] < array2[0]:
            merged_array.append(array1.pop(0))
        else:
            merged_array.append(array2.pop(0))
    return merged_array

def nolenmergeSort(array):
    n = len(array)
    if n <= 1:
        return array
    left = array[:n // 2]
    right = array[n // 2:]
    return nolenmerge(nolenmergeSort(left), nolenmergeSort(right))


  • Second, as suggested in this answer, pop(0) is linear. Rewrite your merge to pop() at the end:

    def fastmerge(array1,array2):
        merged_array=[]
        while array1 or array2:
            if not array1:
                merged_array.append(array2.pop())
            elif (not array2) or array1[-1] > array2[-1]:
                merged_array.append(array1.pop())
            else:
                merged_array.append(array2.pop())
        merged_array.reverse()
        return merged_array
    


    This is again faster: no-len Mergesort: 0.195795390606, no-len Mergesort+fastmerge: 0.126505711079
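The reason this matters: list.pop(0) shifts every remaining element one slot to the left, so draining a list from the front is quadratic overall, while pop() from the end is amortized constant time per call. A minimal sketch of the gap (my own example; absolute numbers are machine-dependent, and collections.deque with popleft() would be the O(1) way to keep consuming from the front):

```python
import timeit

# Drain a 10000-element list once from the front, once from the back.
front = timeit.timeit("while a: a.pop(0)",
                      setup="a = list(range(10000))", number=1)
back = timeit.timeit("while a: a.pop()",
                     setup="a = list(range(10000))", number=1)
print(front > back)  # True: the front drain does ~n^2/2 element shifts
```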


    Third - and this would only be useful as-is if you were using a language that does tail call optimization; without it, it's a bad idea - your mergeSort is not tail-recursive: it calls both (mergeSort left) and (mergeSort right) recursively while there is remaining work in the call (the merge).


    But you can make the merge tail-recursive by using CPS (this will run out of stack size for even modest lists if you don't do TCO):

    def cps_merge_sort(array):
        return cpsmergeSort(array,lambda x:x)
    
    def cpsmergeSort(array, continuation):
        n = len(array)
        if n <= 1:
            return continuation(array)
        left = array[:n // 2]
        right = array[n // 2:]
        return cpsmergeSort(left, lambda leftR:
                            cpsmergeSort(right, lambda rightR:
                                         continuation(fastmerge(leftR, rightR))))
    


    Once this is done, you can do TCO by hand to defer the call stack management done by recursion to the while loop of a normal function (trampolining, explained e.g. here, trick originally due to Guy Steele). Trampolining and CPS work great together.


    You write a thunking function, that "records" and delays application: it takes a function and its arguments, and returns a function that returns (that original function applied to those arguments).

    thunk = lambda name, *args: lambda: name(*args)
    


    You then write a trampoline that manages calls to thunks: it applies a thunk until the thunk returns a result (as opposed to another thunk)

    def trampoline(bouncer):
        while callable(bouncer):
            bouncer = bouncer()
        return bouncer
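Before wiring this into the mergesort, a tiny self-contained illustration (my own example) of how the two pieces cooperate: each "recursive" step returns a thunk instead of calling itself, so the trampoline's while loop replaces the call stack, and a depth that would exceed Python's default recursion limit runs fine:

```python
thunk = lambda name, *args: lambda: name(*args)

def trampoline(bouncer):
    # Keep applying thunks until a non-callable result appears.
    while callable(bouncer):
        bouncer = bouncer()
    return bouncer

def count_down(n, acc=0):
    if n == 0:
        return acc
    return thunk(count_down, n - 1, acc + 1)  # defer instead of recursing

# 100000 nested "calls" without growing the Python stack:
print(trampoline(thunk(count_down, 100000)))  # 100000
```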
    


    Then all that's left is to "freeze" (thunk) all your recursive calls from the original CPS function, to let the trampoline unwrap them in proper sequence. Your function now returns a thunk, without recursion (and discarding its own frame), at every call:

    def tco_cpsmergeSort(array, continuation):
        n = len(array)
        if n <= 1:
            return continuation(array)
        left = array[:n // 2]
        right = array[n // 2:]
        return thunk(tco_cpsmergeSort, left, lambda leftR:
                     thunk(tco_cpsmergeSort, right, lambda rightR:
                           continuation(fastmerge(leftR, rightR))))

    mycpomergesort = lambda l: trampoline(tco_cpsmergeSort(l, lambda x: x))
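A standalone correctness check of the trampolined sort (my addition; every piece is restated so the snippet runs by itself):

```python
import random

# Restated thunk/trampoline pair and fastmerge from above.
thunk = lambda name, *args: lambda: name(*args)

def trampoline(bouncer):
    while callable(bouncer):
        bouncer = bouncer()
    return bouncer

def fastmerge(array1, array2):
    merged_array = []
    while array1 or array2:
        if not array1:
            merged_array.append(array2.pop())
        elif (not array2) or array1[-1] > array2[-1]:
            merged_array.append(array1.pop())
        else:
            merged_array.append(array2.pop())
    merged_array.reverse()
    return merged_array

def tco_cpsmergeSort(array, continuation):
    if len(array) <= 1:
        return continuation(array)
    left = array[:len(array) // 2]
    right = array[len(array) // 2:]
    return thunk(tco_cpsmergeSort, left, lambda leftR:
                 thunk(tco_cpsmergeSort, right, lambda rightR:
                       continuation(fastmerge(leftR, rightR))))

mycpomergesort = lambda l: trampoline(tco_cpsmergeSort(l, lambda x: x))

data = [random.randrange(1000) for _ in range(1000)]
print(mycpomergesort(list(data)) == sorted(data))  # True
```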
    


    Sadly this does not go that fast (recursive mergesort: 0.126505711079, this trampolined version: 0.170638551712). OK, I guess the stack blowup of the recursive merge sort algorithm is in fact modest: as soon as you get out of the leftmost path in the array-slicing recursion pattern, the algorithm starts returning (& removing frames). So for 10K-sized lists, you get a function stack of at most log_2(10 000) = 14 ... pretty modest.


    You can do a slightly more involved stack-based TCO elimination, along the lines of what this SO answer gives:

        def leftcomb(l):
            maxn, leftcomb = len(l), []
            n = maxn // 2
            while maxn > 1:
                leftcomb.append((l[n:maxn], False))
                maxn, n = n, n // 2
            return l[:maxn], leftcomb
    
        def tcomergesort(l):
            l,stack = leftcomb(l)
            while stack: # l sorted, stack contains tagged slices
                i,ordered = stack.pop()
                if ordered:
                    l = fastmerge(l,i)
                else:
                    stack.append((l,True)) # store return call
                    rsub,ssub = leftcomb(i)
                    stack.extend(ssub) #recurse
                    l = rsub
            return l
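As an aside (my own addition, not from the original answer): the standard bottom-up formulation reaches the same goal - a recursion-free mergesort - without CPS or an explicit stack, by merging adjacent runs of width 1, 2, 4, ... in a plain loop:

```python
# Bottom-up mergesort: no recursion, no trampolining, no explicit stack.
def merge_runs(a, b):
    # Plain two-pointer merge of two sorted lists; <= keeps it stable.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def bottom_up_mergesort(l):
    l = list(l)
    width = 1
    while width < len(l):
        merged = []
        # Merge each pair of adjacent runs of the current width.
        for i in range(0, len(l), 2 * width):
            merged.extend(merge_runs(l[i:i + width], l[i + width:i + 2 * width]))
        l = merged
        width *= 2
    return l
```

Each pass over the list doubles the run width, so there are about log2(n) passes of O(n) merging work each, the same asymptotic cost as the recursive version.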
    


    But this goes only a tad faster (trampolined mergesort: 0.170638551712, this stack-based version: 0.144994809628). Apparently, the stack management Python does for the recursive calls of our original merge sort is pretty inexpensive.


    The final results? On my machine (Ubuntu natty's stock Python 2.7.1+), the average run timings (out of 100 runs - except for Bubblesort -, lists of size 10000, containing random integers of size 0-10000000) are:

    • Python's native (Tim)sort: 0.0144600081444
    • Bubblesort: 26.9620819092
    • Original Mergesort: 0.224888720512
    • no-len Mergesort: 0.195795390606
    • no-len Mergesort + fastmerge: 0.126505711079
    • trampolined CPS Mergesort + fastmerge: 0.170638551712
    • stack-based Mergesort + fastmerge: 0.144994809628
