How can I vectorize this Python count sort so it is absolutely as fast as it can be?


Question

I am trying to write a count sort in Python to beat the built-in Timsort in certain situations. Right now it beats the built-in sorted function, but only for very large arrays (1 million integers and longer; I haven't tried more than 10 million) and only when the range of values is no larger than 10,000. Additionally, the victory is narrow: count sort only wins by a significant margin on random lists specifically tailored to it.

I have read about the astounding performance gains that can come from vectorizing Python code, but I don't really understand how to do it or how it could be applied here. I would like to know how I can vectorize this code to speed it up; any other performance suggestions are welcome.

Current fastest version using only Python and the standard library:

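from itertools import chain, repeat

def untimed_countsort(unsorted_list):
    # Tally how many times each value occurs.
    counts = {}
    for num in unsorted_list:
        try:
            counts[num] += 1
        except KeyError:
            counts[num] = 1

    # Emit every value from min to max, repeated by its tally.
    # counts.get(num, 0) avoids a KeyError for values that never occurred.
    sorted_list = list(
        chain.from_iterable(
            repeat(num, counts.get(num, 0))
            for num in range(min(counts), max(counts) + 1)))  # xrange on Python 2
    return sorted_list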

  • All that counts here is raw speed, so sacrificing even more space for speed gains is completely fair game.

  • I realize the code is fairly short and clear already, so I don't know how much room there is for improvement in speed.

  • If anyone has a change that makes the code shorter without making it slower, that would be awesome as well.
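One way to trade space for speed, as the first point allows, is to tally into a preallocated list indexed by value instead of a dict. The following is only a sketch: list_countsort is a hypothetical name, it assumes the input is a list of integers, and whether it actually beats the dict version depends on the data, so it should be measured rather than assumed.

```python
def list_countsort(unsorted_list):
    """Counting sort that tallies into a preallocated list instead of a dict.

    Hypothetical variant for illustration; assumes a list of integers.
    """
    if not unsorted_list:
        return []
    lo = min(unsorted_list)
    hi = max(unsorted_list)
    # One slot per possible value in [lo, hi]; costs O(hi - lo) extra memory.
    counts = [0] * (hi - lo + 1)
    for num in unsorted_list:
        counts[num - lo] += 1
    # Rebuild the output: each value repeated by its tally, in ascending order.
    sorted_list = []
    for offset, count in enumerate(counts):
        sorted_list.extend([lo + offset] * count)
    return sorted_list
```

Subtracting lo keeps the list no larger than the range of values actually present, and it also lets the function accept negative integers.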

Execution time is down almost 80%! It is now three times as fast as Timsort on my current tests!

The absolute fastest way to do this, by a LONG shot, is this one-liner with numpy:

import numpy

def np_sort(unsorted_np_array):
    # bincount tallies each value; repeat expands the tallies in ascending order.
    return numpy.repeat(numpy.arange(1 + unsorted_np_array.max()), numpy.bincount(unsorted_np_array))

This runs about 10-15 times faster than the pure Python version, and about 40 times faster than Timsort. It takes a numpy array as input and returns a numpy array.
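Speed-ups like these depend heavily on the machine, the array length, and the range of values, so it is worth measuring them locally. Below is a minimal timing sketch using the standard timeit module. The compare helper and its default parameters are assumptions chosen for illustration, not the setup used for the numbers above.

```python
import random
import timeit

import numpy


def np_sort(unsorted_np_array):
    # Same one-liner as above, repeated so this sketch is self-contained.
    return numpy.repeat(numpy.arange(1 + unsorted_np_array.max()),
                        numpy.bincount(unsorted_np_array))


def compare(n=100000, value_range=10000, repeats=5):
    """Return best-of-`repeats` timings in seconds for sorted() and np_sort()."""
    data = [random.randrange(value_range) for _ in range(n)]
    array = numpy.array(data)
    # Check that both approaches agree before timing them.
    assert np_sort(array).tolist() == sorted(data)
    builtin_time = min(timeit.repeat(lambda: sorted(data), number=1, repeat=repeats))
    numpy_time = min(timeit.repeat(lambda: np_sort(array), number=1, repeat=repeats))
    return builtin_time, numpy_time
```

Calling compare() returns the two timings as a tuple. The conversion from a Python list to a numpy array is deliberately left out of the timed region, which matches the note that np_sort takes a numpy array as input.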

Recommended answer

With numpy, this function reduces to the following:

import numpy

def countsort(unsorted):
    unsorted = numpy.asarray(unsorted)
    return numpy.repeat(numpy.arange(1 + unsorted.max()), numpy.bincount(unsorted))

This ran about 40 times faster when I tried it on 100,000 random ints from the interval [0, 10000). bincount does the counting, and repeat converts the counts into a sorted array.
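To make the two steps concrete, here is a small worked example, followed by a variant that shifts the values by their minimum. numpy.bincount only accepts non-negative integers, so the shift is one possible way to handle negative input. countsort_with_offset is a hypothetical helper, not part of the answer above.

```python
import numpy


def countsort_with_offset(unsorted):
    """Counting sort for integers that may be negative (hypothetical helper)."""
    unsorted = numpy.asarray(unsorted)
    if unsorted.size == 0:
        return unsorted.copy()
    lo = unsorted.min()
    # Shift values so the smallest becomes 0, because bincount rejects negatives.
    counts = numpy.bincount(unsorted - lo)
    # counts[i] is how many times the value lo + i occurs.
    return numpy.repeat(numpy.arange(lo, lo + len(counts)), counts)


counts = numpy.bincount(numpy.array([3, 1, 3, 0]))
print(counts.tolist())  # [1, 1, 0, 2]: one 0, one 1, no 2s, two 3s
print(numpy.repeat(numpy.arange(4), counts).tolist())  # [0, 1, 3, 3]
print(countsort_with_offset([2, -1, 2, -3]).tolist())  # [-3, -1, 2, 2]
```

The shift also keeps the counts array as small as the range of values actually present, which helps when the data is clustered far from zero.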
