How can I vectorize this python count sort so it is absolutely as fast as it can be?
Question
I am trying to write a count sort in python to beat the built-in timsort in certain situations. Right now it beats the built-in sorted function, but only for very large arrays (1 million integers and longer; I haven't tried over 10 million) and only when the range of values is no larger than 10,000. Additionally, the victory is narrow, with count sort only winning by a significant margin on random lists specifically tailored to it.
I have read about astounding performance gains that can be gained from vectorizing python code, but I don't particularly understand how to do it or how it could be used here. I would like to know how I can vectorize this code to speed it up, and any other performance suggestions are welcome.
Current fastest version for just python and stdlibs:
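from itertools import chain, repeat

def untimed_countsort(unsorted_list):
    # Tally how many times each value occurs.
    counts = {}
    for num in unsorted_list:
        try:
            counts[num] += 1
        except KeyError:
            counts[num] = 1
    # Emit each value in ascending order, repeated by its count.
    # counts.get(num, 0) avoids a KeyError for values missing from the input.
    # xrange is Python 2; use range on Python 3.
    sorted_list = list(
        chain.from_iterable(
            repeat(num, counts.get(num, 0))
            for num in xrange(min(counts), max(counts) + 1)))
    return sorted_list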
All that counts is raw speed here, so sacrificing even more space for speed gains is completely fair game.
I realize the code is fairly short and clear already, so I don't know how much room there is for improvement in speed.
If anyone has a change to the code to make it shorter, as long as it doesn't make it slower, that would be awesome as well.
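On the point about trading space for speed: one pure-python variant is to swap the dict for a preallocated list of counts indexed by value minus the minimum. This is only a sketch, assuming Python 3 and integer input; the name list_countsort is made up for illustration, and whether it actually beats the dict version needs to be timed on your own data.

```python
def list_countsort(values):
    """Counting sort with a preallocated list of counts (illustrative sketch)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    # One slot per possible value in [lo, hi]; this costs memory
    # proportional to the range, not to the number of distinct values.
    counts = [0] * (hi - lo + 1)
    for v in values:
        counts[v - lo] += 1
    # Rebuild the sorted list by repeating each value by its count.
    result = []
    for offset, n in enumerate(counts):
        result.extend([lo + offset] * n)
    return result

print(list_countsort([3, -1, 2, 3, 0]))  # [-1, 0, 2, 3, 3]
```

Subtracting the minimum keeps the list no longer than the range of the data, and it also lets negative integers through.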
Execution time is down almost 80%! Now three times as fast as Timsort on my current tests!
The absolute fastest way to do this by a LONG shot is using this one-liner with numpy:
import numpy

def np_sort(unsorted_np_array):
    return numpy.repeat(numpy.arange(1 + unsorted_np_array.max()), numpy.bincount(unsorted_np_array))
This runs about 10-15 times faster than the pure python version, and about 40 times faster than Timsort. It takes a numpy array in and outputs a numpy array.
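For anyone who wants to check numbers like these on their own machine, here is a rough timing sketch. The array size, value range, seed, and repeat count are assumptions picked to match the sizes discussed above, not the original benchmark, so the ratios you see will differ.

```python
import timeit

import numpy


def np_sort(unsorted_np_array):
    return numpy.repeat(numpy.arange(1 + unsorted_np_array.max()),
                        numpy.bincount(unsorted_np_array))


# Assumed test data: 1 million non-negative ints drawn from [0, 10000).
rng = numpy.random.RandomState(0)
data = rng.randint(0, 10000, size=1000000)
as_list = data.tolist()  # sorted() is timed on a plain python list

# Sanity check before timing: both approaches must agree.
assert numpy.array_equal(np_sort(data), numpy.sort(data))

# Take the best of a few runs to reduce noise.
t_count = min(timeit.repeat(lambda: np_sort(data), number=1, repeat=3))
t_timsort = min(timeit.repeat(lambda: sorted(as_list), number=1, repeat=3))
print("numpy count sort: %.4f s" % t_count)
print("built-in sorted:  %.4f s" % t_timsort)
```

Note that the two calls are timed on different container types (a numpy array versus a list), so any conversion cost between them is left out of the comparison.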
Recommended answer
With numpy, this function reduces to the following:
import numpy

def countsort(unsorted):
    unsorted = numpy.asarray(unsorted)
    return numpy.repeat(numpy.arange(1 + unsorted.max()), numpy.bincount(unsorted))
This ran about 40 times faster when I tried it on 100000 random ints from the interval [0, 10000). bincount does the counting, and repeat converts from counts to a sorted array.
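To make the two steps concrete, here is a small worked example showing the intermediate arrays. numpy.bincount only accepts non-negative integers, so the sketch also includes a variant that shifts the data by its minimum first; countsort_offset is a hypothetical name for illustration and is not part of the answer above.

```python
import numpy

data = numpy.array([3, 1, 2, 3, 0, 1])

# Step 1: bincount tallies occurrences of each value 0..max.
counts = numpy.bincount(data)            # [1, 2, 1, 2]
# Step 2: repeat emits value i exactly counts[i] times, in order.
values = numpy.arange(counts.size)       # [0, 1, 2, 3]
sorted_data = numpy.repeat(values, counts)
print(sorted_data.tolist())              # [0, 1, 1, 2, 3, 3]


def countsort_offset(unsorted):
    """Variant that tolerates negative ints by shifting by the minimum."""
    arr = numpy.asarray(unsorted)
    if arr.size == 0:
        return arr.copy()
    lo = arr.min()
    counts = numpy.bincount(arr - lo)    # shifted values are all >= 0
    return numpy.repeat(numpy.arange(lo, lo + counts.size), counts)


print(countsort_offset([2, -3, 0, -3]).tolist())  # [-3, -3, 0, 2]
```

The shift also keeps the counts array as short as the range of the data when the minimum is far from zero.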