Swift使用哪种通用排序算法?它在排序的数据上表现不佳 [英] Which general purpose sorting algorithm does Swift use? It does not perform well on sorted data

查看:121
本文介绍了Swift使用哪种通用排序算法?它在排序的数据上表现不佳的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在使用Swift标准库sort()函数的Array类型进行探测.令我惊讶的是,我注意到它在已经分类的数据上表现不佳.

I have been picking and probing at Swift standard libraries sort() function for its Array type. To my surprise I have noticed it performs poorly on already-sorted data.

对经过混洗的Int数组进行排序似乎比对已经排序的相同数组进行排序要快5倍.与按排序顺序对同一对象进行排序相比,对经过改组的对象数组进行排序的速度大约快4倍(我确信对对象数组与Int数组进行排序使用了不同的算法,所以我对两者进行了排序以消除偏差).

Sorting an array of Int which is shuffled seems to be 5x faster than sorting that very same array when it is already sorted. Sorting an array of shuffled objects is about 4x faster than sorting the very same one already in sorted order (sorting object array vs Int array use different algorithms I am sure so I sorted both to eliminate bias).

这些是结果:

Shuffled Int array sort time: 1.3961209654808
Shuffled ColorObject array sort time: 3.14633798599243
NOnshuffled Int array sort time: 7.34714204072952
NOnshuffled ColorObject array sort time: 10.9310839772224

以下是我的代码供参考:

For reference below is my code:

class ElapsedTimer {

    let startTime: CFAbsoluteTime
    var endTime: CFAbsoluteTime?

    init() {
        startTime = CFAbsoluteTimeGetCurrent()
    }

    func stop() -> CFAbsoluteTime {
        endTime = CFAbsoluteTimeGetCurrent()
        return duration!
    }

    var duration: CFAbsoluteTime? {
        if let endTime = endTime {
            return endTime - startTime
        } else {
            return nil
        }
    }
}

public class CountedColor {
    public private(set) var count: Int
    public private(set) var color: UIColor

    public init(color: UIColor, colorCount: Int) {
        self.count = colorCount
        self.color = color
    }
}

var distributedIntArray = [Int]()

for value in 1..<1000000 {
    distributedIntArray.append(value)
}

var distributedCountedColorArray = distributedIntArray.map{ CountedColor(color: UIColor.white, colorCount: $0) }

distributedCountedColorArray.shuffle()
distributedIntArray.shuffle()

var timer = ElapsedTimer()
distributedIntArray.sort()
print("Shuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedCountedColorArray.sort{ return $0.count < $1.count }
print("Shuffled Color array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedIntArray.sort()
print("NOnshuffled Int array sort time: \(timer.stop())")
timer = ElapsedTimer()

distributedCountedColorArray.sort{ return $0.count < $1.count }
print("Non shuffled Color array sort time: \(timer.stop())")

我的数组shuffle()方法是从这篇文章中提取的.我的ElapsedTimer只是包装并使用CACurrentMediaTime()函数.

My array shuffle() method was pulled from this post. My ElapsedTimer simply wraps and uses CACurrentMediaTime() functions.

我的问题是为什么我会看到这种行为?尤其是当我对对象数组进行排序时,肯定应该使用通用排序. swift使用哪种通用排序算法?当然,最坏情况和平均情况不会像mergeSort一样.

My question is why am I seeing this behavior? Especially when I am sorting the object array which should surely be using a general purpose sort. What kind of general purpose sorting algorithm is swift using? It surely can’t be one where the worst case and average case are the same like mergeSort.

推荐答案

Swift使用 Introsort .看着源代码,我们看到了所选的枢轴是第一个元素. Introsort上的Wikipedia页面上说:

Swift uses Introsort. Looking at the source code we see that the chosen pivot is the first element. The wikipedia page on Introsort says:

(...),关键操作之一是选择枢轴: 围绕列表进行分区的元素.最简单的枢轴 选择算法是采用第一个或最后一个元素 列表作为枢纽,导致排序或排序情况下的不良行为 几乎排序的输入.

(...), one of the critical operations is choosing the pivot: the element around which the list is partitioned. The simplest pivot selection algorithm is to take the first or the last element of the list as the pivot, causing poor behavior for the case of sorted or nearly sorted input.

因此,在给定实现选择的情况下,Swift的排序性能对于排序后的输入最差是完全可以预见的.

Thus it is entirely predictable, given the implementation choice, that Swift's sorting performance is worst for sorted inputs.

我为想要轻松复制OP声明的人建立了完整的基准: https://github .com/lemire/Daniel-Lemire-s-blog/tree/master/extra/swift/sort

I have built a complete benchmark for people who want to easily reproduce the OP's claims : https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/sort

作为参考,GNU ISO C ++标准库使用3位中值枢轴(根据stl_algo.h标头).

For reference, the GNU ISO C++ standard library uses a median-of-3 pivot (as per the stl_algo.h header).

这篇关于Swift使用哪种通用排序算法?它在排序的数据上表现不佳的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆