为什么冒泡排序效率不高? [英] Why bubble sort is not efficient?

查看:47
本文介绍了为什么冒泡排序效率不高?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用 node.js 开发后端项目,并将实现排序产品功能.我研究了一些文章,有几篇文章说冒泡排序效率不高.冒泡排序在我之前的项目中使用过,我很惊讶为什么它很糟糕.谁能解释为什么它效率低下?如果您能用 c 编程或汇编命令解释,将不胜感激.

I am developing backend project using node.js and going to implement sorting products functionality. I researched some articles and there were several articles saying bubble sort is not efficient. Bubble sort was used in my previous projects and I was surprised why it is bad. Could anyone explain about why it is inefficient? If you can explain by c programming or assembler commands it would be much appreciated.

推荐答案

冒泡排序的时间复杂度为 O(N^2),因此与 O(N log N) 排序相比,它对于大型数组来说是垃圾.

Bubble Sort has O(N^2) time complexity so it's garbage for large arrays compared to O(N log N) sorts.

在 JS 中,如果可能,请使用 JS 运行时可能能够通过预编译自定义代码处理的内置排序函数,而不必对排序函数进行 JIT 编译.标准库排序应该(通常?)为 JS 解释器/JIT 进行良好的调整,以有效处理,并使用有效算法的有效实现.

In JS, if possible use built-in sort functions that the JS runtime might be able to handle with pre-compiled custom code, instead of having to JIT-compile your sort function. The standard library sort should (usually?) be well-tuned for the JS interpreter / JIT to handle efficiently, and use an efficient implementation of an efficient algorithm.

这个答案的其余部分假设了一个用例,比如在预先编译的语言中对整数数组进行排序,比如编译为原生 asm 的 C.如果您对以一个成员为键的结构数组进行排序,则不会有太大变化,尽管如果您对 char* 字符串与包含 <代码>int.(冒泡排序对于所有这些交换的情况都是不利的.)

The rest of this answer is assuming a use-case like sorting an array of integers in an ahead-of-time compiled language like C compiled to native asm. Not much changes if you're sorting an array of structs with one member as the key, although cost of compare vs. swap can vary if you're sorting char* strings vs. large structs containing an int. (Bubble Sort is bad for any of these cases with all that swapping.)

请参阅冒泡排序:考古算法分析 更多关于它为什么受欢迎"的信息;(或广泛教授/讨论)尽管是最糟糕的 O(N^2) 类型之一,包括一些历史/教学法的事故.还包括一个有趣的定量分析,它实际上(正如有时声称的那样)是否是使用几个代码指标最容易编写或理解的方法之一.

See Bubble Sort: An Archaeological Algorithmic Analysis for more about why it's "popular" (or widely taught / discussed) despite being one the worst O(N^2) sorts, including some accidents of history / pedagogy. Also including an interesting quantitative analysis of whether it's actually (as sometimes claimed) one of the easiest to write or understand using a couple code metrics.

对于简单的 O(N^2) 排序是合理选择的小问题(例如,快速排序或合并排序的 N <= 32 元素基本情况),通常使用插入排序,因为它具有良好的最佳-case 性能(在已经排序的情况下快速通过,在几乎排序的情况下效率高).

For small problems where a simple O(N^2) sort is a reasonable choice (e.g. the N <= 32 element base case of a Quick Sort or Merge Sort), Insertion Sort is often used because it has good best-case performance (one quick pass in the already-sorted case, and efficient in almost-sorted cases).

冒泡排序(对于没有进行任何交换的传递提前退出)在一些几乎排序的情况下也不可怕,但比插入排序更糟糕.但是一个元素每次只能向列表的前面移动一步,所以如果最小的元素接近末尾但完全排序,它仍然需要冒泡排序 O(N^2) 工作.维基百科解释了兔子和海龟.

A Bubble Sort (with an early-out for a pass that didn't do any swaps) is also not horrible in some almost-sorted cases but is worse than Insertion Sort. But an element can only move toward the front of the list one step per pass, so if the smallest element is near the end but otherwise fully sorted, it still takes Bubble Sort O(N^2) work. Wikipedia explains Rabbits and turtles.

插入排序没有这个问题:接近末尾的一个小元素一旦到达就会被有效地插入(通过复制前面的元素来打开一个间隙).(并且达到它只需要比较已经排序的元素来确定并继续执行零实际插入工作).开始附近的大元素最终会快速向上移动,只需要稍微多一些工作:每个要检查的新元素都必须在该大元素之前插入,毕竟其他元素.所以这是两次比较并且实际上是一次交换,不像冒泡排序每一步交换一次,它是好"的.方向.尽管如此,插入排序的坏方向还是比冒泡排序的坏"方向好得多.方向.

Insertion Sort doesn't have this problem: a small element near the end will get inserted (by copying earlier elements to open up a gap) efficiently once it's reached. (And reaching it only requires comparing already-sorted elements to determine that and move on with zero actual insertion work). A large element near the start will end up moving upwards quickly, with only slightly more work: each new element to be examined will have to be inserted before that large element, after all others. So that's two compares and effectively a swap, unlike the one swap per step Bubble Sort would do in it's "good" direction. Still, Insertion Sort's bad direction is vastly better than Bubble Sort's "bad" direction.

有趣的事实:在真实 CPU 上进行小数组排序的最新技术可以包括使用压缩最小/最大指令的 SIMD 网络排序,以及用于执行多个比较器"的向量混洗;并行.

Fun fact: state of the art for small-array sorting on real CPUs can include SIMD Network Sorts using packed min/max instructions, and vector shuffles to do multiple "comparators" in parallel.

交换模式可能比插入排序更随机,并且对于 CPU 分支预测器来说更难以预测.因此导致比插入排序更多的分支错误预测.

The pattern of swapping is probably more random than Insertion Sort, and less predictable for CPU branch predictors. Thus leading to more branch mispredicts than Insertion Sort.

我自己还没有测试过,但想想插入排序如何移动数据:内循环的每次完整运行都会将一组元素向右移动,为新元素打开一个间隙.该组的大小可能在外循环迭代中保持相当稳定,因此有合理的机会预测该内循环中循环分支的模式.

I haven't tested this myself, but think about how Insertion Sort moves data: each full run of the inner loop moves a group of elements to the right to open up a gap for a new element. The size of that group might stay fairly constant across outer-loop iterations so there's a reasonable chance of predicting the pattern of the loop branch in that inner loop.

但是冒泡排序并没有对部分排序的组进行太多创建;交换模式不太可能重复1.

But Bubble Sort doesn't do so much creation of partially-sorted groups; the pattern of swapping is unlikely to repeat1.

我搜索了我刚刚编造的这个猜测的支持,并确实找到了一些:插入排序比冒泡排序更好?引用维基百科:

I searched for support for this guess I just made up, and did find some: Insertion sort better than Bubble sort? quotes Wikipedia:

冒泡排序与现代 CPU 硬件的交互也很差.它产生的写入次数至少是插入排序的两倍,缓存未命中次数是其两倍,并且渐近地更多分支预测错误.

Bubble sort also interacts poorly with modern CPU hardware. It produces at least twice as many writes as insertion sort, twice as many cache misses, and asymptotically more branch mispredictions.

(IDK 如果写入次数"是基于源的幼稚分析,或者查看经过适当优化的 asm):

(IDK if that "number of writes" was naive analysis based on the source, or looking at decently optimized asm):

这就引出了另一点:冒泡排序很容易编译成低效的代码.交换的概念实现实际上存储到内存中,然后重新读取它刚刚写入的元素.根据编译器的智能程度,这实际上可能发生在 asm 中,而不是在下一次循环迭代中在寄存器中重用该值.在这种情况下,您将在内部循环中存在存储转发延迟,从而创建循环携带的依赖链.并且还会对缓存读取端口/加载指令吞吐量造成潜在瓶颈.

That brings up another point: Bubble Sort can very easily compile into inefficient code. The notional implementation of swapping actually stores into memory, then re-reads that element it just wrote. Depending on how smart your compiler is, this might actually happen in the asm instead of reusing that value in a register in the next loop iteration. In that case, you'd have store-forwarding latency inside the inner loop, creating a loop-carried dependency chain. And also creating a potential bottleneck on cache read ports / load instruction throughput.

脚注 1: 除非您重复对同一个小数组进行排序;我在我的 Skylake CPU 上尝试过一次,使用我为 这个代码高尔夫问题(代码高尔夫版本故意让性能变得糟糕,仅针对机器代码大小进行了优化;IIRC 我基准测试的版本避免了存储转发stallslocked 指令,如 xchg mem,reg).

Footnote 1: Unless you're sorting the same tiny array repeatedly; I tried that once on my Skylake CPU with a simplified x86 asm implementation of Bubble Sort I wrote for this code golf question (the code-golf version is intentionally horrible for performance, optimized only for machine-code size; IIRC the version I benchmarked avoided store-forwarding stalls and locked instructions like xchg mem,reg).

我发现每次使用相同的输入数据(在重复循环中复制一些 SIMD 指令),Skylake 中的 IT-TAGE 分支预测器学习"了特定 ~13 元素冒泡排序的整个分支模式,导致 perf stat 报告低于 1% 的分支错误预测,IIRC.所以它毕竟没有证明我对冒泡排序的大量错误预测,直到我增加了一些数组大小.:P

I found that with the same input data every time (copied with a few SIMD instructions in a repeat loop), the IT-TAGE branch predictors in Skylake "learned" the whole pattern of branching for a specific ~13-element Bubble Sort, leading to perf stat reporting under 1% branch mispredicts, IIRC. So it didn't demonstrate the tons of mispredicts I was expecting from Bubble Sort after all, until I increased the array size some. :P

这篇关于为什么冒泡排序效率不高?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆