usort()排序算法如何工作? [英] How the usort() sorting algorithm works?

查看:86
本文介绍了usort()排序算法如何工作?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个usort()示例,并添加了一些echo语句以查看代码的工作方式:

<?php
function list_cmp($a, $b) {
    global $order;
    echo "\$a=$a, \$b=$b </br>";

    foreach ($order as $key => $value) {
        echo "\$value=$value </br>";
        if ($a == $value) {
            echo "\$a=\$value, returing 0. </br>";
            return 0;
        }
        if ($b == $value) {
            echo "\$b=\$value, returing 1. </br>";
            return 1;
        }
    }
}

$order[0] = 1;
$order[1] = 3;
$order[2] = 4;
$order[3] = 2;

$array[0] = 2;
$array[1] = 1;
$array[2] = 3;
$array[3] = 4;
$array[4] = 2;
array[5] = 1;
$array[6] = 2;

usort($array, "list_cmp");
?>

代码的输出是这样的:

$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=2, $b=3 
$value=1 
$value=3 
$b=$value, returing 1. 
$a=1, $b=3 
$value=1 
$a=$value, returing 0. 
$a=2, $b=4 
$value=1 
$value=3 
$value=4 
$b=$value, returing 1. 
$a=3, $b=4 
$value=1 
$value=3 
$a=$value, returing 0. 
$a=2, $b=2 
$value=1 
$value=3 
$value=4 
$value=2 
$a=$value, returing 0. 
$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=4, $b=1 
$value=1 
$b=$value, returing 1. 
$a=3, $b=1 
$value=1 
$b=$value, returing 1. 
$a=1, $b=1 
$value=1 
$a=$value, returing 0. 
$a=2, $b=2 
$value=1 
$value=3 
$value=4 
$value=2 
$a=$value, returing 0. 

创建12对$ a- $ b对的机制是什么-2-1、2-3、1-3、2-4、3-4、2-2、2-1、2-1(同样吗?),4-1、3-1、1-1、2-2.上面的代码返回1,1,0,1,0,0,1,1,1,1,0,0.还有,根据返回的值对数组进行排序的机制是什么?我正在尝试了解usort()机制的工作方式.

谢谢.

解决方案

  1. 比较器如何工作?

我不确定这是否是问题的一部分,但是要弄清楚比较器是如何工作的: 您有一个由有序列表$order指定的命令和一个特殊的比较器回调 list_cmp,该回调(应)返回所有参数

  • $a小于$b(return -1或值< 0)
  • $a大于$b(return 1或值> 0)
  • $a等于$b(return 0)

list_cmp通过查找其订单表而不是检查

来做到这一点
  • $a具有较小(或相等)的顺序,在这种情况下,循环会早于return 0或如果
  • 退出
  • $b具有较小的顺序,在这种情况下,循环会以return 1提前退出.

请注意,根据PHP文档,这是错误的,该文档指出希望将正/负/0作为返回值.仅当您知道内部仅检查comparator($a,$b) > 0时才是正确的,这意味着它仅检查$b是否小于且不等于$a,从而将其作为比较order of $a <= order of $b.如果代码开始检查$a小于且不等于$b,则很容易中断.

  1. 快速排序排序如何工作?

对于初学者来说,我假设您使用的是PHP 7或更高版本.在这种情况下,您会遇到6-15个元素大小的数组的特殊情况. PHP 7+似乎并没有在短列表中使用quicksort,而是使用了Insertion Sort Variant(由于硬件相关的东西(例如缓存/代码局部性),因此插入速度可能更快).您可以检查排序源代码f.e.在 Github PHP Mirror(例如:PHP 7.0)上Zend/Zend_sort.c第177-198行).

代码分3个步骤工作:

  1. 比较:比较相邻元素array[j]array[j+1],如果array[j] <= array[j+1]继续,则转到2.
  2. 查找插入点:现在,如果是array[j] > array[j+1],则向后扫描以找到array[x] < array[j+1] <= array[x+1]对应的x < j的点(显然只有在x击中起点之前)
  3. 插入:将元素x+1 ... j向上移动一个,使其变为x+2 ... j+1,然后将前一个元素插入位置x+1

如果您将该代码应用于配对(2-1、2-3、1-3、2-4、3-4、2-2、2-1、2-1、4-1、3- 1、1-1、2-2),那么代码的作用就显而易见了.

-- [2,1],3,4,2,1,2 -> 1./2./3. compare [2,1], find and insert 1 before 2
-- 1,[2,3],4,2,1,2 -> 1./2. compare [2,3], find insert point for 3 (since order of 3 < order of 2)
-- [1,3],2,4,2,1,2 -> 3. compare [1,3], found insert point for 3 before 2
-- 1,3,[2,4],2,1,2 -> 1./2. compare [2,4], find insert point for 4 (since order of 4 < order of 2) 
-- 1,[3,4],2,2,1,2 -> 3. compare [3,4], found insert point for 4 before 2
-- 1,3,4,[2,2],1,2 -> 1. compare [2,2], skip
-- 1,3,4,2,[2,1],2 -> 1./2. compare [2,1], find insert point for 1
-- 1,3,4,[2,1],2,2 -> 2. compare [2,1], find insert point for 1
-- 1,3,[4,1],2,2,2 -> 2. compare [4,1], find insert point for 1
-- 1,[3,1],4,2,2,2 -> 2. compare [3,1], find insert point for 1
-- [1,1],3,4,2,2,2 -> 3. compare [1,1], fond insert point for 1 before 3
-- 1,1,3,4,2,[2,2] -> 1. compare [2,2], skip
-- sorted: 1,1,3,4,2,2,2

PS: 在这里,您已经看到,通过比较模式推导即使是简单的排序算法(22行代码)的工作也非常复杂. PHP 7 quicksort的实现大约是代码行的10倍,并且进行了一些怪异的优化(除了由于数据透视选择和递归而导致的正常疯狂之外).

在大多数情况下,最好忽略深入的实现细节,而仅将其简化为所需的内容.对于排序算法,典型的问题是它是否稳定/不稳定并且在O(log n)中以O(n)的内存消耗执行.在诸如 Quicksort Dance 或任何其他可视化效果或一本很好的旧版(e)书或带有示例的网页中,有更简单的方法来学习这些优化实现背后的核心算法.

-已编辑

为插入排序添加了一个(错误,未优化,不安全的)php实现,以实现其工作方式的另一种可视化:

<?php

function my_usort($A, $comparator) {
  // Start .. End Positions
  $current_pos = 0;
  $last_pos = count($A)-1;
  // Outer Loop: each step checks that A[0] up to A[current_pos] is sorted. 
  // When the algorithm finishes we know that A[0] ... A[last_pos] is sorted
  while($current_pos < $last_pos) {
    echo "Sorted Subarray from \$A is " . json_encode(array_slice($A, 0, $current_pos+1)) . "<br>\n";
    echo "\$A looks like this now: " . json_encode($A) . 
    ", comparing [" . $A[$current_pos] . "," . $A[$current_pos +1] . "] (verify step)<br>\n";
    // "Verification Step"
    // At this point A[0] ... A[current_pos] is sorted.
    // Check A[current_pos] <= A[current_pos +1]
    if($comparator($A[$current_pos], $A[$current_pos +1]) > 0) {
      // nope, A[current_pos] > A[current_pos +1] (list_cmp/comparator returns value > 0)
      // "Insertion Step" start, find the correct position for A[current_pos+1] in the already
      // sorted list A[0] ... A[current_pos]
      $insert_point = $current_pos;
      // Swap the missmatching Neighbor pair
      echo "swapping: \$A[" . $insert_point . "] and \$A[" . ($insert_point+1) . "]<br>\n";
      $tmp = $A[$insert_point +1];
      $A[$insert_point +1] = $A[$insert_point];
      $A[$insert_point] = $tmp;
      $sorted_up_to_current_pos = false;
      // Inner Loop: find correct insertion point
      while($insert_point > 0 && !$sorted_up_to_current_pos) {
        echo "\$A looks like this now: " . json_encode($A) . 
        ", comparing [" . $A[$insert_point-1] . "," . $A[$insert_point] . "] (insertion step)<br>\n";
      // "Insertion Step", Swap the missmatching Neighbor pairs until A[0] ... A[current_pos] is sorted again
      if($comparator($A[$insert_point-1], $A[$insert_point]) > 0) {
          // Swap the missmatching Neighbor pair
          echo "swapping: \$A[" . ($insert_point-1) . "] and \$A[" . $insert_point . "]<br>\n";
          $tmp = $A[$insert_point];
          $A[$insert_point] = $A[$insert_point-1];
          $A[$insert_point-1] = $tmp;
          // goto new pair
          $insert_point = $insert_point -1;
        } else {
          // found correct spot, done
          $sorted_up_to_current_pos = true;
        }
      }
      $A[$insert_point] = $tmp;
      echo "\$A looks like this now: " . json_encode($A) . ", insertion done<br>\n";
    }
    $current_pos = $current_pos + 1;
  }
  echo "Sorted Array \$A is " . json_encode(array_slice($A, 0, $current_pos+1)) . "<br>\n";
}

function list_cmp($a, $b) {
  global $order;
  //echo "\$a=$a, \$b=$b </br>\n";

  foreach ($order as $key => $value) {
      //echo "\$value=$value </br>\n";
      if ($a == $value) {
          echo "\$a=\$value, returing 0. </br>\n";
          return 0;
      }
      if ($b == $value) {
          echo "\$b=\$value, returing 1. </br>\n";
          return 1;
      }
  }
}

$order[0] = 1;
$order[1] = 3;
$order[2] = 4;
$order[3] = 2;

$array[0] = 2;
$array[1] = 1;
$array[2] = 3;
$array[3] = 4;
$array[4] = 2;
$array[5] = 1;
$array[6] = 2;

my_usort($array, "list_cmp");

现在已完成输出,其中包含当前排序的数组,位置:

Sorted Subarray from $A is [2] 
$A looks like this now: [2,1,3,4,2,1,2], comparing [2,1] (verify step) 
$b=$value, returing 1.  
swapping: $A[0] and $A[1] 
$A looks like this now: [1,2,3,4,2,1,2], insertion done 
Sorted Subarray from $A is [1,2] 
$A looks like this now: [1,2,3,4,2,1,2], comparing [2,3] (verify step) 
$b=$value, returing 1.  
swapping: $A[1] and $A[2] 
$A looks like this now: [1,3,2,4,2,1,2], comparing [1,3] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,3,2,4,2,1,2], insertion done 
Sorted Subarray from $A is [1,3,2] 
$A looks like this now: [1,3,2,4,2,1,2], comparing [2,4] (verify step) 
$b=$value, returing 1.  
swapping: $A[2] and $A[3] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [3,4] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,3,4,2,2,1,2], insertion done 
Sorted Subarray from $A is [1,3,4,2] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [2,2] (verify step) 
$a=$value, returing 0.  
Sorted Subarray from $A is [1,3,4,2,2] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [2,1] (verify step) 
$b=$value, returing 1.  
swapping: $A[4] and $A[5] 
$A looks like this now: [1,3,4,2,1,2,2], comparing [2,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[3] and $A[4] 
$A looks like this now: [1,3,4,1,2,2,2], comparing [4,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[2] and $A[3] 
$A looks like this now: [1,3,1,4,2,2,2], comparing [3,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[1] and $A[2] 
$A looks like this now: [1,1,3,4,2,2,2], comparing [1,1] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,1,3,4,2,2,2], insertion done 
Sorted Subarray from $A is [1,1,3,4,2,2] 
$A looks like this now: [1,1,3,4,2,2,2], comparing [2,2] (verify step) 
$a=$value, returing 0.  
Sorted Array $A is [1,1,3,4,2,2,2] 

I have an usort() example and I added some echo statements to see how the code works:

<?php
function list_cmp($a, $b) {
    global $order;
    echo "\$a=$a, \$b=$b </br>";

    foreach ($order as $key => $value) {
        echo "\$value=$value </br>";
        if ($a == $value) {
            echo "\$a=\$value, returing 0. </br>";
            return 0;
        }
        if ($b == $value) {
            echo "\$b=\$value, returing 1. </br>";
            return 1;
        }
    }
}

$order[0] = 1;
$order[1] = 3;
$order[2] = 4;
$order[3] = 2;

$array[0] = 2;
$array[1] = 1;
$array[2] = 3;
$array[3] = 4;
$array[4] = 2;
array[5] = 1;
$array[6] = 2;

usort($array, "list_cmp");
?>

The output of the code is this:

$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=2, $b=3 
$value=1 
$value=3 
$b=$value, returing 1. 
$a=1, $b=3 
$value=1 
$a=$value, returing 0. 
$a=2, $b=4 
$value=1 
$value=3 
$value=4 
$b=$value, returing 1. 
$a=3, $b=4 
$value=1 
$value=3 
$a=$value, returing 0. 
$a=2, $b=2 
$value=1 
$value=3 
$value=4 
$value=2 
$a=$value, returing 0. 
$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=2, $b=1 
$value=1 
$b=$value, returing 1. 
$a=4, $b=1 
$value=1 
$b=$value, returing 1. 
$a=3, $b=1 
$value=1 
$b=$value, returing 1. 
$a=1, $b=1 
$value=1 
$a=$value, returing 0. 
$a=2, $b=2 
$value=1 
$value=3 
$value=4 
$value=2 
$a=$value, returing 0. 

What is the mechanism of creating the 12 $a-$b pairs - 2-1, 2-3, 1-3, 2-4, 3-4, 2-2, 2-1, 2-1 (the same again?), 4-1, 3-1, 1-1, 2-2. The above code returns 1,1,0,1,0,0,1,1,1,1,0,0. And also what is the mechanism of sorting the array, based on the values that get returned? I am trying to understand how the usort() mechanism works.

Thanks.

解决方案

  1. How does the comparator work?

I'm not sure wether this was part of the question, but to be clear about how the comparator works: You have an order specified by ordered list $order and a special comparator callback list_cmp which (should) return wether argument

  • $a is smaller than $b (return -1 or a value < 0)
  • $a greater than $b (return 1 or a value > 0)
  • $a equal $b (return 0)

list_cmp does this by looking up its order table and rather checking if

  • $a has a smaller (or equal) order in which case the loop exits early with return 0 or if
  • $b has a smaller order in which case the loop exits early with return 1.

Be aware that this is wrong according to the PHP Documentation which states it wants positive/negative/0 as return values. This is only correct if you know that the internals only check for comparator($a,$b) > 0, meaning it only checks if $b is smaller than and not equal to $a, making this a comparison order of $a <= order of $b. It can easily break if the code starts to check for $a being smaller than and not equal to $b.

  1. How does the quicksort sorting work?

For starters I'm assuming you are using PHP 7 or higher. In that case you hit a special case with 6-15 Element sized arrays. PHP 7+ does not seem to use quicksort for short lists, instead it uses an Insertion Sort Variant (which is most probably faster due to Hardware related stuff like caching/code locality). You can check the sorting source code f.e. on the Github PHP Mirror (as an example: PHP 7.0 Zend/Zend_sort.c Line 177-198).

The code works in 3 steps:

  1. comparision: compare neighbor elements array[j] and array[j+1], if array[j] <= array[j+1] move on, else goto 2.
  2. find insertion point: now if array[j] > array[j+1], scan backwards to find the point where array[x] < array[j+1] <= array[x+1] for x < j (obviously only until x hits the start)
  3. insert: shift elements x+1 ... j one up such that it becomes x+2 ... j+1 and insert former element at position x+1

If you apply that code to the pairings (2-1, 2-3, 1-3, 2-4, 3-4, 2-2, 2-1, 2-1, 4-1, 3-1, 1-1, 2-2) it gets obvious what the code does.

-- [2,1],3,4,2,1,2 -> 1./2./3. compare [2,1], find and insert 1 before 2
-- 1,[2,3],4,2,1,2 -> 1./2. compare [2,3], find insert point for 3 (since order of 3 < order of 2)
-- [1,3],2,4,2,1,2 -> 3. compare [1,3], found insert point for 3 before 2
-- 1,3,[2,4],2,1,2 -> 1./2. compare [2,4], find insert point for 4 (since order of 4 < order of 2) 
-- 1,[3,4],2,2,1,2 -> 3. compare [3,4], found insert point for 4 before 2
-- 1,3,4,[2,2],1,2 -> 1. compare [2,2], skip
-- 1,3,4,2,[2,1],2 -> 1./2. compare [2,1], find insert point for 1
-- 1,3,4,[2,1],2,2 -> 2. compare [2,1], find insert point for 1
-- 1,3,[4,1],2,2,2 -> 2. compare [4,1], find insert point for 1
-- 1,[3,1],4,2,2,2 -> 2. compare [3,1], find insert point for 1
-- [1,1],3,4,2,2,2 -> 3. compare [1,1], fond insert point for 1 before 3
-- 1,1,3,4,2,[2,2] -> 1. compare [2,2], skip
-- sorted: 1,1,3,4,2,2,2

PS: Here you already see that it is quite complex to derive the work of even a simple sorting algorithm (22 lines of code) by its comparison patterns. The PHP 7 quicksort implementation is around 10 times as big in lines of code and has some freaky optimizations (besides normal madness due to pivot selection and recursions).

Most of the times it is better to ignore in depth implementation details and only reduce it to the stuff that is needed. Typical questions for a sorting algorithm would be if it is stable/unstable and performing in O(log n) with O(n) Memory consumption. There are easier ways to learn the core algorithms behind those optimized implementations like the Quicksort Dance or any other visualization or a good old (e)book or webpage with examples.

-- Edited

Added a (bad, unoptimized, unsafe) php implementation for insertion sort for another visualization of how it works:

<?php

function my_usort($A, $comparator) {
  // Start .. End Positions
  $current_pos = 0;
  $last_pos = count($A)-1;
  // Outer Loop: each step checks that A[0] up to A[current_pos] is sorted. 
  // When the algorithm finishes we know that A[0] ... A[last_pos] is sorted
  while($current_pos < $last_pos) {
    echo "Sorted Subarray from \$A is " . json_encode(array_slice($A, 0, $current_pos+1)) . "<br>\n";
    echo "\$A looks like this now: " . json_encode($A) . 
    ", comparing [" . $A[$current_pos] . "," . $A[$current_pos +1] . "] (verify step)<br>\n";
    // "Verification Step"
    // At this point A[0] ... A[current_pos] is sorted.
    // Check A[current_pos] <= A[current_pos +1]
    if($comparator($A[$current_pos], $A[$current_pos +1]) > 0) {
      // nope, A[current_pos] > A[current_pos +1] (list_cmp/comparator returns value > 0)
      // "Insertion Step" start, find the correct position for A[current_pos+1] in the already
      // sorted list A[0] ... A[current_pos]
      $insert_point = $current_pos;
      // Swap the missmatching Neighbor pair
      echo "swapping: \$A[" . $insert_point . "] and \$A[" . ($insert_point+1) . "]<br>\n";
      $tmp = $A[$insert_point +1];
      $A[$insert_point +1] = $A[$insert_point];
      $A[$insert_point] = $tmp;
      $sorted_up_to_current_pos = false;
      // Inner Loop: find correct insertion point
      while($insert_point > 0 && !$sorted_up_to_current_pos) {
        echo "\$A looks like this now: " . json_encode($A) . 
        ", comparing [" . $A[$insert_point-1] . "," . $A[$insert_point] . "] (insertion step)<br>\n";
      // "Insertion Step", Swap the missmatching Neighbor pairs until A[0] ... A[current_pos] is sorted again
      if($comparator($A[$insert_point-1], $A[$insert_point]) > 0) {
          // Swap the missmatching Neighbor pair
          echo "swapping: \$A[" . ($insert_point-1) . "] and \$A[" . $insert_point . "]<br>\n";
          $tmp = $A[$insert_point];
          $A[$insert_point] = $A[$insert_point-1];
          $A[$insert_point-1] = $tmp;
          // goto new pair
          $insert_point = $insert_point -1;
        } else {
          // found correct spot, done
          $sorted_up_to_current_pos = true;
        }
      }
      $A[$insert_point] = $tmp;
      echo "\$A looks like this now: " . json_encode($A) . ", insertion done<br>\n";
    }
    $current_pos = $current_pos + 1;
  }
  echo "Sorted Array \$A is " . json_encode(array_slice($A, 0, $current_pos+1)) . "<br>\n";
}

function list_cmp($a, $b) {
  global $order;
  //echo "\$a=$a, \$b=$b </br>\n";

  foreach ($order as $key => $value) {
      //echo "\$value=$value </br>\n";
      if ($a == $value) {
          echo "\$a=\$value, returing 0. </br>\n";
          return 0;
      }
      if ($b == $value) {
          echo "\$b=\$value, returing 1. </br>\n";
          return 1;
      }
  }
}

$order[0] = 1;
$order[1] = 3;
$order[2] = 4;
$order[3] = 2;

$array[0] = 2;
$array[1] = 1;
$array[2] = 3;
$array[3] = 4;
$array[4] = 2;
$array[5] = 1;
$array[6] = 2;

my_usort($array, "list_cmp");

Output is complete now with currently sorted array, positions:

Sorted Subarray from $A is [2] 
$A looks like this now: [2,1,3,4,2,1,2], comparing [2,1] (verify step) 
$b=$value, returing 1.  
swapping: $A[0] and $A[1] 
$A looks like this now: [1,2,3,4,2,1,2], insertion done 
Sorted Subarray from $A is [1,2] 
$A looks like this now: [1,2,3,4,2,1,2], comparing [2,3] (verify step) 
$b=$value, returing 1.  
swapping: $A[1] and $A[2] 
$A looks like this now: [1,3,2,4,2,1,2], comparing [1,3] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,3,2,4,2,1,2], insertion done 
Sorted Subarray from $A is [1,3,2] 
$A looks like this now: [1,3,2,4,2,1,2], comparing [2,4] (verify step) 
$b=$value, returing 1.  
swapping: $A[2] and $A[3] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [3,4] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,3,4,2,2,1,2], insertion done 
Sorted Subarray from $A is [1,3,4,2] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [2,2] (verify step) 
$a=$value, returing 0.  
Sorted Subarray from $A is [1,3,4,2,2] 
$A looks like this now: [1,3,4,2,2,1,2], comparing [2,1] (verify step) 
$b=$value, returing 1.  
swapping: $A[4] and $A[5] 
$A looks like this now: [1,3,4,2,1,2,2], comparing [2,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[3] and $A[4] 
$A looks like this now: [1,3,4,1,2,2,2], comparing [4,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[2] and $A[3] 
$A looks like this now: [1,3,1,4,2,2,2], comparing [3,1] (insertion step) 
$b=$value, returing 1.  
swapping: $A[1] and $A[2] 
$A looks like this now: [1,1,3,4,2,2,2], comparing [1,1] (insertion step) 
$a=$value, returing 0.  
$A looks like this now: [1,1,3,4,2,2,2], insertion done 
Sorted Subarray from $A is [1,1,3,4,2,2] 
$A looks like this now: [1,1,3,4,2,2,2], comparing [2,2] (verify step) 
$a=$value, returing 0.  
Sorted Array $A is [1,1,3,4,2,2,2] 

这篇关于usort()排序算法如何工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆