Sorting an array while moving duplicates to the end?


Problem description

This was a question in one of my friend's programming classes.

Q. How do you sort an array of ints and then arrange them such that all duplicate elements appear at the end of the array?

For example, given the input

{5, 2, 7, 6, 1, 1, 5, 6, 2}

The output would be

{1, 2, 5, 6, 7, 1, 2, 5, 6}

Note that the numbers are sorted and duplicate numbers are after 7, which is the maximum in the array.

This has to be achieved without using any Java library packages/utils.

I suggested sorting the array first using insertion sort or bubble sort, and then going over the array and doing something like the following:

for (int i = 0; i < nums.length - 2; i++) {
    for (int j = i + 1; j < nums.length; j++) {
        //current and next are same, move elements up
        //and place the next number at the end.
        if (nums[i] == nums[j]) {
            int temp = nums[j];
            for (int k = j; k < nums.length - 1; k++) {
                nums[k] = nums[k + 1];
            }
            nums[nums.length - 1] = temp;
            break;
        }
    }
}

I tried this myself later (which is how I arrived at the code above). Having tried it, I think this could be achieved with less code and more efficiently, and maybe I gave the wrong advice.

Any thoughts?

Solution

Depending on the parameters of your problem, there are many approaches to solving this.

If you are not allowed to use O(n) external memory, then one option would be to use a standard sorting algorithm to sort the array in-place in O(n log n) time, then to run a second pass over it to move the duplicates to the end (as you've suggested). The code you posted above takes O(n^2) time, but I think that this step can be done in O(n log n) time using a slightly more complicated algorithm. The idea works in two steps. In the first step, in O(n log n) time you bring one copy of each distinct value to the front in sorted order and move the remaining duplicates to the back in unsorted order. Once you've done that, you then sort the back portion of the array in O(n log n) time using the sorting algorithm from the first step.

I'm not going to go into the code to sort the array. I really love sorting, but there are so many other good resources on how to sort arrays in-place that it's not a good use of my time/space here to go into them. If it helps, here are links to Java implementations of heapsort, quicksort, and smoothsort, all of which run in O(n log n) time. Heapsort and smoothsort use only O(1) external memory, while quicksort can use O(n) in the worst case (though good implementations can limit this to O(log n) using cute tricks).

The interesting code is the logic to bring all the non-duplicated elements to the front of the range. Intuitively, the code works by storing two pointers - a read pointer and a write pointer. The read pointer points to the next element to read, while the write pointer points to the location where the next unique element should be placed. For example, given this array:

1 1 1 1 2 2 3 4 5 5

We start with the read and write pointers initially pointing at 1:

write  v
       1 1 1 1 2 2 3 4 5 5
read   ^

Next, we skip the read pointer ahead to the next element that isn't 1. This finds 2:

write  v
       1 1 1 1 2 2 3 4 5 5
read           ^

Then, we bump the write pointer to the next location:

write    v
       1 1 1 1 2 2 3 4 5 5
read           ^

Now, we swap the 2 into the spot held by the write pointer:

write    v
       1 2 1 1 1 2 3 4 5 5
read           ^

advance the read pointer to the next value that isn't 2:

write    v
       1 2 1 1 1 2 3 4 5 5
read               ^

then advance the write pointer:

write      v
       1 2 1 1 1 2 3 4 5 5
read               ^

Again, we exchange the values pointed at by 'read' and 'write' and move the write pointer forward, then move the read pointer to the next unique value:

write        v
       1 2 3 1 1 2 1 4 5 5
read                 ^

Once more yields

write          v
       1 2 3 4 1 2 1 1 5 5
read                   ^

and the final iteration gives

write            v
       1 2 3 4 5 2 1 1 1 5
read                      ^

If we now sort from the write pointer to the read pointer, we get

write            v
       1 2 3 4 5 1 1 1 2 5
read                      ^

and bingo! We've got the answer we're looking for.

In (untested, sorry...) Java code, this fixup step might look like this:

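int read = 0;
int write = 0;

while (read < array.length) {
     /* Swap the values pointed at by read and write. */
     int temp = array[write];
     array[write] = array[read];
     array[read] = temp;

     /* Step past the slot we just swapped out of.  It now holds a displaced
      * duplicate of an earlier value, not the unique value we just moved.
      */
     ++read;

     /* Advance the read pointer forward to the next unique value.  Since we
      * moved the unique value to the write location, we compare values
      * against array[write].
      */
     while (read < array.length && array[read] == array[write])
         ++read;

     /* Advance the write pointer. */
     ++write;
}

/* Here, write is the number of distinct values, and the duplicates occupy
 * array[write] through the end of the array, in unsorted order.
 */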

This algorithm runs in O(n) time, which leads to an overall O(n log n) algorithm for the problem. Since the reordering step uses O(1) memory, the overall memory usage would be either O(1) (for something like smoothsort or heapsort) or O(log n) (for something like quicksort).
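
To tie the pieces together, here is a minimal end-to-end sketch of the sort / fixup / sort-the-tail approach that avoids library classes. The class and method names (`DuplicatesToEnd`, `heapSort`, `moveUniquesToFront`, `sortDuplicatesLast`) are made up for illustration, and heapsort is just one possible choice for the O(n log n), O(1)-space sort; treat it as a sketch rather than a tuned implementation.

```java
final class DuplicatesToEnd {
    /** Sorts a[lo, hi) in place with heapsort: O(n log n) time, O(1) extra space. */
    static void heapSort(int[] a, int lo, int hi) {
        int n = hi - lo;
        // Build a max-heap over the range.
        for (int i = n / 2 - 1; i >= 0; i--) siftDown(a, lo, i, n);
        // Repeatedly move the current maximum to the end of the shrinking heap.
        for (int end = n - 1; end > 0; end--) {
            swap(a, lo, lo + end);
            siftDown(a, lo, 0, end);
        }
    }

    private static void siftDown(int[] a, int lo, int root, int n) {
        while (true) {
            int child = 2 * root + 1;
            if (child >= n) return;
            if (child + 1 < n && a[lo + child + 1] > a[lo + child]) child++;
            if (a[lo + root] >= a[lo + child]) return;
            swap(a, lo + root, lo + child);
            root = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    /**
     * Expects a sorted array. Moves one copy of each distinct value to the
     * front (still sorted) and returns how many distinct values there are.
     */
    static int moveUniquesToFront(int[] a) {
        int read = 0;
        int write = 0;
        while (read < a.length) {
            swap(a, write, read);
            read++; // the slot we swapped out of now holds a displaced duplicate
            while (read < a.length && a[read] == a[write]) read++;
            write++;
        }
        return write;
    }

    /** Sort, pull the distinct values forward, then sort the leftover duplicates. */
    static void sortDuplicatesLast(int[] a) {
        heapSort(a, 0, a.length);
        int uniques = moveUniquesToFront(a);
        heapSort(a, uniques, a.length);
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 7, 6, 1, 1, 5, 6, 2};
        sortDuplicatesLast(a);
        StringBuilder sb = new StringBuilder();
        for (int x : a) sb.append(x).append(' ');
        System.out.println(sb.toString().trim()); // prints "1 2 5 6 7 1 2 5 6"
    }
}
```

The fixup pass touches each element a constant number of times, so the two heapsort calls dominate the running time.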

EDIT: After talking this over with a friend, I think that there is a much more elegant solution to the problem based on a modification of quicksort. Typically, when you run quicksort, you end up partitioning the array into three regions:

 +----------------+----------------+----------------+
 | values < pivot | values = pivot | values > pivot |
 +----------------+----------------+----------------+

The recursion then sorts the first and last regions to put them into sorted order. However, we can modify this for our version of the problem. We'll need as a primitive the rotation algorithm, which takes two adjacent blocks of values in an array and exchanges them in O(n) time. It does not change the relative order of the elements in those blocks. For example, we could use rotation to convert the array

1 2 3 4 5 6 7 8

into

3 4 5 6 7 8 1 2

and can do so in O(n) time.
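
One well-known way to get such a rotation in O(n) time and O(1) space is the three-reversal trick: reverse each block, then reverse the whole range. The sketch below assumes that technique (the answer doesn't say which rotation algorithm it has in mind), and the names `BlockRotation`, `reverse` and `rotate` are purely illustrative.

```java
final class BlockRotation {
    /** Reverses a[lo, hi) in place. */
    static void reverse(int[] a, int lo, int hi) {
        for (int i = lo, j = hi - 1; i < j; i++, j--) {
            int t = a[i];
            a[i] = a[j];
            a[j] = t;
        }
    }

    /**
     * Exchanges the adjacent blocks a[lo, mid) and a[mid, hi).
     * The elements inside each block keep their relative order.
     */
    static void rotate(int[] a, int lo, int mid, int hi) {
        reverse(a, lo, mid);
        reverse(a, mid, hi);
        reverse(a, lo, hi);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        rotate(a, 0, 2, a.length); // swap the block {1, 2} with {3, ..., 8}
        StringBuilder sb = new StringBuilder();
        for (int x : a) sb.append(x).append(' ');
        System.out.println(sb.toString().trim()); // prints "3 4 5 6 7 8 1 2"
    }
}
```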

The modified version of quicksort would work by using the Bentley-McIlroy three-way partition algorithm (described here) to, using O(1) extra space, rearrange the array elements into the configuration shown above. Next, we apply a rotation to reorder the elements so that they look like this:

 +----------------+----------------+----------------+
 | values < pivot | values > pivot | values = pivot |
 +----------------+----------------+----------------+

Next, we perform a swap so that we move exactly one copy of the pivot element into the set of elements at least as large as the pivot. This may leave extra copies of the pivot behind. We then recursively apply the sorting algorithm to the < and > ranges. When we do this, the resulting array will look like this:

 +---------+-------------+---------+-------------+---------+
 | < pivot | dup < pivot | > pivot | dup > pivot | = pivot |
 +---------+-------------+---------+-------------+---------+

We then apply two rotations to the range to put it into the final order. First, rotate the duplicate values less than the pivot with the values greater than the pivot. This gives

 +---------+---------+-------------+-------------+---------+
 | < pivot | > pivot | dup < pivot | dup > pivot | = pivot |
 +---------+---------+-------------+-------------+---------+

At this point, this first range is the unique elements in ascending order:

 +---------------------+-------------+-------------+---------+
 | sorted unique elems | dup < pivot | dup > pivot | = pivot |
 +---------------------+-------------+-------------+---------+

Finally, do one last rotation of the duplicate elements greater than the pivot and the elements equal to the pivot to yield this:

 +---------------------+-------------+---------+-------------+
 | sorted unique elems | dup < pivot | = pivot | dup > pivot |
 +---------------------+-------------+---------+-------------+

Notice that these last three blocks are just the sorted duplicate values:

 +---------------------+-------------------------------------+
 | sorted unique elems |      sorted duplicate elements      |
 +---------------------+-------------------------------------+

and voila! We've got everything in the order we want. Using the same analysis that you'd do for normal quicksort, plus the fact that we're only doing O(n) work at each level (three extra rotations), this works out to O(n log n) in the best case with O(log n) memory usage. It's still O(n^2) in the worst case with O(log n) memory, but that happens with extremely low probability.
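
For what it's worth, here is a rough sketch of the idea in Java. It is a simplified variant, not a faithful transcription of the steps above: it uses a plain Dutch-national-flag three-way partition instead of Bentley-McIlroy, always picks the middle element as the pivot, leaves the pivot copies in the middle and fixes things up with two rotations after the recursive calls, and recurses on both sides without any stack-depth tricks (so the worst-case stack use is O(n), not O(log n)). All names are hypothetical.

```java
final class QuicksortDuplicatesLast {
    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    private static void reverse(int[] a, int lo, int hi) {
        for (int i = lo, j = hi - 1; i < j; i++, j--) swap(a, i, j);
    }

    /** Exchanges adjacent blocks a[lo, mid) and a[mid, hi), preserving order inside each. */
    private static void rotate(int[] a, int lo, int mid, int hi) {
        reverse(a, lo, mid);
        reverse(a, mid, hi);
        reverse(a, lo, hi);
    }

    /**
     * Rearranges a[lo, hi) so the distinct values come first in ascending
     * order, followed by the leftover duplicates in ascending order.
     * Returns the number of distinct values in the range.
     */
    static int sort(int[] a, int lo, int hi) {
        if (hi - lo <= 1) return hi - lo;
        int pivot = a[lo + (hi - lo) / 2];

        // Three-way partition: [lo, lt) < pivot, [lt, gt) == pivot, [gt, hi) > pivot.
        int lt = lo, i = lo, gt = hi;
        while (i < gt) {
            if (a[i] < pivot) swap(a, lt++, i++);
            else if (a[i] > pivot) swap(a, i, --gt);
            else i++;
        }

        int uLess = sort(a, lo, lt);
        int uGreater = sort(a, gt, hi);

        // Layout: uniq< | dup< | pivot | extra pivots | uniq> | dup>
        // Exchange dup< with a single pivot copy.
        rotate(a, lo + uLess, lt, lt + 1);
        // Layout: uniq< | pivot | dup< | extra pivots | uniq> | dup>
        // Exchange (dup< + extra pivots) with uniq>.
        rotate(a, lo + uLess + 1, gt, gt + uGreater);
        // Layout: uniq< | pivot | uniq> | dup< | extra pivots | dup>
        return uLess + 1 + uGreater;
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 7, 6, 1, 1, 5, 6, 2};
        int distinct = sort(a, 0, a.length);
        StringBuilder sb = new StringBuilder();
        for (int x : a) sb.append(x).append(' ');
        System.out.println(distinct + " distinct: " + sb.toString().trim());
    }
}
```

Returning the number of distinct values from each recursive call is what lets the caller know where the block boundaries are without rescanning the range.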

If you are allowed to use O(n) memory, one option would be to build a balanced binary search tree out of all of the elements that stores key/value pairs, where each key is an element of the array and the value is the number of times it appears. You could then sort the array in your format as follows:

  1. For each element in the array:
    • If that element already exists in the BST, increment its count.
    • Otherwise, add a new node to the BST with that element having count 1.
  2. Do an inorder walk of the BST. When encountering a node, output its key.
  3. Do a second inorder walk of the BST. When encountering a node, if it has count greater than one, output n - 1 copies of that node's key, where n is the number of times it appears.

The runtime of this algorithm is O(n log n), but it would be pretty tricky to code up a BST from scratch. It also requires external space, which I'm not sure you're allowed to do.
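
If the no-library restriction were lifted (an assumption, since the question forbids it), the three steps above could be sketched with `java.util.TreeMap`, which is a red-black tree and so plays the role of the balanced BST of key/count pairs. The class and method names here are hypothetical; a hand-written BST would follow the same shape.

```java
import java.util.Map;
import java.util.TreeMap;

final class BstDuplicatesLast {
    /** Returns a new array: distinct values ascending, then leftover duplicates ascending. */
    static int[] sortDuplicatesLast(int[] input) {
        // Step 1: count occurrences; TreeMap keeps its keys in sorted order.
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (int x : input) {
            counts.merge(x, 1, Integer::sum);
        }

        int[] out = new int[input.length];
        int pos = 0;

        // Step 2: first in-order walk, output every key once.
        for (int key : counts.keySet()) {
            out[pos++] = key;
        }

        // Step 3: second in-order walk, output count - 1 extra copies of each key.
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            for (int c = 1; c < e.getValue(); c++) {
                out[pos++] = e.getKey();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] result = sortDuplicatesLast(new int[]{5, 2, 7, 6, 1, 1, 5, 6, 2});
        StringBuilder sb = new StringBuilder();
        for (int x : result) sb.append(x).append(' ');
        System.out.println(sb.toString().trim()); // prints "1 2 5 6 7 1 2 5 6"
    }
}
```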

However, if you are allowed external space and the arrays you are sorting are small and contain small integers, you could modify the above approach by using a modified counting sort. Just replace the BST with an array large enough for each integer in the original array to be a key. This reduces the runtime to O(n + k), with memory usage O(k), where k is the largest element in the array.
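
A minimal sketch of that counting-sort variant might look like the following. It assumes every value is a small non-negative int (negative values are rejected), and the names are invented for illustration. The result is written back into the input array, so the only extra memory is the O(k) count table.

```java
final class CountingDuplicatesLast {
    /** Rewrites a in place: distinct values ascending, then leftover duplicates ascending. */
    static void sortDuplicatesLast(int[] a) {
        // Find k, the largest value, so we know how big the count table must be.
        int max = 0;
        for (int x : a) {
            if (x < 0) throw new IllegalArgumentException("negative value: " + x);
            if (x > max) max = x;
        }

        // counts[v] = number of times v appears; this replaces the BST.
        int[] counts = new int[max + 1];
        for (int x : a) counts[x]++;

        int pos = 0;
        // First pass: each value that occurs at all, once.
        for (int v = 0; v <= max; v++) {
            if (counts[v] > 0) a[pos++] = v;
        }
        // Second pass: the remaining count - 1 copies of each value.
        for (int v = 0; v <= max; v++) {
            for (int c = 1; c < counts[v]; c++) a[pos++] = v;
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 7, 6, 1, 1, 5, 6, 2};
        sortDuplicatesLast(a);
        StringBuilder sb = new StringBuilder();
        for (int x : a) sb.append(x).append(' ');
        System.out.println(sb.toString().trim()); // prints "1 2 5 6 7 1 2 5 6"
    }
}
```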

Hope this helps!
