Algorithm for removing duplicate elements in an array without auxiliary storage

Problem description

I am working on the well-known interview question of removing duplicate elements from an array without using auxiliary storage, while preserving the original order.

I have read a number of posts, including "Algorithm: efficient way to remove duplicate integers from an array" (http://stackoverflow.com/questions/1532819/algorithm-efficient-way-to-remove-duplicate-integers-from-an-array) and "Removing Duplicates from an Array using C" (http://stackoverflow.com/questions/9613960/removing-duplicates-from-an-array-using-c?lq=1).

The solutions there are either implemented in C (without explanation), or the Java code provided fails when there are consecutive duplicates such as [1,1,1,3,3].

I am not very confident with C; my background is Java. So I implemented the code myself. It works like this:

  1. Use two loops: the outer loop traverses the array and the inner loop checks for duplicates; each duplicate found is replaced with null.
  2. Then I go over the array with the duplicates nulled out and compact it, moving each non-null element forward over the null slots.
  3. The total run time is O(n^2) + O(n) ~ O(n^2). From the posts above, I understand this is the best we can do if neither sorting nor auxiliary storage is allowed.

My code is below. I am looking for ways to optimize it further (if that is possible), or for better or simpler logic:

    import java.util.Arrays;

    public class RemoveDup {
        public static void main(String[] args) {
            Integer[] arr2 = {3, 45, 1, 2, 3, 3, 3, 3, 2, 1, 45, 2, 10};
            Integer[] res = removeDup(arr2);
            System.out.println(Arrays.toString(res));
        }

        private static Integer[] removeDup(Integer[] data) {
            int size = data.length;
            int count = 1; // data[0] is always kept
            // Pass 1: null out every later duplicate of each surviving element.
            for (int i = 0; i < size; i++) {
                Integer temp = data[i];
                for (int j = i + 1; j < size && temp != null; j++) {
                    // equals() compares values; == on Integer compares references
                    if (temp.equals(data[j])) {
                        data[j] = null;
                    }
                }
            }
            // Pass 2: compact the non-null elements to the front.
            for (int i = 1; i < size; i++) {
                Integer current = data[i];
                if (current != null) {
                    data[count++] = current;
                }
            }
            return Arrays.copyOf(data, count);
        }
    }

EDIT 1: The reformatted code from @keshlam throws an ArrayIndexOutOfBoundsException:

private static int removeDupes(int[] array) {
    System.out.println("method called");
    if (array.length < 2)
        return array.length;

    int outsize = 1; // first is always kept

    for (int consider = 1; consider < array.length; ++consider) {

        for (int compare = 0; compare < outsize; ++compare) {
            if (array[consider] != array[compare])
                array[outsize++] = array[consider]; // already present; advance to next compare
            else break;
            // if we get here, we know it's new so append it to output
            //array[outsize++]=array[consider]; // could test first, not worth it.
        }

    }
    System.out.println(Arrays.toString(array));
    // length is last written position plus 1
    return outsize;
}

Solution

OK, here's my answer, which should be O(N*N) in the worst case. (With a smaller constant, since even in the worst case I'm testing N values against, on average, N/2 kept values; but this is computer science rather than software engineering, and a mere 2X speedup isn't significant. Thanks to @Alexandru for pointing that out.)
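To make that constant concrete: in the worst case all N values are distinct, so the element at index k has to be compared against all k values already kept before it can be appended. Summing over the input gives the total number of comparisons:

```latex
\sum_{k=1}^{N-1} k \;=\; \frac{N(N-1)}{2} \;\approx\; \frac{N^{2}}{2} \;\in\; O(N^{2})
```

In the best case (every value equal to the first) each element is rejected after a single comparison, for N-1 comparisons in total.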

1) Split cursors (input and output are advanced separately).

2) Each new value only has to be compared to what has already been kept, and the comparison can stop as soon as a match is found. (The hint keyword was "incremental".)

3) The first element need not be tested.

4) I'm taking advantage of labelled continue where I could instead have set a flag before breaking and then tested the flag. It comes out to the same thing; this is a bit more elegant.

4.5) I could have tested whether outsize==consider and not copied if that was true. But testing for it would take about as many cycles as doing the possibly-unnecessary copy, and the majority case is that they will not be the same, so it's easier to just let a possibly redundant copy take place.

5) I'm not recopying the data in the key function; I've factored out the copy-for-printing operation to a separate function to make clear that removeDupes does run entirely in the target array plus a few automatic variables on the stack. And I'm not spending time zeroing out the leftover elements at the end of the array; that may be wasted work (as in this case). Though I don't think it would actually change the formal complexity.

import java.util.Arrays;

public class RemoveDupes {

  private static int removeDupes(final int[] array) {
    if(array.length < 2)
      return array.length;

    int outsize=1; // first is always kept

    outerloop: for (int consider = 1; consider < array.length; ++consider) {

      for(int compare=0;compare<outsize;++compare)
        if(array[consider]==array[compare])
          continue outerloop; // already present; advance to next candidate

      // if we get here, we know it's new so append it to output
      array[outsize++]=array[consider]; // could test first, not worth it. 
    }

    return outsize; // length is last written position plus 1
  }

  private static void printRemoveDupes(int[] array) {
    int newlength=removeDupes(array);
    System.out.println(Arrays.toString(Arrays.copyOfRange(array, 0, newlength)));
  }

  public static void main(final String[] args) {
    printRemoveDupes(new int[] { 3, 45, 1, 2, 3, 3, 3, 3, 2, 1, 45, 2, 10 });
    printRemoveDupes(new int[] { 2, 2, 3, 3 });
    printRemoveDupes(new int[] { 1, 1, 1, 1, 1, 1, 1, 1 });
  }
}
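
For comparison, here is a sketch of what the two things deliberately left out in points 4.5 and 5 might look like: skipping the copy when `outsize == consider`, and zeroing the unused tail of the array. The class name `RemoveDupesVariant` and the method name `removeDupesAndClear` are hypothetical and used only for illustration; the algorithm is otherwise the same as above.

```java
import java.util.Arrays;

public class RemoveDupesVariant {

  // Hypothetical variant of removeDupes: skips the redundant self-copy
  // (point 4.5) and zeroes the leftover tail of the array (point 5).
  static int removeDupesAndClear(final int[] array) {
    if (array.length < 2)
      return array.length;

    int outsize = 1; // first is always kept

    outerloop: for (int consider = 1; consider < array.length; ++consider) {
      for (int compare = 0; compare < outsize; ++compare)
        if (array[consider] == array[compare])
          continue outerloop; // already present; advance to next candidate

      // Only copy when input and output cursors have diverged.
      if (outsize != consider)
        array[outsize] = array[consider];
      ++outsize;
    }

    // Clear whatever is left after the kept prefix; O(N) extra work.
    Arrays.fill(array, outsize, array.length, 0);
    return outsize;
  }

  public static void main(final String[] args) {
    int[] data = { 3, 45, 1, 2, 3, 3, 3, 3, 2, 1, 45, 2, 10 };
    int newLength = removeDupesAndClear(data);
    System.out.println(newLength + " " + Arrays.toString(data));
  }
}
```

Both additions are linear at most, so the overall cost is still dominated by the O(N*N) comparison loop.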

LATE ADDITION: Since folks expressed confusion about point 4 in my explanation, here's the loop rewritten without labelled continue:

for (int consider = 1; consider < array.length; ++consider) {
  boolean matchFound=false;

  for(int compare=0;compare<outsize;++compare) {
    if(array[consider]==array[compare]) {
      matchFound=true;
      break;
    }
  }

  if(!matchFound) // only add it to the output if not found
    array[outsize++]=array[consider];
}

Hope that helps. Labelled continue is a rarely-used feature of Java, so it isn't too surprising that some folks haven't seen it before. It's useful, but it does make code harder to read; I probably wouldn't use it in anything much more complicated than this simple algorithm.
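
Since the question worked with `Integer[]` rather than `int[]`, here is a sketch of how the same approach might be adapted to object arrays. The class name `RemoveDupesGeneric` is hypothetical. The one substantive change is comparing with `Objects.equals` instead of `==`, because `==` on boxed values compares references, and boxed integers outside the range -128 to 127 are not guaranteed to be the same object.

```java
import java.util.Arrays;
import java.util.Objects;

public class RemoveDupesGeneric {

  // Same in-place algorithm, for any element type with a sensible equals().
  static <T> int removeDupes(final T[] array) {
    if (array.length < 2)
      return array.length;

    int outsize = 1; // first is always kept

    outerloop: for (int consider = 1; consider < array.length; ++consider) {
      for (int compare = 0; compare < outsize; ++compare)
        if (Objects.equals(array[consider], array[compare])) // null-safe value comparison
          continue outerloop; // already present; advance to next candidate

      array[outsize++] = array[consider];
    }

    return outsize; // length of the de-duplicated prefix
  }

  public static void main(final String[] args) {
    Integer[] data = { 1000, 45, 1000, 2, 45, 1000 };
    int newLength = removeDupes(data);
    System.out.println(Arrays.toString(Arrays.copyOf(data, newLength))); // prints [1000, 45, 2]
  }
}
```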
