How to efficiently remove duplicates from an array without using Set


Problem description

I was asked to write my own implementation to remove duplicate values from an array. Here is what I have created. But when I tested it with 1,000,000 elements, it took a very long time to finish. Is there something I can do to improve my algorithm, or are there any bugs to fix?

I need to write my own implementation, without using Set, HashSet, etc., or any other tools such as iterators. Simply an array to remove duplicates.

public static int[] removeDuplicates(int[] arr) {

    int end = arr.length;

    for (int i = 0; i < end; i++) {
        for (int j = i + 1; j < end; j++) {
            if (arr[i] == arr[j]) {                  
                int shiftLeft = j;
                for (int k = j+1; k < end; k++, shiftLeft++) {
                    arr[shiftLeft] = arr[k];
                }
                end--;
                j--;
            }
        }
    }

    int[] whitelist = new int[end];
    for(int i = 0; i < end; i++){
        whitelist[i] = arr[i];
    }
    return whitelist;
}

Solution

Since this question is still getting a lot of attention, I decided to answer it by copying the answer from Code Review.

You're following the same philosophy as bubble sort, which is very, very, very slow. Have you tried this?

  • Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know, you are not sorting, but your algorithm traverses the array in almost the same way as bubble sort).

  • Then start removing duplicates (repeated values will be next to each other). In a for loop you could have two indices: source and destination. On each iteration you copy source to destination and increment both by 1, unless the source value is the same as the last value kept. In that case you have found a duplicate, so you increment only source (and don't perform the copy). @morgano
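
The two steps above could be sketched roughly as follows. This is only a minimal sketch, not the answerer's own code: the class name Dedup and the helper quickSort are hypothetical names introduced for illustration, and the quicksort is hand-written on the assumption that library sorting is off limits, just like Set. Note that, unlike the original method, the result comes back in sorted order rather than in the order of first appearance, and the input array is copied instead of being modified.

```java
class Dedup {
    // Hypothetical helper: in-place quicksort using a middle pivot.
    // It recurses into the smaller part and loops on the larger one,
    // which keeps the recursion depth logarithmic in the array length.
    static void quickSort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int pivot = a[lo + (hi - lo) / 2];
            int i = lo, j = hi;
            while (i <= j) {
                while (a[i] < pivot) i++;
                while (a[j] > pivot) j--;
                if (i <= j) {
                    int tmp = a[i];
                    a[i] = a[j];
                    a[j] = tmp;
                    i++;
                    j--;
                }
            }
            if (j - lo < hi - i) {
                quickSort(a, lo, j);   // smaller left part
                lo = i;                // continue with right part
            } else {
                quickSort(a, i, hi);   // smaller right part
                hi = j;                // continue with left part
            }
        }
    }

    static int[] removeDuplicates(int[] arr) {
        if (arr.length == 0) {
            return new int[0];
        }
        // Work on a copy so the caller's array is left untouched.
        int[] sorted = new int[arr.length];
        for (int i = 0; i < arr.length; i++) {
            sorted[i] = arr[i];
        }
        quickSort(sorted, 0, sorted.length - 1);

        // Two indices: src scans every element, dest marks where the
        // next distinct value goes. sorted[dest - 1] is the last value kept.
        int dest = 1;
        for (int src = 1; src < sorted.length; src++) {
            if (sorted[src] != sorted[dest - 1]) {
                sorted[dest] = sorted[src];
                dest++;
            }
        }

        // Trim to the number of distinct values.
        int[] result = new int[dest];
        for (int i = 0; i < dest; i++) {
            result[i] = sorted[i];
        }
        return result;
    }
}
```

For example, Dedup.removeDuplicates(new int[]{3, 1, 3, 2, 1}) would return {1, 2, 3}. The sort dominates the cost at roughly n log n comparisons on typical input, and the duplicate-removal pass is a single linear scan, compared with the repeated left-shifting in the original nested loops.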
