How to efficiently remove duplicates from an array without using Set
Problem description
I was asked to write my own implementation to remove duplicate values from an array. Here is what I have created. However, when tested with 1,000,000 elements, it took a very long time to finish. Is there anything I can do to improve my algorithm, or are there any bugs to fix?

I need to write my own implementation, without using Set, HashSet, etc., or any other tools such as iterators. Just a plain array to remove duplicates.

    public static int[] removeDuplicates(int[] arr) {
        int end = arr.length;

        for (int i = 0; i < end; i++) {
            for (int j = i + 1; j < end; j++) {
                if (arr[i] == arr[j]) {
                    // Duplicate found: shift everything after j one slot to the left.
                    int shiftLeft = j;
                    for (int k = j + 1; k < end; k++, shiftLeft++) {
                        arr[shiftLeft] = arr[k];
                    }
                    end--;
                    j--; // re-check the value that moved into position j
                }
            }
        }

        // Copy the surviving prefix into a correctly sized result array.
        int[] whitelist = new int[end];
        for (int i = 0; i < end; i++) {
            whitelist[i] = arr[i];
        }
        return whitelist;
    }
Solution

Since this question is still getting a lot of attention, I decided to answer it by copying the answer from Code Review.
You're following the same philosophy as bubble sort, which is very, very, very slow. Have you tried this?

1. Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know you are not sorting, but the way your algorithm traverses the array is almost the same as bubble sort).
2. Then start removing duplicates (repeated values will be next to each other). In a for loop you can keep two indices: source and destination. On each iteration, if the value at source differs from the last value kept, copy it to destination and increment both indices by 1. Every time you find a duplicate, increment only source (and don't perform the copy).

@morgano
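The two steps above could be sketched roughly as follows. This is a minimal illustration, not the original answerer's code: the class name SortedDedup and the helper names are hypothetical, and the three-way partitioning variant of quicksort is my own choice (it tends to behave well when the input contains many equal values). Note that, unlike the code in the question, this approach returns the values in sorted order rather than in order of first appearance, and it works on a copy instead of modifying the input.

```java
// Hypothetical class and method names; a sketch under the assumptions stated above.
class SortedDedup {

    // Returns a new sorted array that holds each distinct value of arr once.
    static int[] removeDuplicates(int[] arr) {
        if (arr.length == 0) {
            return new int[0];
        }
        // Work on a copy so the caller's array is left untouched.
        int[] sorted = new int[arr.length];
        for (int i = 0; i < arr.length; i++) {
            sorted[i] = arr[i];
        }
        quicksort(sorted, 0, sorted.length - 1);

        // dest is the index of the last distinct value kept so far;
        // src scans ahead and copies only values that differ from it.
        int dest = 0;
        for (int src = 1; src < sorted.length; src++) {
            if (sorted[src] != sorted[dest]) {
                dest++;
                sorted[dest] = sorted[src];
            }
        }

        int[] result = new int[dest + 1];
        for (int i = 0; i <= dest; i++) {
            result[i] = sorted[i];
        }
        return result;
    }

    // Three-way quicksort: values equal to the pivot are grouped in the
    // middle and excluded from further partitioning.
    static void quicksort(int[] a, int lo, int hi) {
        while (lo < hi) {
            int pivot = a[lo + (hi - lo) / 2];
            int lt = lo, i = lo, gt = hi;
            while (i <= gt) {
                if (a[i] < pivot) {
                    swap(a, lt++, i++);
                } else if (a[i] > pivot) {
                    swap(a, i, gt--);
                } else {
                    i++;
                }
            }
            // Now a[lo..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..hi] > pivot.
            // Recurse into the smaller side and loop on the larger one
            // to keep the recursion depth small.
            if (lt - lo < hi - gt) {
                quicksort(a, lo, lt - 1);
                lo = gt + 1;
            } else {
                quicksort(a, gt + 1, hi);
                hi = lt - 1;
            }
        }
    }

    static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] unique = removeDuplicates(new int[] {3, 1, 3, 2, 1});
        // java.util.Arrays is used only to print the result.
        System.out.println(java.util.Arrays.toString(unique)); // prints [1, 2, 3]
    }
}
```

With sorting taking O(n log n) time on average and the compaction pass taking O(n), this should scale far better to 1,000,000 elements than the quadratic shifting approach. If the original order of first occurrences must be preserved, this sketch would need to be adapted.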