How to efficiently remove duplicates from an array without using Set
Question
I was asked to write my own implementation to remove duplicated values from an array. Here is what I have created. But in tests with 1,000,000 elements it took a very long time to finish. Is there something I can do to improve my algorithm, or are there any bugs to remove?
I need to write my own implementation, without using Set, HashSet, etc., or any other tools such as iterators. Simply an array from which to remove duplicates.
public static int[] removeDuplicates(int[] arr) {
    int end = arr.length;
    for (int i = 0; i < end; i++) {
        for (int j = i + 1; j < end; j++) {
            if (arr[i] == arr[j]) {
                int shiftLeft = j;
                for (int k = j + 1; k < end; k++, shiftLeft++) {
                    arr[shiftLeft] = arr[k];
                }
                end--;
                j--;
            }
        }
    }
    int[] whitelist = new int[end];
    for (int i = 0; i < end; i++) {
        whitelist[i] = arr[i];
    }
    return whitelist;
}
Recommended answer
Since this question is still getting a lot of attention, I decided to answer it by copying this answer from Code Review.SE:
You're following the same philosophy as bubble sort, which is very, very, very slow. Have you tried this?
Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know, you are not sorting, but the way your algorithm traverses the array is almost the same as bubble sort).
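For readers who must also avoid library sorting routines, a hand-written quicksort could look roughly like the sketch below. This is only one possible variant (Hoare partitioning with a middle pivot); the class name QuickSortSketch and its method names are hypothetical and not part of the original answer. java.util.Arrays is used only to print the demo result.

```java
import java.util.Arrays;

/** Hypothetical helper class; names are illustrative, not from the original answer. */
final class QuickSortSketch {

    /** Sorts arr in place in ascending order. */
    static void quickSort(int[] arr) {
        quickSort(arr, 0, arr.length - 1);
    }

    private static void quickSort(int[] arr, int lo, int hi) {
        while (lo < hi) {
            int p = partition(arr, lo, hi);
            // Recurse into the smaller part and loop on the larger one,
            // which keeps the recursion depth logarithmic.
            if (p - lo < hi - p) {
                quickSort(arr, lo, p);
                lo = p + 1;
            } else {
                quickSort(arr, p + 1, hi);
                hi = p;
            }
        }
    }

    /** Hoare partition: afterwards every value in [lo..p] is <= every value in [p+1..hi]. */
    private static int partition(int[] arr, int lo, int hi) {
        int pivot = arr[lo + (hi - lo) / 2];
        int i = lo - 1;
        int j = hi + 1;
        while (true) {
            do { i++; } while (arr[i] < pivot);
            do { j--; } while (arr[j] > pivot);
            if (i >= j) {
                return j;
            }
            int tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 5, 1, 3};
        quickSort(data);
        System.out.println(Arrays.toString(data)); // [1, 3, 3, 5, 5]
    }
}
```

Hoare partitioning is chosen here because it tends to stay balanced when the input contains many equal values, which is the expected case when the goal is removing duplicates.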
Then start removing duplicates (repeated values will be next to each other). In a for loop you could have two indices: source and destination. (On each loop you copy source to destination unless they are the same, and increment both by 1.) Every time you find a duplicate you increment only source (and don't perform the copy).
@morgano
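The sort-then-compact idea described above could be sketched in Java as follows. This is a minimal illustration, not the answerer's own code: the class name SortedDedupSketch is hypothetical, and java.util.Arrays.sort is used only as a stand-in for the sorting step (an assumption; if library sorting is also off limits, a hand-written quicksort can be substituted). The source and destination indices follow the description in the answer.

```java
import java.util.Arrays;

/** Hypothetical sketch of "sort, then compact with two indices". */
final class SortedDedupSketch {

    /** Returns a new array holding the distinct values of arr in ascending order. */
    static int[] removeDuplicates(int[] arr) {
        if (arr.length == 0) {
            return new int[0];
        }
        int[] sorted = arr.clone();   // leave the caller's array untouched
        Arrays.sort(sorted);          // stand-in for the quicksort step (assumption)

        int destination = 1;          // next free slot; sorted[0] is always kept
        for (int source = 1; source < sorted.length; source++) {
            // After sorting, duplicates are adjacent, so comparing with the
            // last kept value is enough to detect them.
            if (sorted[source] != sorted[destination - 1]) {
                sorted[destination] = sorted[source];
                destination++;
            }
            // On a duplicate only source advances and nothing is copied.
        }

        int[] result = new int[destination];
        for (int i = 0; i < destination; i++) {
            result[i] = sorted[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] input = {4, 1, 4, 2, 1, 3};
        System.out.println(Arrays.toString(removeDuplicates(input))); // [1, 2, 3, 4]
    }
}
```

Sorting dominates the cost at O(n log n), and the compaction pass is a single O(n) scan, compared with the quadratic behavior of the nested loops in the question. One trade-off to be aware of: the result comes back in sorted order, not in the order the values first appeared in the input.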