How to efficiently remove duplicates from an array without using Set


Question

I was asked to write my own implementation to remove duplicate values from an array. Here is what I have created. But when I tested it with 1,000,000 elements, it took a very long time to finish. Is there anything I can do to improve my algorithm, or are there any bugs I should fix?

I need to write my own implementation - not use Set, HashSet etc., or any other tools such as iterators. Simply an array from which to remove duplicates.

public static int[] removeDuplicates(int[] arr) {

    int end = arr.length;

    for (int i = 0; i < end; i++) {
        for (int j = i + 1; j < end; j++) {
            if (arr[i] == arr[j]) {                  
                int shiftLeft = j;
                for (int k = j+1; k < end; k++, shiftLeft++) {
                    arr[shiftLeft] = arr[k];
                }
                end--;
                j--;
            }
        }
    }

    int[] whitelist = new int[end];
    for(int i = 0; i < end; i++){
        whitelist[i] = arr[i];
    }
    return whitelist;
}

Recommended answer

Since this question is still getting a lot of attention, I decided to answer it by copying this answer from Code Review.SE:

You're following the same philosophy as bubble sort, which is very, very slow: the nested loops compare every pair of elements, so the work grows at least quadratically with the array length. Have you tried this?

  • Sort your unordered array with quicksort. Quicksort is much faster than bubble sort (I know, you are not sorting, but the way your algorithm traverses the array is almost the same as bubble sort).

  • Then start removing duplicates (repeated values will be next to each other). In a for loop you could have two indices: source and destination. On each iteration you copy the source element to the destination unless it equals the previous element, and increment both indices by 1. Every time you find a duplicate you increment only source (and don't perform the copy). @morgano
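
The two steps above can be sketched roughly as follows. This is only an illustration under some assumptions, not the answerer's own code: the class name `SortedDedup` and the method name `removeDuplicatesSorted` are made up for this example, and `Arrays.sort` is used for the sorting step (if library sorting is not allowed, a hand-written quicksort could be substituted there).

```java
import java.util.Arrays;

class SortedDedup {

    // Hypothetical helper: returns a new sorted array with duplicates removed.
    static int[] removeDuplicatesSorted(int[] arr) {
        if (arr.length == 0) {
            return new int[0];
        }

        // Work on a copy so the caller's array is left untouched.
        int[] sorted = Arrays.copyOf(arr, arr.length);

        // Sorting step (assumption: a library sort is acceptable here).
        // After sorting, equal values sit next to each other.
        Arrays.sort(sorted);

        // sorted[0] is always kept, so destination starts at the next free slot.
        int destination = 1;
        for (int source = 1; source < sorted.length; source++) {
            // Copy only when the value differs from the last value kept.
            if (sorted[source] != sorted[destination - 1]) {
                sorted[destination] = sorted[source];
                destination++;
            }
            // On a duplicate, only source advances (via the loop increment).
        }

        // Trim the array to the number of unique values.
        return Arrays.copyOf(sorted, destination);
    }

    public static void main(String[] args) {
        int[] input = {5, 3, 5, 1, 3, 3, 9};
        System.out.println(Arrays.toString(removeDuplicatesSorted(input)));
        // prints [1, 3, 5, 9]
    }
}
```

Note that this approach returns the unique values in sorted order, so the original order of the elements is not preserved. After the sort, the duplicate removal itself is a single pass over the array.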
