Array remove duplicate elements

Problem description

I have an unsorted array. What is the best way to remove all duplicates of an element, if any are present?

For example:

a[1,5,2,6,8,9,1,1,10,3,2,4,1,3,11,3]

After the operation, the array should look like this:

a[1,5,2,6,8,9,10,3,4,11]

Solution

Check every element against every other element

The naive solution is to check every element against every other element. This is wasteful and yields an O(n^2) solution, even if you only go "forward" and compare each element with the ones that come after it.
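As a rough illustration of the quadratic approach, here is a minimal Python sketch. The function name `dedupe_naive` is made up for this example, and the sketch compares each element against the elements already kept rather than scanning forward; the cost is O(n^2) in the worst case either way.

```python
def dedupe_naive(items):
    """Keep the first occurrence of each value using only pairwise comparisons."""
    result = []
    for x in items:
        # Linear scan over everything kept so far: this inner loop is
        # what makes the whole approach O(n^2) in the worst case.
        already_kept = False
        for y in result:
            if x == y:
                already_kept = True
                break
        if not already_kept:
            result.append(x)
    return result


a = [1, 5, 2, 6, 8, 9, 1, 1, 10, 3, 2, 4, 1, 3, 11, 3]
print(dedupe_naive(a))  # [1, 5, 2, 6, 8, 9, 10, 3, 4, 11]
```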

Sort then remove duplicates

A better solution is to sort the array and then compare each element with the one next to it to find duplicates. With an efficient sort, this is O(n log n).
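A minimal Python sketch of the sort-based idea, assuming the elements can be ordered. The name `dedupe_sorted` is hypothetical, and the built-in `sorted` stands in for whichever efficient sort you choose.

```python
def dedupe_sorted(items):
    """Sort a copy of the input, then drop elements equal to their predecessor."""
    ordered = sorted(items)  # O(n log n)
    result = []
    for x in ordered:
        # After sorting, duplicates are adjacent, so comparing with the
        # last kept element is enough.
        if not result or result[-1] != x:
            result.append(x)
    return result


a = [1, 5, 2, 6, 8, 9, 1, 1, 10, 3, 2, 4, 1, 3, 11, 3]
print(dedupe_sorted(a))  # [1, 2, 3, 4, 5, 6, 8, 9, 10, 11]
```

Note that the result comes back in sorted order, not in the order of the original array.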

The disadvantage of the sort-based solution is that the original order is not maintained. An extra step can take care of this, however. Put all entries of the sorted, de-duplicated array into a hash table, which has O(1) access. Then iterate over the original array. For each element, check whether it is in the hash table. If it is, add it to the result and delete it from the hash table. You end up with a result array that keeps the original order, with each element appearing where it first occurred.
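The order-restoring step could look roughly like this in Python. This is only a sketch: `dedupe_sort_then_restore` is an invented name, and a Python `set` is used as the hash table, which assumes the elements are hashable as well as sortable.

```python
def dedupe_sort_then_restore(items):
    """Sort-based de-duplication followed by a pass that restores input order."""
    # Step 1: sort and drop adjacent duplicates.
    unique_sorted = []
    for x in sorted(items):
        if not unique_sorted or unique_sorted[-1] != x:
            unique_sorted.append(x)

    # Step 2: load the unique values into a hash table (a set here).
    pending = set(unique_sorted)

    # Step 3: walk the original array; the first time a value is met it is
    # still in the table, so emit it and delete it from the table.
    result = []
    for x in items:
        if x in pending:
            result.append(x)
            pending.remove(x)
    return result


a = [1, 5, 2, 6, 8, 9, 1, 1, 10, 3, 2, 4, 1, 3, 11, 3]
print(dedupe_sort_then_restore(a))  # [1, 5, 2, 6, 8, 9, 10, 3, 4, 11]
```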

Linear sorts of integers

If you're dealing with integers in some fixed range, you can do even better by borrowing the idea behind linear-time integer sorts such as radix sort. If you assume the numbers are all in the range 0 to 1,000,000, for example, you can allocate a bit vector of 1,000,001 bits. For each element in the original array, set the bit corresponding to its value (e.g. a value of 13 sets the 14th bit). Then traverse the original array again and check whether each element's bit is still set. If it is, add the element to the result array and clear that bit. This is O(n), plus the cost of initializing the bit vector, and trades space for time.
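A possible Python sketch of the bit-vector variant follows. The function name `dedupe_bitvector` and the `max_value` parameter are assumptions made for this example, and a `bytearray` with manual bit arithmetic stands in for a real bit-vector type.

```python
def dedupe_bitvector(items, max_value=1_000_000):
    """De-duplicate integers in [0, max_value] using one bit per possible value."""
    # One bit per value, rounded up to whole bytes.
    bits = bytearray((max_value + 8) // 8)

    # Pass 1: set the bit for every value that occurs.
    for x in items:
        if not 0 <= x <= max_value:
            raise ValueError(f"value {x} is outside the range 0..{max_value}")
        bits[x >> 3] |= 1 << (x & 7)

    # Pass 2: emit each value the first time its bit is found set,
    # then clear the bit so later duplicates are skipped.
    result = []
    for x in items:
        mask = 1 << (x & 7)
        if bits[x >> 3] & mask:
            result.append(x)
            bits[x >> 3] &= ~mask & 0xFF
    return result


a = [1, 5, 2, 6, 8, 9, 1, 1, 10, 3, 2, 4, 1, 3, 11, 3]
print(dedupe_bitvector(a))  # [1, 5, 2, 6, 8, 9, 10, 3, 4, 11]
```

With the default range, the vector takes about 125 KB regardless of how short the input is, which is the space-for-time trade mentioned above.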

Hash table solution

Which leads us to the best solution of all: the sort is actually a distraction, though a useful one. Create a hash table with O(1) access. Traverse the original list. For each element, if it is not already in the hash table, add it to the result array and to the hash table. If it is already in the hash table, ignore it.
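In Python, the hash table approach might be sketched as below. The name `dedupe_hash` is invented for this example, and the built-in `set` plays the role of the hash table, so elements are assumed to be hashable.

```python
def dedupe_hash(items):
    """Single pass with a hash table; keeps the first occurrence of each value."""
    seen = set()   # hash table with expected O(1) lookup and insert
    result = []
    for x in items:
        if x not in seen:
            seen.add(x)
            result.append(x)
        # Otherwise x was already emitted, so it is ignored.
    return result


a = [1, 5, 2, 6, 8, 9, 1, 1, 10, 3, 2, 4, 1, 3, 11, 3]
print(dedupe_hash(a))  # [1, 5, 2, 6, 8, 9, 10, 3, 4, 11]
```

On Python 3.7 or later, where dictionaries preserve insertion order, `list(dict.fromkeys(a))` should give the same result in one expression.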

This is by far the best solution. So why bother with the rest? Because problems like this are about adapting knowledge you have (or should have) to the problem at hand, and refining a solution based on the assumptions you make. Evolving a solution and understanding the thinking behind it is far more useful than regurgitating one.

Also, hash tables are not always available. Take an embedded system, or anything else where space is VERY limited. You can implement a quicksort in a handful of opcodes, far fewer than any hash table would need.
