How do I remove duplicates from an array?
Question
Say I have an array of strings:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
How would I remove the duplicates?
Recommended answer
You can use the array function contains(_:) to check whether an element is already in the array, but that is fairly slow, and it won't perform well on large arrays. (1.) It is better to copy the entries into a Set and use Set operations to find and remove the duplicates. Sets are optimized for fast membership tests, so if aSet.contains(item) is a lot faster than if anArray.contains(item).
If you don't care about preserving the order of your items, you can simply copy your array into a set and then back into an array. However, a Set is unordered, so the items in the resulting array may come back in a different order.
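As a minimal sketch of that round trip (the names unordered and sorted are only for illustration), assuming any order is acceptable:

```swift
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]

// Copying into a Set drops the duplicates; copying back gives an array again.
// Set is unordered, so the element order of `unordered` is not guaranteed.
let unordered = Array(Set(arrayOfStrings))

// Sorting is one way to get a predictable order when the original order
// does not matter.
let sorted = unordered.sorted()
print(sorted)  // ["a", "b", "c", "d"]
```

Note that sorting gives a stable, predictable order, but not necessarily the order the items first appeared in.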
A function that removes duplicates from an array of strings while preserving the order might look like this:
func uniqueElementsFrom(array: [String]) -> [String] {
    // Create an empty Set to track unique items
    var set = Set<String>()
    let result = array.filter {
        guard !set.contains($0) else {
            // If the set already contains this object, return false
            // so we skip it
            return false
        }
        // Add this item to the set since it will now be in the array
        set.insert($0)
        // Return true so that the filtered array will contain this item.
        return true
    }
    return result
}
If you call it with code like this:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
let uniqueStrings = uniqueElementsFrom(array:arrayOfStrings)
print("Unique elements from \(arrayOfStrings) = \n" +
"\(uniqueStrings)")
The output is:
Unique elements from ["a", "b", "a", "c", "a", "d"] =
["a", "b", "c", "d"]
However, that function only works with arrays of strings. It would be good if we could write a function that removes duplicates from any kind of array.
This is a job for generics. There is a catch, however: a Set can only contain values whose type conforms to the Hashable protocol, since Sets use hashes to make testing for set membership faster.
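To illustrate the Hashable requirement, here is a small sketch. The Point type is hypothetical and not part of the original answer; Swift can synthesize the Hashable conformance for a struct whose stored properties are all Hashable.

```swift
// Hypothetical type, for illustration only. Declaring Hashable is enough
// here because both stored properties (Int) are already Hashable.
struct Point: Hashable {
    let x: Int
    let y: Int
}

var seen = Set<Point>()
seen.insert(Point(x: 1, y: 2))
seen.insert(Point(x: 1, y: 2))  // equal to the first value, so the set is unchanged
seen.insert(Point(x: 3, y: 4))
print(seen.count)  // 2
```

A type without a Hashable conformance could not be used as the Set's element type at all; the compiler would reject Set<ThatType>.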
Using generics, we can rewrite the uniqueElementsFrom(array:) function to take any array whose elements conform to the Hashable protocol. That code looks like this:
func uniqueElementsFrom<T: Hashable>(array: [T]) -> [T] {
    var set = Set<T>()
    let result = array.filter {
        guard !set.contains($0) else {
            return false
        }
        set.insert($0)
        return true
    }
    return result
}
The <T: Hashable> bit after the function name says "The rest of this function will refer to a type T which is unspecified. The only thing you can be sure of is that the type T will conform to the Hashable protocol."
This form of the uniqueElementsFrom(array:) function will work on any array whose elements are Hashable.
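As a quick usage sketch, the same generic function can be called with arrays of different element types. The function is repeated here (in a slightly condensed form) only so the snippet stands alone; the sample arrays are made up for illustration.

```swift
// The generic function from the answer, condensed so this snippet is self-contained.
func uniqueElementsFrom<T: Hashable>(array: [T]) -> [T] {
    var set = Set<T>()
    return array.filter {
        guard !set.contains($0) else { return false }
        set.insert($0)
        return true
    }
}

// T is inferred as Int for the first call and String for the second.
let uniqueInts = uniqueElementsFrom(array: [3, 1, 3, 2, 1])
let uniqueWords = uniqueElementsFrom(array: ["b", "a", "b"])
print(uniqueInts)   // [3, 1, 2]
print(uniqueWords)  // ["b", "a"]
```

In both cases the first occurrence of each element is kept and the original order is preserved.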
(1.) For arrays, contains(_:) has O(n) performance, so looping through an array and using contains(_:) to test whether each new element is already present is close to O(n^2) overall, which is really, really bad for anything but a small array. Set's contains(_:) function runs in constant time on average, so the whole process would have O(n) performance.
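To make the comparison in this footnote concrete, here is a hedged sketch of both approaches side by side. The names uniqueElementsSlow and uniqueElementsFast are hypothetical and not from the original answer. The fast version relies on the fact that Set's insert(_:) returns a tuple whose inserted field is true only when the element was not already present.

```swift
// Quadratic approach: Array.contains(_:) scans `result` linearly for every
// element, so the loop as a whole is roughly O(n^2) in the worst case.
// It only needs Equatable, so it also works for non-Hashable element types.
func uniqueElementsSlow<T: Equatable>(array: [T]) -> [T] {
    var result: [T] = []
    for item in array where !result.contains(item) {
        result.append(item)
    }
    return result
}

// Linear approach: each Set lookup/insert is constant time on average,
// so the whole pass is roughly O(n).
func uniqueElementsFast<T: Hashable>(array: [T]) -> [T] {
    var seen = Set<T>()
    // Keep an element only the first time it is inserted into `seen`.
    return array.filter { seen.insert($0).inserted }
}

let input = ["a", "b", "a", "c", "a", "d"]
print(uniqueElementsSlow(array: input))  // ["a", "b", "c", "d"]
print(uniqueElementsFast(array: input))  // ["a", "b", "c", "d"]
```

Both functions return the same result; the difference only shows up in running time as the input grows.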