How to remove duplicated data in multiple parallel arrays
Problem description
I'm trying to remove, in Swift, the elements at the same positions across several arrays when the combination of values at those positions has already occurred earlier.
Suppose I have three arrays:
array1 = [a,b,c,d,d,c,d]
array2 = [1,2,3,4,4,3,4]
array3 = [aa,bb,cc,dd,dd,cc,dd]
The problem is: I need to remove from the arrays the elements for which all three values, taken together, are duplicated.
This means I need to get rid of the elements at indices [4], [5] and [6] in arrays 1, 2 and 3.
P.S. The three arrays have to remain separate arrays, and their order can't be rearranged, since they carry critical information related to each other.
Any suggestions would be appreciated.
Recommended answer
Tuples can be compared with == (given that their elements are Equatable) up to arity 6. We could make use of that here by zipping the three arrays into a sequence of 3-tuples, identifying the repeated 3-tuples, and removing the indices associated with them from the original three arrays. Tuples are not, however, Hashable, so they cannot be stored in a Set. Instead of using 3-tuples, we can fall back on a small utility Hashable type that stores the three values (the same values the 3-tuple would hold anonymously).
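To illustrate the tuple-equality point, here is a minimal sketch that finds the duplicated columns using only == on 3-tuples. It is not the approach taken in this answer; the names triples and duplicateIndices are illustrative, and because tuples cannot go into a Set, every column is compared against all earlier ones, which is quadratic in the array length.

```swift
let array1 = ["a", "b", "c", "d", "d", "c", "d"]
let array2 = [1, 2, 3, 4, 4, 3, 4]
let array3 = ["aa", "bb", "cc", "dd", "dd", "cc", "dd"]

// Zip the three arrays into an array of flat 3-tuples.
let triples = zip(zip(array1, array2), array3).map { ($0.0, $0.1, $1) }

// Tuples support == but are not Hashable, so each triple is
// compared against every triple that precedes it.
let duplicateIndices = triples.indices.filter { i in
    triples[..<i].contains(where: { $0 == triples[i] })
}

print(duplicateIndices) // [4, 5, 6]
```

For short arrays this is perfectly adequate; the Hashable utility type becomes worthwhile once the arrays grow, since a Set lookup avoids the repeated scans.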
The utility type:
struct ZippedElement: Hashable {
    let a: String
    let b: Int
    let c: String

    init(_ a: String, _ b: Int, _ c: String) {
        self.a = a
        self.b = b
        self.c = c
    }

    // Feed all three members into the hasher so that hashing stays
    // consistent with ==. Requires Swift 4.2 or later; implementing
    // hashValue directly is deprecated as a protocol requirement.
    func hash(into hasher: inout Hasher) {
        hasher.combine(a)
        hasher.combine(b)
        hasher.combine(c)
    }

    static func ==(lhs: ZippedElement, rhs: ZippedElement) -> Bool {
        return lhs.a == rhs.a && lhs.b == rhs.b && lhs.c == rhs.c
    }
}
This type lets us perform the filtering/mutating operations on array1 through array3 as follows:
var seen = Set<ZippedElement>()
zip(zip(array1, array2), array3)
    .map { ZippedElement($0.0, $0.1, $1) }
    .enumerated().filter { !seen.insert($1).inserted }
    .map { $0.offset }.reversed()
    .forEach {
        array1.remove(at: $0)
        array2.remove(at: $0)
        array3.remove(at: $0)
    }
As a result, the last three elements are removed from each array:
print(array1) // ["a", "b", "c", "d"]
print(array2) // [1, 2, 3, 4]
print(array3) // ["aa", "bb", "cc", "dd"]
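Calling remove(at:) repeatedly shifts the remaining elements each time. As an alternative sketch (not part of the original answer), the indices to keep can be collected once and each array rebuilt from them. The Column type and the firstOccurrenceIndices helper below are hypothetical names introduced for illustration, and Column relies on the compiler-synthesized Hashable conformance available since Swift 4.1.

```swift
// Hypothetical utility type; Hashable conformance is synthesized.
struct Column: Hashable {
    let a: String
    let b: Int
    let c: String
}

// Hypothetical helper: returns, in order, the indices at which each
// distinct (a, b, c) column occurs for the first time.
func firstOccurrenceIndices(_ a: [String], _ b: [Int], _ c: [String]) -> [Int] {
    precondition(a.count == b.count && b.count == c.count,
                 "the three arrays must have the same length")
    var seen = Set<Column>()
    return a.indices.filter { i in
        seen.insert(Column(a: a[i], b: b[i], c: c[i])).inserted
    }
}

var array1 = ["a", "b", "c", "d", "d", "c", "d"]
var array2 = [1, 2, 3, 4, 4, 3, 4]
var array3 = ["aa", "bb", "cc", "dd", "dd", "cc", "dd"]

// Rebuild each array from the kept indices; relative order is preserved.
let keep = firstOccurrenceIndices(array1, array2, array3)
array1 = keep.map { array1[$0] }
array2 = keep.map { array2[$0] }
array3 = keep.map { array3[$0] }

print(array1) // ["a", "b", "c", "d"]
print(array2) // [1, 2, 3, 4]
print(array3) // ["aa", "bb", "cc", "dd"]
```

The three arrays stay separate and keep their original order, which matches the constraints stated in the question.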
Your example data doesn't pose much of a challenge for the different solutions here, however, so @dasblinkenlight asks a good question:
"Would it change the expected result if I replaced the last 'dd' of array3 with 'dx'?"
In this case, I believe most of us assume that the 7th element of all the original arrays should be kept, since the "vertical" zip combination across all three arrays for the 7th element (column) is unique.
Applying the same approach as above to this modified example:
var array1 = ["a", "b", "c", "d", "d", "c", "d"]
var array2 = [1, 2, 3, 4, 4, 3, 4]
var array3 = ["aa", "bb", "cc", "dd", "dd", "cc", "dx"]
/*                                                 ^^ obs */

var seen = Set<ZippedElement>()
zip(zip(array1, array2), array3)
    .map { ZippedElement($0.0, $0.1, $1) }
    .enumerated().filter { !seen.insert($1).inserted }
    .map { $0.offset }.reversed()
    .forEach {
        print($0) // 5, then 4
        array1.remove(at: $0)
        array2.remove(at: $0)
        array3.remove(at: $0)
    }

print(array1) // ["a", "b", "c", "d", "d"]
print(array2) // [1, 2, 3, 4, 4]
print(array3) // ["aa", "bb", "cc", "dd", "dx"]
/*                                         ^^ ok */
Another comment on your question, by @SteveKuo, states what is probably on most of our minds (beyond this being a somewhat fun algorithmic exercise) for all questions such as this one (index-tracking separate arrays ...):
"Seems like a more appropriate data structure is to create struct/class/tuple of the array1/2/3 attributes."
I believe this is the core answer you should take away from this, so even though you explicitly state
"... ps. 3 arrays have to be in separated arrays"
you probably want a single array of a custom type instead.
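As a rough sketch of that suggestion, assuming the three values in each column really describe one record: the Record type and its field names (name, number, code) are placeholders invented here, since the question does not say what the values mean. With a single array, removing duplicates while preserving order becomes a one-line filter.

```swift
// Hypothetical record type; field names are placeholders.
// Hashable conformance is synthesized by the compiler (Swift 4.1+).
struct Record: Hashable {
    let name: String
    let number: Int
    let code: String
}

var records = [
    Record(name: "a", number: 1, code: "aa"),
    Record(name: "b", number: 2, code: "bb"),
    Record(name: "c", number: 3, code: "cc"),
    Record(name: "d", number: 4, code: "dd"),
    Record(name: "d", number: 4, code: "dd"),
    Record(name: "c", number: 3, code: "cc"),
    Record(name: "d", number: 4, code: "dx"),
]

// Keep only the first occurrence of each record, in original order.
var seen = Set<Record>()
records = records.filter { seen.insert($0).inserted }

print(records.map { $0.name }) // ["a", "b", "c", "d", "d"]
print(records.map { $0.code }) // ["aa", "bb", "cc", "dd", "dx"]
```

Because each record travels as one value, there is no index bookkeeping across arrays, and the related fields cannot drift out of sync.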