How to remove duplicated data in multiple parallel arrays

Question

I'm trying to remove, in Swift, the elements at the same positions in several arrays when the values at those positions repeat an earlier combination, i.e.:

Let's say I have 3 arrays:

array1 = [a,b,c,d,d,c,d]
array2 = [1,2,3,4,4,3,4]
array3 = [aa,bb,cc,dd,dd,cc,dd]

The problem is: I need to remove from the arrays the elements for which all 3 values, taken together, duplicate an earlier position.

This means I need to get rid of the elements at indices [4], [5], and [6] in arrays 1, 2, and 3.

P.S. The 3 arrays have to stay separate arrays, and their order can't be rearranged, since their elements hold critical information related to each other.

Any suggestions would be appreciated.

Recommended answer

The == operator is defined for tuples (given that their elements are Equatable) up to arity 6, which we could make use of here: zip the three arrays into a sequence of 3-tuples, identify repeated 3-tuple elements, and remove the indices associated with these tuples from the original three arrays. Tuples are not Hashable, however, so they cannot be stored in a Set; instead of 3-tuples we can fall back on a small Hashable utility type that stores the three values (which the 3-tuple held anonymously).
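
For comparison, the tuple-only route mentioned above can be sketched as follows. This is a minimal illustration, not part of the original answer: the names columns and duplicateIndices are made up here, and because tuples cannot go into a Set, each column is compared against all earlier ones, which takes quadratic time.

```swift
let array1 = ["a", "b", "c", "d", "d", "c", "d"]
let array2 = [1, 2, 3, 4, 4, 3, 4]
let array3 = ["aa", "bb", "cc", "dd", "dd", "cc", "dd"]

// Zip the three arrays into one array of 3-tuples ("columns").
// `columns` is a hypothetical name used only in this sketch.
let columns = zip(zip(array1, array2), array3).map { ($0.0, $0.1, $1) }

// Tuples are not Hashable, so each column is compared against every
// earlier column with the tuple == operator.
var duplicateIndices: [Int] = []
for (i, column) in columns.enumerated() {
    if columns[..<i].contains(where: { $0 == column }) {
        duplicateIndices.append(i)
    }
}

print(duplicateIndices) // [4, 5, 6]
```

For short arrays this is perfectly adequate; the Hashable utility type below trades a little boilerplate for a linear-time lookup.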

The utility type:

struct ZippedElement: Hashable {
    let a: String
    let b: Int
    let c: String

    init(_ a: String, _ b: Int, _ c: String) {
        self.a = a
        self.b = b
        self.c = c
    }

    // hash(into:) replaces the hashValue property, which is deprecated
    // since Swift 4.2. Feed the same members that == compares.
    func hash(into hasher: inout Hasher) {
        hasher.combine(a)
        hasher.combine(b)
        hasher.combine(c)
    }

    static func ==(lhs: ZippedElement, rhs: ZippedElement) -> Bool {
        return lhs.a == rhs.a && lhs.b == rhs.b && lhs.c == rhs.c
    }
}

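This allows us to perform the filtering/mutating operations on array1 through array3 as follows. The duplicate indices are reversed before removal so that removing one element does not shift the indices still to be removed: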

var seen = Set<ZippedElement>()
zip(zip(array1, array2), array3)
    .map { ZippedElement($0.0, $0.1, $1) }
    .enumerated().filter { !seen.insert($1).inserted }
    .map { $0.offset }.reversed()
    .forEach {
        array1.remove(at: $0)
        array2.remove(at: $0)
        array3.remove(at: $0)
    }

As a result, the last three elements are removed from each array:

print(array1) // ["a", "b", "c", "d"]
print(array2) // [1, 2, 3, 4]
print(array3) // ["aa", "bb", "cc", "dd"]
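
If mutating the arrays in place is not a requirement, the same idea can be sketched as a function that returns filtered copies instead. This is an illustrative variant, not the original answer's code: Column and removingDuplicateColumns are hypothetical names, the Hashable conformance is left to the compiler to synthesize (Swift 4.1+), and the three arrays are assumed to have equal length.

```swift
// Hypothetical helper type: one "column" across the three arrays.
// Equatable and Hashable are synthesized by the compiler.
struct Column: Hashable {
    let a: String
    let b: Int
    let c: String
}

// Hypothetical helper: returns copies of the three arrays that keep only
// the first occurrence of each column, preserving the original order.
// Assumes equal lengths; anything past the shortest array is dropped.
func removingDuplicateColumns(
    _ first: [String], _ second: [Int], _ third: [String]
) -> ([String], [Int], [String]) {
    var seen = Set<Column>()
    var keptIndices: [Int] = []
    let count = min(first.count, second.count, third.count)
    for i in 0..<count {
        // insert(_:) reports whether the column was new to the set.
        if seen.insert(Column(a: first[i], b: second[i], c: third[i])).inserted {
            keptIndices.append(i)
        }
    }
    return (keptIndices.map { first[$0] },
            keptIndices.map { second[$0] },
            keptIndices.map { third[$0] })
}

let (r1, r2, r3) = removingDuplicateColumns(
    ["a", "b", "c", "d", "d", "c", "d"],
    [1, 2, 3, 4, 4, 3, 4],
    ["aa", "bb", "cc", "dd", "dd", "cc", "dd"]
)
print(r1) // ["a", "b", "c", "d"]
print(r2) // [1, 2, 3, 4]
print(r3) // ["aa", "bb", "cc", "dd"]
```

Collecting the indices to keep avoids repeated remove(at:) calls, each of which shifts the remaining elements.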

Your example data doesn't pose much of a challenge for the different solutions here, however, so @dasblinkenlight asks a good question:


What if I replace the last dd of array3 with dx?

In this case, I believe most of us would assume that the 7th element of all the original arrays should be kept, since the "vertical" zip combination over all three arrays is unique for the 7th element (column).

Applying the same approach as above for such a modified example:

var array1 = ["a",  "b",  "c",  "d",  "d",  "c",  "d"]
var array2 = [ 1,    2,    3,    4,    4,    3,    4]
var array3 = ["aa", "bb", "cc", "dd", "dd", "cc", "dx"]
                                               /*  ^^ obs */

var seen = Set<ZippedElement>()
zip(zip(array1, array2), array3)
    .map { ZippedElement($0.0, $0.1, $1) }
    .enumerated().filter { !seen.insert($1).inserted }
    .map { $0.offset }.reversed()
    .forEach {
        print($0)
        array1.remove(at: $0)
        array2.remove(at: $0)
        array3.remove(at: $0)
    }

print(array1) // ["a", "b", "c", "d", "d"]
print(array2) // [1, 2, 3, 4, 4]
print(array3) // ["aa", "bb", "cc", "dd", "dx"]
                                        /* ^^ ok */

Another comment on your question, by @SteveKuo, states what is probably on most of our minds (beyond a somewhat fun algorithmic exercise) for all questions such as this one (index-tracking separate arrays ...):


Seems like a more appropriate data structure is to create struct/class/tuple of the array1/2/3 attributes.

And I believe this is the core point you should take away from this, so even if you explicitly state


... ps. 3 arrays have to be in separated arrays

You probably want a single array of a custom type instead.
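
A minimal sketch of what that could look like, assuming a hypothetical Record type whose field names (letter, number, code) are placeholders for whatever the three arrays actually represent. With all three values in one Hashable value, deduplication becomes a single order-preserving filter:

```swift
// Hypothetical record type; the field names are placeholders.
// Equatable and Hashable are synthesized by the compiler (Swift 4.1+).
struct Record: Hashable {
    let letter: String
    let number: Int
    let code: String
}

let records = [
    Record(letter: "a", number: 1, code: "aa"),
    Record(letter: "b", number: 2, code: "bb"),
    Record(letter: "c", number: 3, code: "cc"),
    Record(letter: "d", number: 4, code: "dd"),
    Record(letter: "d", number: 4, code: "dd"),
    Record(letter: "c", number: 3, code: "cc"),
    Record(letter: "d", number: 4, code: "dd"),
]

// Keep only the first occurrence of each record, preserving order.
var seenRecords = Set<Record>()
let uniqueRecords = records.filter { seenRecords.insert($0).inserted }

print(uniqueRecords.map { $0.code }) // ["aa", "bb", "cc", "dd"]
```

Because related values travel together, there are no indices to keep in sync across arrays.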
