在 Swift 中,一个高效的函数,可以根据谓词将一个数组分成 2 个数组 [英] In Swift, an efficient function that separates an array into 2 arrays based on a predicate

查看:41
本文介绍了在 Swift 中,一个高效的函数,可以根据谓词将一个数组分成 2 个数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望创建一个与 filter 非常接近的函数,除了它也保留不匹配的结果并保持排序顺序.例如,假设您想过滤掉数组中可被 3 整除的数字,但仍保留不能被 3 整除的数字列表.

I'm looking to create a function that operates very closely to filter, except that it keeps the non-matching results as well, and maintains sort order. For instance, say you wanted to filter out numbers divisible by 3 in an array and still keep the list of numbers that aren't divisible by 3.

使用filter,你只能得到能被3整除的数字列表,原始列表保持不变.然后您可以使用相反的谓词再次过滤原始列表,但这是不必要的第二遍.代码如下所示:

With filter, you only get the list of numbers divisible by 3 and the original list stays unchanged. You can then filter the original list again with the opposite predicate, but this is an unnecessary second pass. The code looks like this:

let numbers = [1,2,3,4,5,6,7,8,9,10]
let divisibleBy3 = numbers.filter { $0 % 3 == 0 } // [3,6,9]
let theRest = numbers.filter { $0 % 3 != 0 }      // [1,2,4,5,7,8,10]

确实,这很可读,但它执行 2 遍的事实对我来说似乎效率低下,尤其是在谓词更复杂的情况下.这是实际需要的检查次数的两倍.

It's true that this is pretty readable, but the fact that it does 2 passes seems inefficient to me, especially if the predicate were more complicated. That's twice as many checks as are actually needed.

我的下一个尝试是扩展 Collection 并创建一个我称之为 separate 的函数.此函数将获取集合并一次一个地遍历元素,并将它们添加到匹配列表或非匹配列表中.代码如下所示:

My next attempt was extending Collection and making a function I called separate. This function would take the collection and go through the elements one at a time and add them to either the matching list or the non-matching list. The code looks like this:

extension Collection {
    func separate(predicate: (Generator.Element) -> Bool) -> (matching: [Generator.Element], notMatching: [Generator.Element]) {
        var groups: ([Generator.Element],[Generator.Element]) = ([],[])
        for element in self {
            if predicate(element) {
                groups.0.append(element)
            } else {
                groups.1.append(element)
            }
        }
        return groups
    }
}

然后我可以像这样使用这个函数:

I can then use the function like this:

let numbers = [1,2,3,4,5,6,7,8,9,10]
let groups = numbers.separate { $0 % 3 == 0 }
let matching = groups.matching       // [3,6,9]
let notMatching = groups.notMatching // [1,2,4,5,7,8,10]

这也很干净,但我唯一不喜欢的是我使用元组作为返回类型.也许其他人会不同意,但我更喜欢返回与 self 相同的类型以进行链接.但从技术上讲,您可以只获取 .matching.notMatching,这与 self 的类型相同,并且您可以链接其中任何一个其中.

This is also pretty clean, but the only thing I don't like is that I'm using a tuple as a return type. Maybe others will disagree, but I'd prefer returning the same type as self for chaining. But technically, you can just grab either .matching or .notMatching, which is the same type as self, and you can chain off of either of them.

我关于 separate 返回元组的问题导致我尝试创建一个函数来修改 self列表,并在最后返回匹配列表.返回的列表是您的匹配项,并且从这些值中删除了数组.两个数组中都保留了顺序.代码如下所示:

My issue with separate returning a tuple led me to try to make a function that would modify self by removing the matches as it found them and adding them to a new list, and returning the matches list at the end. The returned list is your matches, and the array is trimmed of those values. Order is preserved in both arrays. The code looks like this:

extension Array {
    mutating func removeIf(predicate: (Element) -> Bool) -> [Element] {
        var removedCount: Int = 0
        var removed: [Element] = []

        for (index,element) in self.enumerated() {
            if predicate(element) {
                removed.append(self.remove(at: index-removedCount))
                removedCount += 1
            }
        }

        return removed
    }
}

它是这样使用的:

var numbers = [1,2,3,4,5,6,7,8,9,10]
let divisibleBy3 = numbers.removeIf { $0 % 3 == 0 }
// divisibleBy3: [3,6,9]
// numbers:      [1,2,4,5,7,8,10]

这个函数必须在 Array 的扩展中实现,因为删除特定索引处的元素的概念不适用于常规的 Collections(Array 定义为public struct Array : RandomAccessCollection, MutableCollection,它直接定义了remove(at:)函数,而不是get它来自继承或协议).

This function had to be implemented in an extension of Array, because the concept of removing an element at a particular index doesn't apply to regular Collections (an Array is defined as public struct Array<Element> : RandomAccessCollection, MutableCollection, and it directly defines the remove(at:) function, rather than getting it from inheritance or a protocol).

我是代码重用的忠实粉丝,在提出选项 3 后,我意识到我可能可以重用选项 2 中的 separate 函数.我想出了这个:

I'm a big fan of code reuse, and after coming up with Option 3, I realized I could probably reuse the separate function from Option 2. I came up with this:

extension Array {
    mutating func removeIf(predicate: (Element) -> Bool) -> [Element] {
        let groups = self.separate(predicate: predicate)
        self = groups.notMatching
        return groups.matching
    }
}

它的用法就像在选项 3 中一样.

And it's used just like in Option 3.

我关心性能,所以我通过 XCTest 的 measure 运行每个 Option,迭代 1000 次.结果如下:

I was concerned with performance, so I ran each Option through XCTest's measure with 1000 iterations. These were the results:

Option 1: 9 ms
Option 2: 7 ms
Option 3: 10 ms
Option 4: 8 ms

<小时>

选项 5:基于 negaipro 的回答

我知道partition,但我不打算考虑它,因为它不保留排序顺序.negaipro 的答案本质上是 partition,但它让我思考.partition 的想法是交换与枢轴点匹配的元素,从而确保结束枢轴点一侧的所有内容都与谓词匹配,而另一侧则不匹配.我接受了这个想法并将动作更改为移动到最后".所以匹配被从它们的位置移除并添加到末尾.


Option 5: Based on negaipro's answer

I knew about partition, but I wasn't going to consider it because it didn't preserve the sort order. negaipro's answer is essentially partition, but it got me thinking. The idea with partition is to swap elements that match with the pivot point, thus ensuring that everything on one side of the end pivot point will all match the predicate and the other side doesn't. I took that idea and changed the action to "move to the end". So matches are removed from their spot and added to the end.

extension Array {
    mutating func swapIfModified(predicate: (Element) -> Bool) -> Int {
        var matchCount = 0
        var index = 0
        while index < (count-matchCount) {
            if predicate(self[index]) {
                append(remove(at: index))
                matchCount += 1
            } else {
                index += 1
            }
        }

        return count-matchCount
    }
}

在我使用包含 10 个数字的数组的初始测试中,它与其他选项相当.但我关心的是 append(remove(at: index)) 行的性能.所以我再次尝试了所有选项,数组从 1 到 1000,这个选项绝对是最慢的.

Under my initial tests using an array with 10 numbers, it was comparable to the other Options. But I was concerned with the performance of the line append(remove(at: index)). So I tried all the Options again with arrays going from 1 to 1000, and this option was definitely the slowest.

这些选项之间没有太大的性能差异.并且由于选项 4 比选项 3 更快并且重用了选项 2 中的代码,我倾向于丢弃选项 3.所以我倾向于在我不关心时使用普通的旧 filter关于未过滤的结果(同样,当我不关心过滤的结果时,因为它只是使用相反的谓词),然后使用 separateremoveIf 当我关心保留过滤和未过滤的结果时.

There isn't a big performance difference between these options. And since Option 4 was faster than Option 3 and reuses the code from Option 2, I'm inclined to throw out Option 3. So I'm leaning towards using plain old filter when I don't care about the unfiltered results (and, likewise, for when I don't care about the filtered results, since it's just using the opposite predicate), and then using either separate or removeIf when I care about keeping both the filtered and unfiltered results.

那么,我是否错过了 Swift 内置的一些功能?有没有更好的方法来实现这一点?我的扩展语法是否遗漏了任何东西(例如,任何可以使它将此概念应用于更多领域的东西)?

So, am I missing something built into Swift that does this already? Is there a better way to accomplish this? Is my extension syntax missing anything (anything that could make it apply this concept to more areas, for example)?

推荐答案

从技术上讲,这不能保证保持秩序,但确实如此.

Technically, this is not guaranteed to preserve order, but it does.

Dictionary(grouping: numbers) { $0.isMultiple(of: 3) }

https://github.com/apple/swift/blob/master/stdlib/public/core/NativeDictionary.swift

这篇关于在 Swift 中,一个高效的函数,可以根据谓词将一个数组分成 2 个数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆