缓慢的 Swift 字符串性能 [英] Slow Swift String Performance

查看:13
本文介绍了缓慢的 Swift 字符串性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试解决回文分区问题.您可以在 https://leetcode.com/problems/palindrome-partitioning/ 中找到问题.

I am trying to solve the Palindrome Partitioning Question. You can find the question in https://leetcode.com/problems/palindrome-partitioning/.

我想出了解决方案:

func partition(_ s: String) -> [[String]] {

    var result: [[String]] = []

    func dfs(string: String, partiton: [String]) {

        if string.characters.count == 0 {
            result.append(partiton)
            return
        }

        for length in 1...string.characters.count {

            let endIndex =  string.index(string.startIndex, offsetBy: length-1)
            let part = string[string.startIndex...endIndex]


            if isPalindrome(part) {
                let leftPart = string[string.index(after: endIndex)..<string.endIndex]
                print("string: (string) part: (part) leftpart: (leftPart)")
                dfs(string: leftPart, partiton: partiton + [part])
            }
        }
    }

    func isPalindrome(_ s: String) -> Bool {
        if String(s.characters.reversed()) == s {
            return true
        } else {
            return false
        }
    }

    dfs(string: s, partiton: [])
    return result
}

但是性能很差.超出时间限制.

But the performance is Bad. Time Limit Exceeded.

但同样的想法与 Python 实现可以通过:

But the same idea with Python implementation can pass:

def partition(self, s):
    res = []
    self.dfs(s, [], res)
    return res

def dfs(self, s, path, res):
    if not s:
        res.append(path)
        return
    for i in range(1, len(s)+1):
        if self.isPal(s[:i]):
            self.dfs(s[i:], path+[s[:i]], res)

def isPal(self, s):
    return s == s[::-1]

这让我想知道如何改进 swift 实现以及为什么 swift 实现比 python 慢.

It make me wonder that how to improve the swift implementation and why the swift implementation is slower than python.

推荐答案

A Swift StringCharacterCharacter 的集合> 表示单个扩展字素簇,可以是一个或多个Unicode 标量.这使得一些诸如跳过前 N 个字符"之类的索引操作变慢.

A Swift String is a collection of Characters, and a Character represents a single extended grapheme cluster, that can be one or more Unicode scalars. That makes some index operations like "skip the first N characters" slow.

但第一个改进是短路"isPalindrome()功能.而不是完全构建反向字符串,比较反序的字符序列并尽快停止发现差异:

But the first improvement is to "short-circuit" the isPalindrome() function. Instead of building the reversed string completely, compare the character sequence with its reversed sequence and stop as soon as a difference is found:

func isPalindrome(_ s: String) -> Bool {
    return !zip(s.characters, s.characters.reversed()).contains { $0 != $1 }
}

s.characters.reversed() 不会反向创建新集合order,它只是从后到前枚举字符.但是,使用 String(s.characters.reversed()) 就像在您的方法中一样,您强制为反向字符串创建一个新集合,这让它变慢了.

s.characters.reversed() does not create a new collection in reverse order, it just enumerates the characters from back to front. With String(s.characters.reversed()) as in your method however, you force the creation of a new collection for the reversed string, that makes it slow.

对于 110 个字符的字符串

For the 110-character string

let string = String(repeating: "Hello world", count: 10)

在我的测试中,这将计算时间从大约 6 秒减少到 1.2 秒.

this reduces the computation time from about 6 sec to 1.2 sec in my test.

接下来,避免像

let endIndex = string.index(string.startIndex, offsetBy: length-1)

并迭代字符索引本身:

func partition(_ s: String) -> [[String]] {

    var result: [[String]] = []

    func dfs(string: String, partiton: [String]) {
        if string.isEmpty {
            result.append(partiton)
            return
        }

        var idx = string.startIndex
        repeat {
            string.characters.formIndex(after: &idx)
            let part = string.substring(to: idx)
            if isPalindrome(part) {
                let leftPart = string.substring(from: idx)
                dfs(string: leftPart, partiton: partiton + [part])
            }
        } while idx != string.endIndex
    }

    func isPalindrome(_ s: String) -> Bool {
        return !zip(s.characters, s.characters.reversed()).contains { $0 != $1 }
    }

    dfs(string: s, partiton: [])
    return result
}

计算时间现在是 0.7 秒.

Computation time is now 0.7 sec.

下一步是完全避免字符串索引,并使用字符数组,因为数组索引很快.更妙的是,使用数组 slices 可以快速创建和引用原始数据数组元素:

The next step is to avoid string indexing totally, and work with an array of characters, because array indexing is fast. Even better, use array slices which are fast to create and reference the original array elements:

func partition(_ s: String) -> [[String]] {

    var result: [[String]] = []

    func dfs(chars: ArraySlice<Character>, partiton: [String]) {

        if chars.isEmpty {
            result.append(partiton)
            return
        }

        for length in 1...chars.count {
            let part = chars.prefix(length)
            if isPalindrome(part) {
                let leftPart = chars.dropFirst(length)
                dfs(chars: leftPart, partiton: partiton + [String(part)])
            }
        }
    }

    func isPalindrome(_ c: ArraySlice<Character>) -> Bool {
        return !zip(c, c.reversed()).contains { $0 != $1 }
    }

    dfs(chars: ArraySlice(s.characters), partiton: [])
    return result
}

计算时间现在是 0.08 秒.

Computation time is now 0.08 sec.

如果您的字符串仅包含基本多语言平面"中的字符(即 <= U+FFFF),那么您可以使用 UTF-16 代码点:

If your string contains only characters in the "basic multilingual plane" (i.e. <= U+FFFF) then you can work with UTF-16 code points instead:

func partition(_ s: String) -> [[String]] {

    var result: [[String]] = []

    func dfs(chars: ArraySlice<UInt16>, partiton: [String]) {

        if chars.isEmpty {
            result.append(partiton)
            return
        }

        for length in 1...chars.count {
            let part = chars.prefix(length)
            if isPalindrome(part) {
                let leftPart = chars.dropFirst(length)
                part.withUnsafeBufferPointer {
                    dfs(chars: leftPart, partiton: partiton + [String(utf16CodeUnits: $0.baseAddress!, count: length)])
                }
            }
        }
    }

    func isPalindrome(_ c: ArraySlice<UInt16>) -> Bool {
        return !zip(c, c.reversed()).contains { $0 != $1 }
    }

    dfs(chars: ArraySlice(s.utf16), partiton: [])
    return result
}

现在 110 个字符的测试字符串的计算时间为 0.04 秒.

Computation time is now 0.04 sec for the 110 character test string.

因此,在使用 Swift 字符串时可能会提高性能的一些技巧是

So some tips which potentially can improve the performance when working with Swift strings are

  • 按顺序迭代字符/索引.避免跳跃"到第 n 个位置.
  • 如果您需要随机"访问所有字符,请转换字符串首先到一个数组.
  • 使用字符串的 UTF-16 视图可能比工作更快带有字符视图.

当然,这取决于实际用例.在这个应用中,我们能够将计算时间从 6 秒减少到 0.04 秒,这是 150 的因数.

Of course it depends on the actual use-case. In this application, we were able to reduce the computation time from 6 sec to 0.04 sec, that is a factor of 150.

这篇关于缓慢的 Swift 字符串性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆