为什么Swift迭代器比数组构建慢? [英] Why are Swift iterators slower than array building?

查看:62
本文介绍了为什么Swift迭代器比数组构建慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这与此问题有关,在此假设使用只要不需要存储结果,遍历嵌套数组的生成器(迭代器)对于迭代元素是最佳的,而如果只想展平数组,则使用重复数组串联是最好的.

This is kind of related to this question, where it was assumed that using generators (iterators) to traverse a nested array would be optimal for iterating through the elements, as long as you didn’t need to store the result, while using repeated array concatenation was best if you just wanted to flatten the array.

但是,我决定进行一些测试,并以惰性形式和存储形式实现此功能(将包含Int[Int]的数组[Any]展平),事实证明存储形式为即使只是用于遍历元素,速度也更快!这意味着以某种方式遍历生成器要比在内存中构造一个新数组并在 that 中遍历的时间更长.

However, I decided to do some testing, and implementing this function (that flattens an array [Any] containing either Ints or [Int]s) in both lazy and stored form, it turns out the stored form is faster, even when just used for iterating through the elements! That means that somehow, iterating through the generator takes more time than both constructing a new array in memory, and then iterating through that.

令人难以置信的是,它甚至比同一程序的 python 实现要慢大约5–70%,输入越少,效果就会越差. Swift是使用-O标志构建的.

Incredibly, it is even about 5–70% slower than a python implementation of the same program, which worsens with smaller input. Swift was built with the -O flag.

这是三个测试用例:1.小投入,混合; 2.大输入,[Int]占主导地位,3.大输入,Int占主导地位:

Here were the three test cases 1. small input, mixed; 2. large input, [Int] dominant, 3. large input, Int dominant:

let array1: [Any] = [Array(1...100), Array(101...105), 106, 
                     Array(107...111), 112, 113, 114, Array(115...125)]
let array2: [Any] = Array(repeating: Array(1...5), count: 2000)
let array3: [Any] = Array(repeating: 31, count: 10000)

Python

A1 = [list(range(1, 101)), list(range(101, 106)), 106, 
      list(range(107, 112)), 112, 113, 114, list(range(115, 126))]
A2 = list(range(1, 6)) * 2000
A3 = [31] * 10000

生成器和数组生成器:

func chain(_ segments: [Any]) -> AnyIterator<Int>{
    var i = 0
    var j = 0
    return AnyIterator<Int> {
        while i < segments.count {
            switch segments[i] {
                case let e as Int:
                    i += 1
                    return e
                case let E as [Int]:
                    if j < E.count {
                        let val = E[j]
                        j += 1
                        return val
                    }
                    j = 0
                    i += 1

                default:
                    return nil
            }
        }
        return nil
    }
}


func flatten_array(_ segments: [Any]) -> [Int] {
    var result = [Int]()
    for segment in segments {
        switch segment {
            case let segment as Int:
                result.append(segment)
            case let segment as [Int]:
                result.append(contentsOf: segment)
            default:
                break
        }
    }
    return result
}

Python

def chain(L):
    for i in L:
        if type(i) is int:
            yield i
        elif type(i) is list:
            yield from i


def flatten_list(L):
    result = []
    for i in L:
        if type(i) is int:
            result.append(i)
        elif type(i) is list:
            result.extend(i)
    return result

还有基准测试结果(第一个测试用例循环100000次,其他测试用例循环1000次):

And the benchmark results (100000 loops on the first test case, 1000 on the others):

test case 1 (small mixed input)
    Filling an array                         : 0.068221092224121094 s
    Filling an array, and looping through it : 0.074559926986694336 s
    Looping through a generator              : 1.5902719497680664   s *
    Materializing the generator to an array  : 1.759943962097168    s *

test case 2 (large input, [Int] s)
    Filling an array                         : 0.20634698867797852  s
    Filling an array, and looping through it : 0.21031379699707031  s
    Looping through a generator              : 1.3505551815032959   s *
    Materializing the generator to an array  : 1.4733860492706299   s *

test case 3 (large input, Int s)
    Filling an array                         : 0.27392101287841797  s
    Filling an array, and looping through it : 0.27670192718505859  s
    Looping through a generator              : 0.85304021835327148  s
    Materializing the generator to an array  : 1.0027849674224854   s *

Python

test case 1 (small mixed input)
    Filling an array                         : 0.1622014045715332   s
    Filling an array, and looping through it : 0.4312894344329834   s
    Looping through a generator              : 0.6839139461517334   s
    Materializing the generator to an array  : 0.5300459861755371   s

test case 2 (large input, [int] s)
    Filling an array                         : 1.029205083847046    s
    Filling an array, and looping through it : 1.2195289134979248   s
    Looping through a generator              : 1.0876803398132324   s
    Materializing the generator to an array  : 0.8958714008331299   s

test case 3 (large input, int s)
    Filling an array                         : 1.0181667804718018   s
    Filling an array, and looping through it : 1.244570255279541    s
    Looping through a generator              : 1.1220412254333496   s
    Materializing the generator to an array  : 0.9486079216003418   s

很显然,Swift非常擅长构建数组.但是为什么它的生成器这么慢,甚至在某些情况下甚至比Python还要慢呢? (在表中标记为*.)使用超大输入(> 100,000,000个元素,几乎使Swift崩溃)进行测试表明,即使在极限时,生成器也比数组填充慢了至少两倍.最好是3.25.

Clearly, Swift is very, very good at building arrays. But why are its generators so slow, even slower than Python’s in some cases? (Marked with an * in the table.) Testing with extremely large input ( > 100,000,000 elements, which nearly crashes Swift) suggests that even at the limit, the generator settles out to be slower than array filling by at least a factor of 3.25 in the best case.

如果这确实是该语言的固有特性,那么它会带来一些有趣的含义.例如,常识(无论如何对我来说都是python程序员)都会认为,如果我们试图合成一个不可变的对象(如字符串),我们应该首先将源提供给生成函数以展开它,然后再处理将输出转换为对单个浅序列起作用的joined()方法.相反,看起来最有效的策略是通过数组进行序列化.将源展开到一个中间数组,然后构造该数组的输出.

If this is really intrinsic to the language, it has some funny implications. For example, common sense (to me as a python programmer anyway) would have it that if we were trying to synthesize an immutable object (like a string), we should first feed the source to a generating function to unfold it, and then hand off the output to a joined() method which works on a single shallow sequence. Instead, it looks like the most efficient strategy is serialization via array; unfolding the source to an intermediate array, and then constructing the output from the array.

是否正在构建整个新数组,然后以比原始数组上的延迟迭代更快的速度遍历整个数组?为什么?

Is building an entire new array and then iterating through it that faster that a lazy iteration on the original array? Why?

(可能相关的javascript问题)

这是测试代码:

func time(test_array: [Any], cycles: Int = 1000000) -> (array_iterate: Double, 
                                                        array_store  : Double, 
                                                        generate_iterate: Double, 
                                                        generate_store: Double) {
    func start() -> Double { return Date().timeIntervalSince1970 }
    func lap(_ t0: Double) -> Double {
        return Date().timeIntervalSince1970 - t0
    }
    var t0 = start()

    for _ in 0..<cycles {
        for e in flatten_array(test_array) { e + 1 }
    }
    let ΔE1 = lap(t0)

    t0 = start()
    for _ in 0..<cycles {
        let array: [Int] = flatten_array(test_array)
    }
    let ΔE2 = lap(t0)

    t0 = start()
    for _ in 0..<cycles {
        let G = chain(test_array)
        while let g = G.next() { g + 1 }
    }
    let ΔG1 = lap(t0)

    t0 = start()
    for _ in 0..<cycles {
        let array: [Int] = Array(chain(test_array))
    }
    let ΔG2 = lap(t0)

    return (ΔE1, ΔE2, ΔG1, ΔG2)
}

print(time(test_array: array1, cycles: 100000))
print(time(test_array: array2, cycles: 1000))
print(time(test_array: array3, cycles: 1000))

Python

def time_f(test_array, cycles = 1000000):
    lap = lambda t0: time() - t0
    t0 = time()

    for _ in range(cycles):
        for e in flatten_list(test_array):
            e + 1

    ΔE1 = lap(t0)

    t0 = time()
    for _ in range(cycles):
        array = flatten_list(test_array)

    ΔE2 = lap(t0)

    t0 = time()
    for _ in range(cycles):
        for g in chain(test_array):
            g + 1

    ΔG1 = lap(t0)

    t0 = time()
    for _ in range(cycles):
        array = list(chain(test_array))

    ΔG2 = lap(t0)

    return ΔE1, ΔE2, ΔG1, ΔG2

print(time_f(A1, cycles=100000))
print(time_f(A3, cycles=1000))
print(time_f(A2, cycles=1000))

推荐答案

您问为什么它的[Swift]生成器这么慢,在某些情况下甚至比Python还要慢?"

You asked "why are its [Swift] generators so slow, even slower than Python’s in some cases?"

我的回答是,我认为它们并没有您的结果所显示的那么慢.特别是,我将尝试证明遍历迭代器比为所有测试用例构造数组要快.

My answer is that I do not think that they are nearly as slow as your results may indicate. In particular, I will try to demonstrate that looping through an iterator should be faster than constructing an array for all your test cases.

在较早的工作中(请参阅 http://lemire.me/blog/2016/09/22/swift-versus-java-the-bitset-performance-test/),我发现Swift迭代器大约只有一半在处理位集类时,速度与Java中的等效速度一样快.那不是很好,但是Java在这方面非常有效.同时,Go变得更糟.我向您提出,Swift迭代器可能并不是理想的高效方式,但它们可能仅是原始C代码的两倍.而且性能差距可能与Swift中的函数内联不足有关.

In earlier work (see a related blog post at http://lemire.me/blog/2016/09/22/swift-versus-java-the-bitset-performance-test/), I found that Swift iterators were about half as fast as the equivalent in Java when working over a bitset class. That's not great but Java is very efficient in this respect. Meanwhile, Go does worse. I submit to you that Swift iterators are probably not ideally efficient, but they are probably within a factor of two of what is possible with raw C code. And the performance gap probably has to do with insufficient function inlining in Swift.

我看到您正在使用AnyIterator.我建议从IteratorProtocol派生struct,这具有确保不必进行任何动态调度的好处.这是一种相对有效的可能性:

I see that you are using an AnyIterator. I suggest deriving a struct from IteratorProtocol instead which has the benefit of insuring that there does not have to be any dynamic dispatch. Here is a relatively efficient possibility:

public struct FastFlattenIterator: IteratorProtocol {
   let segments: [Any]
    var i = 0 // top-level index
    var j = 0 // second-level index
    var jmax = 0 // essentially, this is currentarray.count, but we buffer it
    var currentarray : [Int]! // quick reference to an int array to be flatten

   init(_ segments: [Any]) {
       self.segments = segments
   }

   public mutating func next() -> Int? {
     if j > 0 { // we handle the case where we iterate within an array separately
       let val = currentarray[j]
       j += 1
       if j == jmax {
         j = 0
         i += 1
       }
       return val
     }
     while i < segments.count {
        switch segments[i] {
          case let e as Int: // found an integer value
            i += 1
            return e
          case let E as [Int]: // first encounter with an array
            jmax = E.count
            currentarray = E
            if jmax > 0 {
              j = 1
              return E[0]
            }
            i += 1
          default:
            return nil
        }
     }
     return nil
   }
}

在这堂课上,我得到以下数字.对于每个测试用例,前四种方法均来自代码示例,而后两种(快速迭代器)是使用新结构构建的.请注意,通过快速迭代器循环"始终是最快的.

With this class, I am getting the following numbers. For each test case, the first four approaches are taken from your code sample whereas the last two (fast iterator) are build using the new struct. Notice that "Looping through a fast iterator" is always fastest.

test case 1 (small mixed input)
Filling an array                         : 0.0073099999999999997 ms
Filling an array, and looping through it : 0.0069870000000000002 ms
Looping through a generator              : 0.18385799999999999   ms 
Materializing the generator to an array  : 0.18745700000000001   ms 
Looping through a fast iterator          : 0.005372              ms 
Materializing the fast iterator          : 0.015883999999999999  ms

test case 2 (large input, [Int] s)
Filling an array                         : 2.125931            ms
Filling an array, and looping through it : 2.1169820000000001  ms
Looping through a generator              : 15.064767           ms 
Materializing the generator to an array  : 15.45152            ms 
Looping through a fast iterator          : 1.572919            ms
Materializing the fast iterator          : 1.964912            ms 

test case 3 (large input, Int s)
Filling an array                         : 2.9140269999999999  ms
Filling an array, and looping through it : 2.9064290000000002  ms
Looping through a generator              : 9.8297640000000008  ms
Materializing the generator to an array  : 9.8297640000000008  ms 
Looping through a fast iterator          : 1.978038            ms 
Materializing the fast iterator          : 2.2565339999999998  ms 

您将在GitHub上找到我完整的代码示例: https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/iterators

You will find my complete code sample on GitHub: https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/iterators

这篇关于为什么Swift迭代器比数组构建慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆