Swift enormous dictionary of arrays, very slow


Question

I am working on a project in Swift, using a dictionary.

This dictionary is of the type [String : [Posting]]. I have around 200k different "terms" (keys) to insert in it, and for each term I have around 500 to 1000 objects to append in a list. I know it's a weird practice but I don't have the choice and I must deal with all those elements.

The issue is that this gets very slow as the dictionary grows. I tried switching to an NSMutableDictionary, with no luck.

My addTerm function is called every time I need to insert an element:

    func addTerm(_ term: String, withId id: Int, atPosition position: Int) {

        if self.map[term] == nil {
            self.map[term] = [Posting]()
        }

        if self.map[term]!.last?.documentId == id {
            self.map[term]!.last?.addPosition(position)
        }
        else {
            self.map[term]!.append(Posting(withId: id, atPosition: position, forTerm: term))
        }
    }

EDIT: I realized that it's not the dictionary that causes all this lag, but the arrays it contains. The arrays re-allocate far too often when new elements are added, and the best I could do was to replace them with ContiguousArray.

Recommended answer

This is a fairly common performance trap, as also observed in:

  • Dictionary in Swift with Mutable Array as value is performing very slow? How to optimize or construct properly?
  • Swift semantics regarding dictionary access

The issue stems from the fact that the array you're mutating in the expression self.map[term]!.append(...) is a temporary mutable copy of the underlying array in the dictionary's storage. This means that the array is never uniquely referenced and so always has its buffer re-allocated.
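To make the copy concrete, the statement behaves roughly as if it were written out in three separate steps. This is only a conceptual sketch of the get/mutate/set pattern, not what the compiler literally emits; the `index` variable and the `Int` element type are placeholders chosen for illustration.

```swift
var index: [String: [Int]] = ["swift": [1, 2, 3]]

// Step 1: read the array out of the dictionary. `postings` and the
// dictionary's stored value now share a single buffer.
var postings = index["swift"]!

// Step 2: mutate the local copy. Because the buffer is shared, the
// append must first copy every existing element (copy-on-write).
postings.append(4)

// Step 3: write the result back into the dictionary.
index["swift"] = postings

print(index["swift"]!)  // [1, 2, 3, 4]
```

With 500 to 1000 appends per key, paying for a full copy on each append is what makes the overall cost grow so quickly.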

This situation will be fixed in Swift 5 with the unofficial introduction of generalised accessors, but until then, one solution (as mentioned in both of the above Q&As) is to use Dictionary's subscript(_:default:), which as of Swift 4.1 can mutate the value directly in storage.
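For the simple single-mutation case, a minimal sketch of the default subscript looks like the following. The `docIds` dictionary and its contents are made up for illustration.

```swift
var docIds: [String: [Int]] = [:]

// No separate nil check is needed: a missing key starts from the
// default value, and the append is applied through the subscript.
docIds["swift", default: []].append(1)
docIds["swift", default: []].append(2)
docIds["array", default: []].append(3)

print(docIds["swift"]!)  // [1, 2]
print(docIds["array"]!)  // [3]
```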

Your case isn't quite a straightforward case of applying a single mutation, though, so you need some kind of wrapper function that gives you scoped access to your mutable array.

For example, this could look like:

class X {

  private var map: [String: [Posting]] = [:]

  private func withPostings<R>(
    forTerm term: String, mutations: (inout [Posting]) throws -> R
  ) rethrows -> R {
    return try mutations(&map[term, default: []])
  }

  func addTerm(_ term: String, withId id: Int, atPosition position: Int) {

    withPostings(forTerm: term) { postings in
      if let posting = postings.last, posting.documentId == id {
        posting.addPosition(position)
      } else {
        postings.append(Posting(withId: id, atPosition: position, forTerm: term))
      }
    }

  }
  // ...
}
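To try the wrapper end to end, here is a self-contained sketch. The `Posting` class below is a hypothetical stand-in (the question does not show its definition), `InvertedIndex` is an illustrative name for the class called `X` above, and the `postings(forTerm:)` accessor is added only so the result can be inspected. The sketch assumes `Posting` is a class, since `addPosition` is called on a `let` binding.

```swift
// Hypothetical stand-in for the question's Posting type. It is a class
// (reference type) so that the last posting can be mutated in place.
final class Posting {
    let documentId: Int
    let term: String
    private(set) var positions: [Int]

    init(withId id: Int, atPosition position: Int, forTerm term: String) {
        self.documentId = id
        self.term = term
        self.positions = [position]
    }

    func addPosition(_ position: Int) {
        positions.append(position)
    }
}

final class InvertedIndex {
    private var map: [String: [Posting]] = [:]

    // Gives the closure scoped inout access to the array for `term`,
    // going through the default subscript rather than a separate get and set.
    private func withPostings<R>(
        forTerm term: String, mutations: (inout [Posting]) throws -> R
    ) rethrows -> R {
        return try mutations(&map[term, default: []])
    }

    func addTerm(_ term: String, withId id: Int, atPosition position: Int) {
        withPostings(forTerm: term) { postings in
            if let posting = postings.last, posting.documentId == id {
                posting.addPosition(position)
            } else {
                postings.append(Posting(withId: id, atPosition: position, forTerm: term))
            }
        }
    }

    // Read-only accessor, added for the usage example only.
    func postings(forTerm term: String) -> [Posting] {
        return map[term] ?? []
    }
}

let index = InvertedIndex()
index.addTerm("swift", withId: 1, atPosition: 0)
index.addTerm("swift", withId: 1, atPosition: 7)  // same document: extends the last posting
index.addTerm("swift", withId: 2, atPosition: 3)  // new document: appends a posting

let result = index.postings(forTerm: "swift")
print(result.count)         // 2
print(result[0].positions)  // [0, 7]
```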
