Why does inserting 1,000,000 values in a transient map in Clojure yield a map with 8 items in it?
If I try to do 1,000,000 assoc! calls on a transient vector, I get a vector of 1,000,000 elements:
(count
  (let [m (transient [])]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 1000000
On the other hand, if I do the same with a map, it only has 8 items in it:
(count
  (let [m (transient {})]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 8
Is there a reason why this is happening?
Operations on transient data types don't guarantee that they return the same reference as the one passed in. Sometimes the implementation may decide to return a new (but still transient) map after an assoc! rather than updating the one you passed in.
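To check this yourself, a minimal sketch like the one below compares references with identical?. The names same-ref-results, t0, t8 and t9 are made up for illustration. The point at which a new object appears is an implementation detail of current JVM Clojure, not something the language promises.

```clojure
;; Thread the return value through eight assoc! calls, then do a ninth,
;; and record whether each step handed back the same object.
(def same-ref-results
  (let [t0 (transient {})
        t8 (reduce (fn [t i] (assoc! t i i)) t0 (range 8))
        t9 (assoc! t8 8 8)]
    {:first-eight-same? (identical? t0 t8)
     :ninth-same?       (identical? t8 t9)
     :final-count       (count (persistent! t9))}))

(println same-ref-results)
```

On current JVM Clojure the ninth assoc! is expected to report a different object, which is exactly the reference that a dotimes loop throws away.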
The ClojureDocs page on assoc! has a nice example that explains this behavior:
;; The key concept to understand here is that transients are
;; not meant to be `bashed in place`; always use the value
;; returned by either assoc! or other functions that operate
;; on transients.
(defn merge2
  "An example implementation of `merge` using transients."
  [x y]
  (persistent!
    (reduce
      (fn [res [k v]] (assoc! res k v))
      (transient x)
      y)))
;; Why always use the return value, and not the original? Because the return
;; value might be a different object than the original. The implementation
;; of Clojure transients in some cases changes the internal representation
;; of a transient collection (e.g. when it reaches a certain size). In such
;; cases, if you continue to try modifying the original object, the results
;; will be incorrect.
;; Think of transients like persistent collections in how you write code to
;; update them, except unlike persistent collections, the original collection
;; you passed in should be treated as having an undefined value. Only the return
;; value is predictable.
I'd like to repeat that last part because it's very important: the original collection you passed in should be treated as having an undefined value. Only the return value is predictable.
Here's a modified version of your code that works as expected:
(count
  (let [m (transient {})]
    (persistent!
      (reduce (fn [acc i] (assoc! acc i i))
              m
              (range 1000000)))))
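If you prefer to keep an explicit counter, as in the original dotimes loop, the same idea can be sketched with loop/recur. The function name fill-transient-map is hypothetical. The important part is that each recur rebinds m to whatever assoc! returned.

```clojure
(defn fill-transient-map
  "Hypothetical helper: builds a map {0 0, 1 1, ...} with n entries,
  always carrying forward the value returned by assoc!."
  [n]
  (loop [m (transient {})
         i 0]
    (if (< i n)
      (recur (assoc! m i i) (inc i)) ; rebind m to the returned transient
      (persistent! m))))

(println (count (fill-transient-map 1000000)))
```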
As a side note, the reason you always get 8 is that Clojure likes to use a clojure.lang.PersistentArrayMap (a map backed by an array) for maps with 8 or fewer elements. Once you get past 8, it switches to clojure.lang.PersistentHashMap.
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a})
clojure.lang.PersistentArrayMap
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a})
clojure.lang.PersistentHashMap
Once you get past 8 entries, your transient map switches the backing data structure from an array of pairs (PersistentArrayMap) to a hashtable (PersistentHashMap), at which point assoc! returns a new reference instead of just updating the old one.
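The same threshold can be observed on maps built through a transient, as long as the return value is threaded correctly. This is a rough sketch: map-of-size and backing-types are hypothetical names, and the 8-entry cutoff is an implementation detail that could change between Clojure versions.

```clojure
(defn map-of-size
  "Hypothetical helper: builds a map with n integer entries via a transient."
  [n]
  (persistent!
    (reduce (fn [t i] (assoc! t i i))
            (transient {})
            (range n))))

;; Pair each size with the concrete class of the resulting persistent map.
(def backing-types
  (mapv (fn [n] [n (type (map-of-size n))]) [8 9]))

(println backing-types)
```

On current JVM Clojure the 8-entry map should come back as a PersistentArrayMap and the 9-entry map as a PersistentHashMap, matching the REPL session above.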