Why does inserting 1000 000 values in a transient map in Clojure yield a map with 8 items in it?
If I try to do 1000 000 assoc! calls on a transient vector, I'll get a vector of 1000 000 elements:
(count
  (let [m (transient [])]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 1000000
On the other hand, if I do the same with a map, it will only have 8 items in it:
(count
  (let [m (transient {})]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 8
Is there a reason why this is happening?
The transient datatypes' operations don't guarantee that they will return the same reference as the one passed in. Sometimes the implementation might decide to return a new (but still transient) map after an assoc! rather than using the one you passed in.
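To make the point about references concrete, here is a minimal sketch that compares the transient passed to assoc! with the one it returns. The name reference-check is only illustrative, and the result noted in the comment reflects how the JVM implementation of Clojure appears to behave; it is not something the documentation promises.

```clojure
;; Compare the reference passed in with the reference returned.
;; The behavior shown is an implementation detail of JVM Clojure,
;; not a documented guarantee.
(def reference-check
  (let [t0 (transient {})
        ;; thread the return value through the first 8 additions
        t8 (reduce (fn [t i] (assoc! t i i)) t0 (range 8))
        ;; the 9th new key is where the map outgrows its array
        t9 (assoc! t8 8 8)]
    {:first-eight (identical? t0 t8)
     :ninth       (identical? t8 t9)}))

reference-check
;; => {:first-eight true, :ninth false}
```

Code that ignores the return value happens to work while the same reference keeps coming back, and silently loses updates as soon as it does not.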
The ClojureDocs page on assoc! has a nice example that explains this behavior:
;; The key concept to understand here is that transients are
;; not meant to be `bashed in place`; always use the value
;; returned by either assoc! or other functions that operate
;; on transients.
(defn merge2
  "An example implementation of `merge` using transients."
  [x y]
  (persistent! (reduce
                (fn [res [k v]] (assoc! res k v))
                (transient x)
                y)))
;; Why always use the return value, and not the original? Because the return
;; value might be a different object than the original. The implementation
;; of Clojure transients in some cases changes the internal representation
;; of a transient collection (e.g. when it reaches a certain size). In such
;; cases, if you continue to try modifying the original object, the results
;; will be incorrect.
;; Think of transients like persistent collections in how you write code to
;; update them, except unlike persistent collections, the original collection
;; you passed in should be treated as having an undefined value. Only the return
;; value is predictable.
I'd like to repeat that last part because it's very important: the original collection you passed in should be treated as having an undefined value. Only the return value is predictable.
Here's a modified version of your code that works as expected:
(count
  (let [m (transient {})]
    (persistent!
      (reduce (fn [acc i] (assoc! acc i i))
              m (range 1000000)))))
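If you prefer an explicit loop over reduce, the same rule can be followed with loop/recur. This is only a sketch of an alternative; count-after-assoc is a hypothetical helper name, not something from the question or from Clojure itself.

```clojure
(defn count-after-assoc
  "Adds the keys 0..n-1 to a transient map, always continuing with the
  value returned by assoc!, and returns the size of the persistent result."
  [n]
  (loop [m (transient {})
         i 0]
    (if (< i n)
      ;; rebind m to whatever assoc! returned, never reuse the old m
      (recur (assoc! m i i) (inc i))
      (count (persistent! m)))))

(count-after-assoc 1000000)
;; => 1000000
```

Both versions do the same thing: each step receives the transient returned by the previous step, just as it would with a persistent map.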
As a side note, the reason you always get 8 is that Clojure likes to use a clojure.lang.PersistentArrayMap (a map backed by an array) for maps with 8 or fewer elements. Once you get past 8, it switches to clojure.lang.PersistentHashMap.
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a})
clojure.lang.PersistentArrayMap
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a})
clojure.lang.PersistentHashMap
Once you get past 8 entries, your transient map switches the backing data structure from an array of pairs (PersistentArrayMap) to a hashtable (PersistentHashMap), at which point assoc! returns a new reference instead of just updating the old one.
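The switch can be observed directly by looking at the transient before and after the ninth entry is added. The sketch below assumes the JVM implementation of Clojure; the class names it reports are internal details that could change between releases, and overflow-check is just an illustrative name.

```clojure
;; Inspect a transient map just before and just after it outgrows
;; the array-backed representation (JVM Clojure internals assumed).
(def overflow-check
  (let [small (reduce (fn [t i] (assoc! t i i)) (transient {}) (range 8))
        large (assoc! small 8 8)]
    {:classes [(.getName (class small)) (.getName (class large))]
     ;; the transient that was passed in still reports 8 entries
     :counts  [(count small) (count large)]}))

overflow-check
;; => {:classes ["clojure.lang.PersistentArrayMap$TransientArrayMap"
;;               "clojure.lang.PersistentHashMap$TransientHashMap"]
;;     :counts [8 9]}
```

This is why the code in the question stops at 8: every assoc! past that point builds a new hash-based transient that is thrown away, while m itself never grows.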