Why does inserting 1000 000 values into a transient map in Clojure yield a map with 8 items in it?

Question

If I call assoc! 1000 000 times on a transient vector, I get a vector of 1000 000 elements:

(count
  (let [m (transient [])]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 1000000

On the other hand, if I do the same with a map, it ends up with only 8 items in it:

(count
  (let [m (transient {})]
    (dotimes [i 1000000]
      (assoc! m i i))
    (persistent! m)))
; => 8

Is there a reason why this is happening?

Solution

The transient datatypes' operations don't guarantee that they will return the same reference as the one passed in. Sometimes the implementation might decide to return a new (but still transient) map after an assoc! rather than using the one you passed in.

The ClojureDocs page on assoc! has a nice example that explains this behavior:

;; The key concept to understand here is that transients are 
;; not meant to be `bashed in place`; always use the value 
;; returned by either assoc! or other functions that operate
;; on transients.

(defn merge2
  "An example implementation of `merge` using transients."
  [x y]
  (persistent! (reduce
                (fn [res [k v]] (assoc! res k v))
                (transient x)
                y)))

;; Why always use the return value, and not the original?  Because the return
;; value might be a different object than the original.  The implementation
;; of Clojure transients in some cases changes the internal representation
;; of a transient collection (e.g. when it reaches a certain size).  In such
;; cases, if you continue to try modifying the original object, the results
;; will be incorrect.

;; Think of transients like persistent collections in how you write code to
;; update them, except unlike persistent collections, the original collection
;; you passed in should be treated as having an undefined value.  Only the return
;; value is predictable.

I'd like to repeat that last part because it's very important: the original collection you passed in should be treated as having an undefined value. Only the return value is predictable.

Here's a modified version of your code that works as expected:

(count
  (let [m (transient {})]
    (persistent!
      (reduce (fn [acc i] (assoc! acc i i))
              m (range 1000000)))))
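
The same fix can also be written without reduce. The sketch below is only an illustration: fill-transient-map is a hypothetical helper name, not something from the question or from ClojureDocs. It threads the value returned by each assoc! into the next iteration with loop/recur, and then shows the equivalent built-in route through into, which uses transients internally when the target collection supports them.

```clojure
;; Hypothetical helper for illustration: builds {0 0, 1 1, ..., n-1 n-1}.
(defn fill-transient-map
  [n]
  (loop [t (transient {})
         i 0]
    (if (< i n)
      ;; Always carry forward the value returned by assoc!,
      ;; never the transient that was passed in.
      (recur (assoc! t i i) (inc i))
      (persistent! t))))

(count (fill-transient-map 1000000))
; => 1000000

;; Letting `into` manage the transient instead of doing it by hand.
(count (into {} (map (fn [i] [i i])) (range 1000000)))
; => 1000000
```

In everyday code the into version is usually the simpler choice, since the transient never escapes into your own code and there is no return value to forget.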


As a side note, the reason you always get 8 is that Clojure uses a clojure.lang.PersistentArrayMap (a map backed by an array) for maps with 8 or fewer entries. Once you get past 8, it switches to clojure.lang.PersistentHashMap.

user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a})
clojure.lang.PersistentArrayMap
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a})
clojure.lang.PersistentHashMap

Once you get past 8 entries, your transient map switches the backing data structure from an array of pairs (PersistentArrayMap) to a hashtable (PersistentHashMap), at which point assoc! returns a new reference instead of just updating the old one.
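
One way to observe this switch directly is to compare each assoc! result with its argument using identical?. This is a minimal sketch, and first-reference-change is a hypothetical name introduced here. It relies on an implementation detail of current Clojure versions (the array-map threshold of 8 entries described above), so the exact index could differ on other implementations.

```clojure
;; Hypothetical helper for illustration: returns the 0-based index of the
;; first assoc! whose return value is a different object from its argument,
;; or nil if that never happens within n inserts.
(defn first-reference-change
  [n]
  (loop [t (transient {})
         i 0]
    (when (< i n)
      (let [t2 (assoc! t i i)]
        (if (identical? t t2)
          (recur t2 (inc i))
          i)))))

(first-reference-change 100)
; => 8, i.e. the ninth insert, on versions where the question's code yields 8
```

The first eight inserts update the original transient in place, which is why the code in the question still sees those 8 entries; every later assoc! goes to a transient that the question's code has already thrown away.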
