Clojure Parallel Mapping and Infinite Sequences


Question

Let's say I define the sequence of all natural numbers in the following way:

(def naturals (iterate inc 0))

I also define a function mapping the naturals to nil that takes a while to compute like so:

(defn hard-comp [_] (Thread/sleep 500))

Note the time taken to evaluate the following s-expressions, as measured by clojure.core/time.

(dorun (map hard-comp (range 30))) ; 15010.367496 msecs

(dorun (pmap hard-comp (range 30))) ; 537.044554 msecs

(dorun (map hard-comp (doall (take 30 naturals)))) ; 15009.488499 msecs

(dorun (pmap hard-comp (doall (take 30 naturals)))) ; 3004.499013 msecs

(doall (take 30 naturals)) ; 0.385724 msecs

(range 30) ; 0.159374 msecs

pmap is ~6 times faster when called with an explicit range than with a section of the naturals.

Since (= (range 30) (take 30 naturals)) returns true, both objects are of type clojure.lang.LazySeq, and Clojure evaluates all the arguments to a function before calling the function, how can the above timing details be explained?

Answer

My guess is that it's due to this:

user> (chunked-seq? (seq (range 30)))
true
user> (chunked-seq? (seq (take 30 naturals)))
false
user> (class (next (range 30)))
clojure.lang.ChunkedCons
user> (class (next (take 30 naturals)))
clojure.lang.Cons

Try this:

user> (defn hard-comp [x] (println x) (Thread/sleep 500))
#'user/hard-comp
user> (time (dorun (pmap hard-comp (range 100))))

Note that the printed output jumps 32 items at a time. That's how many elements are grabbed per chunk for a range. Chunked seqs evaluate a batch of items ahead of time to improve performance. In this case it looks like pmap spawns 32 threads, one whole chunk's worth, as soon as you try to grab even one element from the range.
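The chunking effect can also be seen without any threads or sleeping. The following is a minimal sketch, not part of the original answer: realized-count is a name made up for this illustration, and it simply counts how many times the mapped function runs when only the first result of map is requested.

```clojure
(def naturals (iterate inc 0))

(defn realized-count
  "Illustrative helper (not in clojure.core): returns how many elements of
  coll the mapped function is applied to when only the first result of
  `map` is requested."
  [coll]
  (let [calls (atom 0)]
    (first (map (fn [x] (swap! calls inc) x) coll))
    @calls))

;; Chunked source: the whole first chunk is processed at once.
(realized-count (range 100))          ; 32

;; Unchunked source: only the requested element is processed.
(realized-count (take 100 naturals))  ; 1
```

Since pmap builds its futures with map, the same one-chunk-at-a-time behavior would apply to the futures it creates.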

You could always stuff your naturals into a vector to get chunking behavior.

user> (time (dorun (pmap hard-comp (range 100))))
"Elapsed time: 2004.680192 msecs"
user> (time (dorun (pmap hard-comp (vec (take 100 naturals)))))
"Elapsed time: 2005.887754 msecs"

(Note that the time is approximately 4 x 500 ms, 4 being how many chunks of 32 it takes to get to 100.)
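As a rough back-of-the-envelope check of that arithmetic, here is a small sketch. It assumes a chunk size of 32, assumes every element in a chunk runs fully in parallel, and ignores scheduling overhead; chunk-rounds and estimated-ms are hypothetical names used only here.

```clojure
(defn chunk-rounds
  "Number of chunks of 32 needed to cover n elements (assumed chunk size)."
  [n]
  (count (partition-all 32 (range n))))

(defn estimated-ms
  "Rough estimate: one 500 ms sleep per chunk, each chunk run in parallel."
  [n]
  (* 500 (chunk-rounds n)))

(chunk-rounds 100)  ; 4
(estimated-ms 100)  ; 2000, close to the measured ~2005 ms
(estimated-ms 30)   ; 500, close to the measured ~537 ms for (range 30)
```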

On the other hand, you might not want chunking behavior. 32 threads at a time is a lot. See this question for examples of how to un-chunkify a seq.
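One common way to do that is to wrap the sequence in a lazy seq that hands out one element at a time. The sketch below is an assumption about what such a helper might look like, not necessarily the approach from the linked question; unchunk is not part of clojure.core.

```clojure
(defn unchunk
  "Hypothetical helper (not part of clojure.core): wraps coll in a lazy seq
  that yields one element at a time, hiding any chunking of the source."
  [coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (first s) (unchunk (rest s))))))

(chunked-seq? (seq (range 30)))            ; true
(chunked-seq? (seq (unchunk (range 30))))  ; false
(= (range 30) (unchunk (range 30)))        ; true, same elements
```

With this in place, (pmap hard-comp (unchunk (range 100))) should behave like the naturals case, with pmap limiting how far ahead it runs instead of starting 32 futures per chunk.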
