Clojure 的并行剂量 [英] Parallel doseq for Clojure
问题描述
我根本没有在 Clojure 中使用过多线程,所以不确定从哪里开始.
I haven't used multithreading in Clojure at all so am unsure where to start.
我有一个 doseq
,它的主体可以并行运行.我想要的是总是有 3 个线程在运行(留下 1 个内核空闲)来并行评估主体,直到范围用完为止.没有共享状态,没有什么复杂的 - 相当于 Python 的多处理就可以了.
I have a doseq
whose body can run in parallel. What I'd like is for there always to be 3 threads running (leaving 1 core free) that evaluate the body in parallel until the range is exhausted. There's no shared state, nothing complicated - the equivalent of Python's multiprocessing would be just fine.
比如:
(dopar 3 [i (range 100)]
; repeated 100 times in 3 parallel threads...
...)
我应该从哪里开始寻找?有这个命令吗?标准包?一个很好的参考?
Where should I start looking? Is there a command for this? A standard package? A good reference?
到目前为止,我已经找到了 pmap
,并且可以使用它(如何一次限制为 3 个?看起来它一次使用 32 个 - 不,消息来源说 2 + 处理器数量),但似乎这是一个基本原语应该已经存在于某处.
So far I have found pmap
, and could use that (how do I restrict to 3 at a time? looks like it uses 32 at a time - no, source says 2 + number of processors), but it seems like this is a basic primitive that should already exist somewhere.
澄清:我真的很想控制线程数.我有长时间运行并使用大量内存的进程,因此创建大量进程并希望一切顺利并不是一个好方法(使用大量可用内存的示例).
clarification: I really would like to control the number of threads. I have processes that are long-running and use a fair amount of memory, so creating a large number and hoping things work out OK isn't a good approach (example which uses a significant chunk available mem).
更新:开始编写一个执行此操作的宏,我需要一个信号量(或互斥锁,或一个我可以等待的原子).Clojure 中是否存在信号量?或者我应该使用 ThreadPoolExecutor 吗?不得不从 Java 中提取这么多东西似乎很奇怪 - 我认为 Clojure 中的并行编程应该很容易......也许我完全错误地思考了这个问题?嗯.代理?
update: Starting to write a macro that does this, and I need a semaphore (or a mutex, or an atom i can wait on). Do semaphores exist in Clojure? Or should I use a ThreadPoolExecutor? It seems odd to have to pull so much in from Java - I thought parallel programming in Clojure was supposed to be easy... Maybe I am thinking about this completely the wrong way? Hmmm. Agents?
推荐答案
pmap
实际上在大多数情况下都可以正常工作 - 它为您的机器使用具有合理数量线程的线程池.我不会费心去尝试创建自己的机制来控制线程数,除非您有真实的基准测试证据表明默认值会导致问题.
pmap
will actually work fine in most circumstances - it uses a thread pool with a sensible number of threads for your machine. I wouldn't bother trying to create your own mechanisms to control the number of threads unless you have real benchmark evidence that the defaults are causing a problem.
话虽如此,如果你真的想限制最多三个线程,一个简单的方法是只对范围的 3 个子集使用 pmap:
Having said that, if you really want to limit to a maximum of three threads, an easy approach is to just use pmap on 3 subsets of the range:
(defn split-equally [num coll]
"Split a collection into a vector of (as close as possible) equally sized parts"
(loop [num num
parts []
coll coll
c (count coll)]
(if (<= num 0)
parts
(let [t (quot (+ c num -1) num)]
(recur (dec num) (conj parts (take t coll)) (drop t coll) (- c t))))))
(defmacro dopar [thread-count [sym coll] & body]
`(doall (pmap
(fn [vals#]
(doseq [~sym vals#]
~@body))
(split-equally ~thread-count ~coll))))
注意 doall
的使用,这是强制对 pmap
求值所必需的(这是惰性的).
Note the use of doall
, which is needed to force evaluation of the pmap
(which is lazy).
这篇关于Clojure 的并行剂量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!