Clojure 的并行剂量 [英] Parallel doseq for Clojure

查看:14
本文介绍了Clojure 的并行剂量的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我根本没有在 Clojure 中使用过多线程,所以不确定从哪里开始.

I haven't used multithreading in Clojure at all so am unsure where to start.

我有一个 doseq,它的主体可以并行运行.我想要的是总是有 3 个线程在运行(留下 1 个内核空闲)来并行评估主体,直到范围用完为止.没有共享状态,没有什么复杂的 - 相当于 Python 的多处理就可以了.

I have a doseq whose body can run in parallel. What I'd like is for there always to be 3 threads running (leaving 1 core free) that evaluate the body in parallel until the range is exhausted. There's no shared state, nothing complicated - the equivalent of Python's multiprocessing would be just fine.

比如:

(dopar 3 [i (range 100)]
  ; repeated 100 times in 3 parallel threads...
  ...)

我应该从哪里开始寻找?有这个命令吗?标准包?一个很好的参考?

Where should I start looking? Is there a command for this? A standard package? A good reference?

到目前为止,我已经找到了 pmap,并且可以使用它(如何一次限制为 3 个?看起来它一次使用 32 个 - 不,消息来源说 2 + 处理器数量),但似乎这是一个基本原语应该已经存在于某处.

So far I have found pmap, and could use that (how do I restrict to 3 at a time? looks like it uses 32 at a time - no, source says 2 + number of processors), but it seems like this is a basic primitive that should already exist somewhere.

澄清:我真的很想控制线程数.我有长时间运行并使用大量内存的进程,因此创建大量进程并希望一切顺利并不是一个好方法(使用大量可用内存的示例).

clarification: I really would like to control the number of threads. I have processes that are long-running and use a fair amount of memory, so creating a large number and hoping things work out OK isn't a good approach (example which uses a significant chunk available mem).

更新:开始编写一个执行此操作的宏,我需要一个信号量(或互斥锁,或一个我可以等待的原子).Clojure 中是否存在信号量?或者我应该使用 ThreadPoolExecutor 吗?不得不从 Java 中提取这么多东西似乎很奇怪 - 我认为 Clojure 中的并行编程应该很容易......也许我完全错误地思考了这个问题?嗯.代理?

update: Starting to write a macro that does this, and I need a semaphore (or a mutex, or an atom i can wait on). Do semaphores exist in Clojure? Or should I use a ThreadPoolExecutor? It seems odd to have to pull so much in from Java - I thought parallel programming in Clojure was supposed to be easy... Maybe I am thinking about this completely the wrong way? Hmmm. Agents?

推荐答案

pmap 实际上在大多数情况下都可以正常工作 - 它为您的机器使用具有合理数量线程的线程池.我不会费心去尝试创建自己的机制来控制线程数,除非您有真实的基准测试证据表明默认值会导致问题.

pmap will actually work fine in most circumstances - it uses a thread pool with a sensible number of threads for your machine. I wouldn't bother trying to create your own mechanisms to control the number of threads unless you have real benchmark evidence that the defaults are causing a problem.

话虽如此,如果你真的想限制最多三个线程,一个简单的方法是只对范围的 3 个子集使用 pmap:

Having said that, if you really want to limit to a maximum of three threads, an easy approach is to just use pmap on 3 subsets of the range:

(defn split-equally [num coll] 
  "Split a collection into a vector of (as close as possible) equally sized parts"
  (loop [num num 
         parts []
         coll coll
         c (count coll)]
    (if (<= num 0)
      parts
      (let [t (quot (+ c num -1) num)]
        (recur (dec num) (conj parts (take t coll)) (drop t coll) (- c t)))))) 

(defmacro dopar [thread-count [sym coll] & body]
 `(doall (pmap 
    (fn [vals#]
      (doseq [~sym vals#]
        ~@body))  
    (split-equally ~thread-count ~coll))))

注意 doall 的使用,这是强制对 pmap 求值所必需的(这是惰性的).

Note the use of doall, which is needed to force evaluation of the pmap (which is lazy).

这篇关于Clojure 的并行剂量的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆