Why does Clojure hang after having performed my calculations?
Question
I'm experimenting with filtering elements in parallel. For each element, I need to perform a distance calculation to see whether it is close enough to a target point. Never mind that data structures already exist for doing this; I'm just doing initial experiments for now.
Anyway, I wanted to run some very basic experiments where I generate random vectors and filter them. Here's my implementation that does all of this:
(defn pfilter [pred coll]
  (map second
       (filter first
               (pmap (fn [item] [(pred item) item]) coll))))

(defn random-n-vector [n]
  (take n (repeatedly rand)))

(defn distance [u v]
  (Math/sqrt (reduce + (map #(Math/pow (- %1 %2) 2) u v))))

(defn -main [& args]
  (let [[n-str vectors-str threshold-str] args
        n (Integer/parseInt n-str)
        vectors (Integer/parseInt vectors-str)
        threshold (Double/parseDouble threshold-str)
        random-vector (partial random-n-vector n)
        u (random-vector)]
    (time (println n vectors
                   (count
                    (pfilter
                     (fn [v] (< (distance u v) threshold))
                     (take vectors (repeatedly random-vector))))))))
The code executes and prints what I expect: the parameter n (the length of the vectors), vectors (the number of vectors), and the number of vectors that are closer than the threshold to the target vector. What I don't understand is why the program hangs for an additional minute before terminating.
Here is the output of a run that demonstrates the problem:
$ time lein run 10 100000 1.0
[null] 10 100000 12283
[null] "Elapsed time: 3300.856 msecs"
real 1m6.336s
user 0m7.204s
sys 0m1.495s
Any comments on how to filter in parallel in general are also more than welcome, as I haven't yet confirmed that pfilter actually works.
Recommended answer
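You need to call shutdown-agents to kill the threads backing the thread pool used by pmap. pmap runs its work on futures, which share a thread pool with agents. Those threads are not daemon threads, so the JVM stays alive until they expire, which for this cached pool happens after roughly 60 seconds of idleness. That matches the extra minute you are seeing.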
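A minimal sketch of where the call could go, assuming a simplified -main with a stand-in workload (even? over a range) instead of the distance filter from the question. The pfilter definition is copied from the question; everything else is illustrative.

```clojure
(defn pfilter [pred coll]
  (map second
       (filter first
               (pmap (fn [item] [(pred item) item]) coll))))

(defn -main [& args]
  (try
    ;; count forces the lazy pmap result, so all parallel work is
    ;; finished before the thread pool is shut down below.
    (println (count (pfilter even? (range 1000))))
    (finally
      ;; Stop the thread pools backing agents, futures and pmap so the
      ;; JVM can exit immediately instead of waiting for idle threads.
      (shutdown-agents))))
```

Because pmap is lazy, shutdown-agents has to come after the result has been fully consumed; futures submitted after the shutdown are rejected.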
As for pfilter, it should work, but it will likely run slower than filter because your predicate is cheap. Parallelization isn't free, so you have to give each thread a moderately intensive task to offset the multithreading overhead. Batch your items before filtering them.
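One way the batching advice could be sketched: split the collection into chunks, filter each chunk eagerly inside pmap, and concatenate the results. The name pfilter-chunked and the chunk-size parameter are hypothetical, not part of the original code, and a good chunk size has to be found by measurement.

```clojure
(defn pfilter-chunked
  "Hypothetical batched variant of pfilter: each parallel task filters
  a whole chunk of chunk-size items instead of a single item."
  [pred chunk-size coll]
  (->> coll
       (partition-all chunk-size)
       ;; filterv is eager, so the filtering work actually happens
       ;; inside the parallel task rather than later on the caller.
       (pmap (fn [chunk] (filterv pred chunk)))
       ;; pmap preserves chunk order, so the output order matches filter.
       (apply concat)))
```

A call such as (pfilter-chunked even? 1000 (range 100000)) would then hand each task 1000 items at a time. Whether this beats a plain filter still depends on how expensive the predicate is, so it is worth timing both.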