二叉搜索clojure(实现/性能) [英] Binary search in clojure (implementation / performance)

查看:114
本文介绍了二叉搜索clojure(实现/性能)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写了一个二进制搜索函数作为一个更大的程序的一部分,但它似乎比它应该是慢,剖析显示了很多调用clojure.lang.Numbers中的方法。



我的理解是,Clojure可以使用原语,当它可以确定它可以这样做。对clojure.lang.Numbers中的方法的调用似乎表明它没有在这里使用原语。



如果我将循环变量强制为int,它适当地抱怨recur参数不是原始的。如果我强迫这些,代码工作,但再次,它是缓慢的。我只想知道(quot; low-idx high-idx)2)不是产生一个原始,但我不知道从这里去哪里。 p>

这是我第一个在Clojure中的程序,所以如果有更清洁/功能/ Clojure的方式做任何事情,请随时让我知道。

 (defn binary-search 
[coll coll-size target]
(let [cnt(dec coll-size)]
(let [mid-idx(quot; low-idx high-idx))
nil
high-idx)2)mid-val(coll mid-idx)]
(cond
(= mid-val target)mid-idx
(inc mid-idx)high-idx)
(> mid-val target)(recur low-idx(dec mid-idx))
))))))

(defn binary-search-perf-test
[test-size]
(do
(let [test-set(vec(range 1(inc test-size))) -set-size(count test-set)]
(time(count(map#(binary-search2 test-set test-set-size%)test-set)))
)))


解决方案

首先, java.util.Collections

 (java.util.Collections / binarySearch [0 1 2 3] 2 compare)
; => 2

如果跳过 compare



对于你的纯Clojure实现,你可以提示 coll-size 与参数向量中的 ^ long - 或者可能只需要在函数体的开始处请求向量的大小一个非常快,恒定时间的操作),用(bit-shift-right ... 1)替换(quot ... 2) 并使用未检查的数学计算索引。有了一些额外的调整,二进制搜索可以写成如下:

 (defn binary-search 
在xs(一个向量)中使用二进制搜索的x的值。
([xs x]
(loop [l 0 h(unchecked-dec(count xs))]
; = h(inc l))
(cond
(== x(xs l))l
(== x(xs h))h
:else nil)
(let [m(unchecked-add l(bit-shift-right(unchecked-subtract hl)1))]
(if(<(xs m)x)
(unchecked-inc m)h)
(recur lm)))))))

这仍然明显慢于Java变体:

 (defn java-binsearch [xs x] 
.util.Collections / binarySearch xs x compare))

binary- / code>如上定义似乎比这个 java-binsearch 需要大约多25%的时间。


I wrote a binary search function as part of a larger program, but it seems to be slower than it should be and profiling shows a lot of calls to methods in clojure.lang.Numbers.

My understanding is that Clojure can use primitives when it can determine that it can do so. The calls to the methods in clojure.lang.Numbers seems to indicate that it's not using primitives here.

If I coerce the loop variables to ints, it properly complains that the recur arguments are not primitive. If i coerce those too, the code works again but again it's slow. My only guess is that (quot (+ low-idx high-idx) 2) is not producing a primitive but I'm not sure where to go from here.

This is my first program in Clojure so feel free to let me know if there are more cleaner/functional/Clojure ways to do something.

(defn binary-search
  [coll coll-size target]
  (let [cnt (dec coll-size)]
    (loop [low-idx 0 high-idx cnt]
      (if (> low-idx high-idx)
        nil
        (let [mid-idx (quot (+ low-idx high-idx) 2) mid-val (coll mid-idx)]
          (cond
            (= mid-val target) mid-idx
            (< mid-val target) (recur (inc mid-idx) high-idx)
            (> mid-val target) (recur low-idx (dec mid-idx))
            ))))))

(defn binary-search-perf-test
  [test-size]
  (do
    (let [test-set (vec (range 1 (inc test-size))) test-set-size (count test-set)]
      (time (count (map #(binary-search2 test-set test-set-size %) test-set)))
    )))

解决方案

First of all, you can use the binary search implementation provided by java.util.Collections:

(java.util.Collections/binarySearch [0 1 2 3] 2 compare)
; => 2

If you skip the compare, the search will be faster still, unless the collection includes bigints, in which case it'll break.

As for your pure Clojure implementation, you can hint coll-size with ^long in the parameter vector -- or maybe just ask for the vector's size at the beginning of the function's body (that's a very fast, constant time operation), replace the (quot ... 2) call with (bit-shift-right ... 1) and use unchecked math for the index calculations. With some additional tweaks a binary search could be written as follows:

(defn binary-search
  "Finds earliest occurrence of x in xs (a vector) using binary search."
  ([xs x]
     (loop [l 0 h (unchecked-dec (count xs))]
       (if (<= h (inc l))
         (cond
           (== x (xs l)) l
           (== x (xs h)) h
           :else nil)
         (let [m (unchecked-add l (bit-shift-right (unchecked-subtract h l) 1))]
           (if (< (xs m) x)
             (recur (unchecked-inc m) h)
             (recur l m)))))))

This is still noticeably slower than the Java variant:

(defn java-binsearch [xs x]
  (java.util.Collections/binarySearch xs x compare))

binary-search as defined above seems to take about 25% more time than this java-binsearch.

这篇关于二叉搜索clojure(实现/性能)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆