mapcat breaking the laziness


Problem description

I have a function, a-function, that produces lazy sequences.

If I run the code:

(map a-function a-sequence-of-values) 



it returns a lazy sequence, as expected.

But when I run the code:

(mapcat a-function a-sequence-of-values) 

it breaks the laziness of my function. In fact, it turns that code into

(apply concat (map a-function a-sequence-of-values)) 

So it needs to realize all the values from the map before concatenating them.
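A minimal sketch of how this can be observed, assuming a side-effecting stand-in for a-function. The names `calls` and `counting-fn` are hypothetical and only serve the illustration. The input is an infinite, unchunked sequence, so the expression would never return if mapcat truly realized everything; the counter shows how far ahead it actually goes.

```clojure
;; Hypothetical helpers for illustration only; not part of the question.
(def calls (atom 0))

(defn counting-fn
  "Returns a sequence of n copies of n and records each invocation."
  [n]
  (swap! calls inc)
  (repeat n n))

;; Ask for just the first element of the concatenated result.
(def first-value (first (mapcat counting-fn (iterate inc 1))))

(println "first value:" first-value)
;; The exact count depends on the Clojure version, but it is more than 1.
(println "calls to counting-fn:" @calls)
```

The call returns, so not all of the map is realized, but counting-fn runs more often than the single call that the first element strictly needs.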

What I need is a function that concatenates the results of a map function on demand, without realizing the whole map beforehand.

I can hack a function for this:

(defn my-mapcat
  [f coll]
  (lazy-seq
   (if (not-empty coll)
     (concat
      (f (first coll))
      (my-mapcat f (rest coll))))))
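As a rough check of the on-demand behaviour, the sketch below repeats the definition above so that it is self-contained and counts how often the mapped function is called. The names `lazy-calls` and `counting-fn` are hypothetical, and the input is assumed to be an unchunked sequence.

```clojure
;; Same definition as above, repeated so this block runs on its own.
(defn my-mapcat
  [f coll]
  (lazy-seq
   (if (not-empty coll)
     (concat
      (f (first coll))
      (my-mapcat f (rest coll))))))

;; Hypothetical helpers for illustration only.
(def lazy-calls (atom 0))

(defn counting-fn
  "Returns a sequence of n copies of n and records each invocation."
  [n]
  (swap! lazy-calls inc)
  (repeat n n))

;; Only the first element is requested, so only the first call is needed.
(def first-value (first (my-mapcat counting-fn (iterate inc 1))))

(println "first value:" first-value)
(println "calls to counting-fn:" @lazy-calls)
```

Here the recursive call is wrapped in its own lazy-seq, so the function is not applied to the second input until the first sub-sequence has been consumed.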

But I can't believe that Clojure doesn't already have something for this. Do you know if Clojure has such a feature? Am I one of only a few people who have this problem?

I also found a blog post that deals with the same issue: http://clojurian.blogspot.com.br/2012/11/beware-of-mapcat.html

Recommended answer

Lazy-sequence production and consumption is different from lazy evaluation.

Clojure functions do strict/eager evaluation of their arguments. Evaluating an argument that is, or that yields, a lazy sequence does not in and of itself force realization of that lazy sequence. However, any side effects caused by evaluating the argument will occur.

The ordinary use case for mapcat is to concatenate sequences yielded without side effects. Therefore, it hardly matters that some of the arguments are eagerly evaluated, because no side effects are expected.

Your function my-mapcat imposes additional laziness on the evaluation of its arguments by wrapping them in thunks (other lazy-seqs). This can be useful when significant side effects (IO, significant memory consumption, state updates) are expected. However, if your function both performs side effects and produces a sequence to be concatenated, warning bells should probably be going off in your head: your code probably needs refactoring.

Here is something similar from algo.monads:

(defn- flatten*
  "Like #(apply concat %), but fully lazy: it evaluates each sublist
   only when it is needed."
  [ss]
  (lazy-seq
    (when-let [s (seq ss)]
      (concat (first s) (flatten* (rest s))))))

Another way to write my-mapcat:

(defn my-mapcat [f coll] (for [x coll, fx (f x)] fx))




Applying a function to a lazy sequence forces realization of the portion of that lazy sequence needed to satisfy the function's arguments. If the function itself produces lazy sequences as a result, those are not realized as a matter of course.

Consider this function, which counts the realized portion of a sequence:

(defn count-realized [s] 
  (loop [s s, n 0] 
    (if (instance? clojure.lang.IPending s)
      (if (and (realized? s) (seq s))
        (recur (rest s) (inc n))
        n)
      (if (seq s)
        (recur (rest s) (inc n))
        n))))

Now let's see what's being realized:

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      concat-seq (apply concat seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "concat-seq: " (count-realized concat-seq))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))          

 ;=> seq-of-seqs:  4
 ;   concat-seq:  0
 ;   seqs-in-seq:  [0 0 0 0 0 0]

So, 4 elements of seq-of-seqs were realized, but none of its component sequences were realized, nor was anything realized in the concatenated sequence.

Why 4? Because the applicable arity-overloaded version of concat takes 4 arguments, [x y & zs] (count the &).
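To check which arities concat offers in the Clojure version at hand, its var metadata can be inspected. This is only a small sketch; the name `concat-arglists` is hypothetical.

```clojure
;; Parameter lists of clojure.core/concat, read from the var's metadata.
(def concat-arglists (:arglists (meta #'concat)))

;; Prints every arity, including the variadic one that apply ends up using
;; when it is handed a long (or lazy) sequence of arguments.
(println concat-arglists)
```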



Compare to

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [& more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))

;=> seq-of-seqs:  2
;   seqs-in-seq:  [0 0 0 0 0 0]

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [a b c & more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))

;=> seq-of-seqs:  5
;   seqs-in-seq:  [0 0 0 0 0 0]






Clojure has two solutions for making the evaluation of arguments lazy.

One is macros. Unlike functions, macros do not evaluate their arguments.

Here's a function with a side effect:

(defn f [n] (println "foo!") (repeat n n))

The side effects occur even though the sequence is not realized:

user=> (def x (concat (f 1) (f 2)))
foo!
foo!
#'user/x
user=> (count-realized x)
0

Clojure has a lazy-cat macro to prevent this:

user=> (def y (lazy-cat (f 1) (f 2)))
#'user/y
user=> (count-realized y)
0
user=> (dorun y)
foo!
foo!
nil
user=> (count-realized y)
3
user=> y
(1 2 2)

Unfortunately, you cannot apply a macro.
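A small sketch of what happens when one tries anyway. The form is passed to eval so that the failure can be caught at run time; the name `apply-macro-result` and the `:compile-error` marker are hypothetical, and the exact error message varies between Clojure versions.

```clojure
;; A macro has no value that could be passed to apply, so compiling
;; this form fails. eval lets us catch the failure instead of aborting.
(def apply-macro-result
  (try
    (eval '(apply lazy-cat [[1] [2 2]]))
    (catch Exception e
      :compile-error)))

(println apply-macro-result)
```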

The other solution for delaying evaluation is to wrap the arguments in thunks, which is exactly what you've done.
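As an illustration of the thunk approach, here is a minimal sketch that concatenates the results of zero-argument functions, calling each one only when the consumer reaches it. The names `concat-thunks`, `noisy` and `thunk-calls` are hypothetical and not part of clojure.core.

```clojure
;; Hypothetical helpers for illustration only.
(def thunk-calls (atom 0))

(defn noisy
  "Side-effecting function: records the call and returns n copies of n."
  [n]
  (swap! thunk-calls inc)
  (repeat n n))

(defn concat-thunks
  "Lazily concatenates the sequences returned by a collection of
   zero-argument functions, calling each one only when it is reached."
  [thunks]
  (lazy-seq
    (when-let [s (seq thunks)]
      (concat ((first s)) (concat-thunks (rest s))))))

(def z (concat-thunks [#(noisy 1) #(noisy 2)]))

(def calls-before @thunk-calls)      ; nothing has been called yet
(def first-z (first z))              ; forces only the first thunk
(def calls-after-first @thunk-calls)
(def all-z (doall z))                ; forces the remaining thunk
```

Unlike the lazy-cat macro, concat-thunks is an ordinary function, so it can be passed around and applied, at the cost of wrapping each argument in #(...) by hand.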
