mapcat breaking the laziness

Question

I have a function, called a-function, that produces lazy sequences.

If I run the code:

(map a-function a-sequence-of-values) 

it returns a lazy sequence, as expected.

But when I run the code:

(mapcat a-function a-sequence-of-values) 

it breaks the laziness of my function. In fact, it turns that code into

(apply concat (map a-function a-sequence-of-values)) 

So it needs to realize all the values from the map before concatenating them.

What I need is a function that concatenates the results of a map function on demand, without realizing the whole map beforehand.

I can hack together a function for this:

(defn my-mapcat
  [f coll]
  (lazy-seq
   (if (not-empty coll)
     (concat
      (f (first coll))
      (my-mapcat f (rest coll))))))

But I can't believe that Clojure doesn't already have something for this. Do you know whether Clojure has such a feature? Am I one of only a few people with this problem?

I also found a blog post that deals with the same issue: http://clojurian.blogspot.com.br/2012/11/beware-of-mapcat.html

Recommended answer

Producing and consuming lazy sequences is not the same thing as lazy evaluation.

Clojure functions evaluate their arguments strictly (eagerly). Evaluating an argument that is, or that yields, a lazy sequence does not in and of itself force realization of that lazy sequence. However, any side effects caused by evaluating the argument will occur.

The ordinary use case for mapcat is to concatenate sequences that are yielded without side effects. It therefore hardly matters that some of the arguments are evaluated eagerly, because no side effects are expected.

Your function my-mapcat imposes additional laziness on the evaluation of its arguments by wrapping them in thunks (other lazy-seqs). This can be useful when significant side effects (IO, significant memory consumption, state updates) are expected. However, if your function both performs side effects and produces a sequence to be concatenated, warning bells should probably be going off in your head: your code likely needs refactoring.

Here is something similar from algo.monads:

(defn- flatten*
  "Like #(apply concat %), but fully lazy: it evaluates each sublist
   only when it is needed."
  [ss]
  (lazy-seq
    (when-let [s (seq ss)]
      (concat (first s) (flatten* (rest s))))))

Another way to write my-mapcat:

(defn my-mapcat [f coll] (for [x coll, fx (f x)] fx))
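To check how much work each variant does, here is a minimal sketch that counts calls to the mapped function. The names `calls` and `counted` are hypothetical helpers introduced only for this illustration; `my-mapcat` is the `for`-based version above.

(def calls (atom 0))

;; Hypothetical helper: records every invocation, then returns a small seq.
(defn counted [n]
  (swap! calls inc)
  (repeat 2 n))

;; The for-based variant from above.
(defn my-mapcat [f coll]
  (for [x coll, fx (f x)] fx))

;; Ask only for the first element of each result and record the call count.
(reset! calls 0)
(def via-mapcat (first (mapcat counted (range 100))))
(def mapcat-calls @calls)

(reset! calls 0)
(def via-my-mapcat (first (my-mapcat counted (range 100))))
(def my-mapcat-calls @calls)

(println "mapcat called the function" mapcat-calls "times")
(println "my-mapcat called the function" my-mapcat-calls "times")

With my-mapcat the function should be called exactly once to produce the first element. With mapcat the count is larger; the exact number depends on the Clojure version and on whether the input is chunked, since (range 100) is a chunked sequence and map realizes a whole chunk at a time.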


Applying a function to a lazy sequence will force realization of the portion of that lazy sequence necessary to satisfy the arguments of the function. If that function itself produces lazy sequences as a result, those are not realized as a matter of course.

Consider this function, which counts the realized portion of a sequence:

(defn count-realized [s] 
  (loop [s s, n 0] 
    (if (instance? clojure.lang.IPending s)
      (if (and (realized? s) (seq s))
        (recur (rest s) (inc n))
        n)
      (if (seq s)
        (recur (rest s) (inc n))
        n))))

Now let's see what is being realized:

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      concat-seq (apply concat seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "concat-seq: " (count-realized concat-seq))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))          

 ;=> seq-of-seqs:  4
 ;   concat-seq:  0
 ;   seqs-in-seq:  [0 0 0 0 0 0]

So, 4 elements of seq-of-seqs got realized, but none of its component sequences were realized, nor was there any realization in the concatenated sequence.

Why 4? Because the applicable arity-overloaded version of concat takes 4 arguments, [x y & xs] (count the &).

Compare:

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [& more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))

;=> seq-of-seqs:  2
;   seqs-in-seq:  [0 0 0 0 0 0]

(let [seq-of-seqs (map range (list 1 2 3 4 5 6))
      foo-seq (apply (fn foo [a b c & more] more) seq-of-seqs)]
  (println "seq-of-seqs: " (count-realized seq-of-seqs))
  (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))

;=> seq-of-seqs:  5
;   seqs-in-seq:  [0 0 0 0 0 0]


Clojure has two solutions for making the evaluation of arguments lazy.

One is macros. Unlike functions, macros do not evaluate their arguments.

Here's a function with a side effect:

(defn f [n] (println "foo!") (repeat n n))

The side effects are produced even though the sequence is not realized:

user=> (def x (concat (f 1) (f 2)))
foo!
foo!
#'user/x
user=> (count-realized x)
0

Clojure has a lazy-cat macro to prevent this:

user=> (def y (lazy-cat (f 1) (f 2)))
#'user/y
user=> (count-realized y)
0
user=> (dorun y)
foo!
foo!
nil
user=> (count-realized y)
3
user=> y
(1 2 2)
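One way to see why lazy-cat defers the side effects is to expand the macro. This is only an inspection sketch: the form is quoted, so f is never called here, and the name `expansion` is introduced just for this illustration.

;; Expand the macro call without evaluating it.
(def expansion (macroexpand-1 '(lazy-cat (f 1) (f 2))))

(println expansion)
;; prints (clojure.core/concat (clojure.core/lazy-seq (f 1)) (clojure.core/lazy-seq (f 2)))

Each argument form ends up inside its own lazy-seq, so concat receives unrealized lazy sequences, and (f 1) and (f 2) run only when the result is consumed.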

Unfortunately, you cannot apply a macro.

The other solution for delaying evaluation is to wrap in thunks, which is exactly what you've done.
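As a rough sketch of the thunk approach in a form that still works with apply, each call can be wrapped in lazy-seq before it reaches concat. The names `lazy-mapcat`, `calls` and `counted` are hypothetical and exist only for this illustration; the atom stands in for a side effect such as println.

(def calls (atom 0))

;; Hypothetical side-effecting function: records each call, returns n copies of n.
(defn counted [n]
  (swap! calls inc)
  (repeat n n))

;; Hypothetical name. Each (f x) is wrapped in lazy-seq, so apply and concat
;; only ever see unrealized thunks.
(defn lazy-mapcat [f coll]
  (apply concat (map (fn [x] (lazy-seq (f x))) coll)))

(def z (lazy-mapcat counted [1 2 3]))
(def calls-after-build @calls)       ; nothing has been called yet

(def first-two (doall (take 2 z)))   ; needs (counted 1) and (counted 2) only
(def calls-after-take @calls)

(println calls-after-build first-two calls-after-take)

Building z calls the function zero times, and taking two elements calls it twice. apply still realizes part of the outer sequence of thunks, but creating a thunk does not run the function it wraps.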
