如何理解Clojure的lazy-seq [英] How to understand clojure's lazy-seq
问题描述
我试图理解clojure的lazy-seq
运算符,以及一般的惰性计算概念.我知道该概念背后的基本思想:表达式的求值被延迟,直到需要该值为止.
I'm trying to understand clojure's lazy-seq
operator, and the concept of lazy evaluation in general. I know the basic idea behind the concept: Evaluation of an expression is delayed until the value is needed.
通常,这可以通过两种方式实现:
In general, this is achievable in two ways:
- 在编译时使用宏或特殊形式;
- 在运行时使用lambda函数
使用惰性评估技术,可以构造评估为已使用的无限数据结构.这些无限序列利用lambda,闭包和递归. Clojure中,这些无限数据结构是使用lazy-seq
和cons
形式生成的.
With lazy evaluation techniques, it is possible to construct infinite data structures that are evaluated as consumed. These infinite sequences utilizes lambdas, closures and recursion. In clojure, these infinite data structures are generated using lazy-seq
and cons
forms.
我想了解lazy-seq
是怎么做到的.我知道它实际上是一个宏.请考虑以下示例.
I want to understand how lazy-seq
does it's magic. I know it is actually a macro. Consider the following example.
(defn rep [n]
(lazy-seq (cons n (rep n))))
此处,rep
函数返回类型为LazySeq
的延迟求值序列,现在可以使用序列API对其进行转换和使用(从而求值).该API提供功能take
,map
,filter
和reduce
.
Here, the rep
function returns a lazily-evaluated sequence of type LazySeq
, which now can be transformed and consumed (thus evaluated) using the sequence API. This API provides functions take
, map
, filter
and reduce
.
在展开的表格中,我们可以看到如何使用lambda存储单元格的配方而无需立即对其进行评估.
In the expanded form, we can see how lambda is utilized to store the recipe for the cell without evaluating it immediately.
(defn rep [n]
(new clojure.lang.LazySeq (fn* [] (cons n (rep n)))))
- 但是序列API如何真正与
LazySeq
一起使用? - 以下表达式中实际上发生了什么?
- But how does the sequence API actually work with
LazySeq
? - What actually happens in the following expression?
- 中间操作
map
如何应用于序列, -
take
如何限制序列和 - 终端操作
reduce
如何评估序列? - How is the intermediate operation
map
applied to the sequence, - how does
take
limit the sequence and - how does terminal operation
reduce
evaluate the sequence?
(reduce + (take 3 (map inc (rep 5))))
这些功能如何与Vector
或LazySeq
一起使用?
Also, how do these functions work with either a Vector
or a LazySeq
?
还有,是否可以生成嵌套的无限数据结构 ?:包含列表的列表,包含列表的列表,包含列表的列表...无限广泛和深入,被序列API评估为消耗了吗?
Also, is it possible to generate nested infinite data structures?: list containing lists, containing lists, containing lists... going infinitely wide and deep, evaluated as consumed with the sequence API?
最后一个问题,这之间有什么实际区别
(defn rep [n]
(lazy-seq (cons n (rep n))))
还有这个?
(defn rep [n]
(cons n (lazy-seq (rep n))))
推荐答案
有很多问题!
如果您查看 LazySeq
的类源代码,您会注意到它实现了
If you take a look at LazySeq
's class source code you will notice that it implements ISeq
interface providing methods like first
, more
and next
.
map
,take
和filter
之类的功能是使用lazy-seq
(它们产生延迟序列)以及first
和rest
(它们依次使用more
)构建的,这就是它们的工作方式.通过使用LazySeq
类的first
和more
实现,将延迟seq作为其输入集合.
Functions like map
, take
and filter
are built using lazy-seq
(they produce lazy sequences) and first
and rest
(which in turn uses more
) and that's how they can work with lazy seq as their input collection - by using first
and more
implementations of LazySeq
class.
(reduce + (take 3 (map inc (rep 5))))
关键在于查看LazySeq.first
的工作方式.它将调用包装的函数以获取并记录结果.您的情况将是以下代码:
The key is to look how LazySeq.first
works. It will invoke the wrapped function to obtain and memoize the result. In your case it will be the following code:
(cons n (rep n))
因此它将是一个cons单元格,其中n
作为其值,另一个LazySeq
实例(对rep
的递归调用的结果)作为其rest
部分.它将成为该LazySeq
对象的实现值,而first
将返回缓存的cons单元格的值.
Thus it will be a cons cell with n
as its value and another LazySeq
instance (result of a recursive call to rep
) as its rest
part. It will become the realised value of this LazySeq
object and first
will return the value of the cached cons cell.
在其上调用more
时,它将以相同的方式确保实现特定LazySeq
对象的值(或重用的备忘值)并在其上调用more
(在这种情况下为LazySeq
对象的cons单元上.
When you call more
on it, it will in the same way ensure that the value of the particular LazySeq
object is realised (or reused memoized value) and call more
on it (in this case more
on the cons cell containing another LazySeq
object).
一旦获得带有more
的LazySeq
对象的另一个实例,故事就会在您调用first
时重复.
Once you obtain another instance of LazySeq
object with more
the story repeats when you call first
on it.
map
和take
将创建另一个lazy-seq
,该lazy-seq
将调用作为其参数传递的集合的first
和more
(只是另一个惰性序列),因此将是类似的故事.区别仅在于传递给cons
的值的生成方式(例如,将f
调用为对映射到map
中的LazySeq
值的first
调用的first
获得的值,而不是像n
在您的rep
函数中).
map
and take
will create another lazy-seq
that will call first
and more
of the collection passed as their argument (just another lazy seq) so it will be similar story. The difference will be only in how the values passed to cons
are generated (e.g. calling f
to a value obtained by first
invoked on the LazySeq
value mapped over in map
instead of a raw value like n
in your rep
function).
使用reduce
会更简单,因为它将使用loop
和first
和more
来迭代输入的惰性序列并应用归约函数来产生最终结果.
With reduce
it's a bit simpler as it will use loop
with first
and more
to iterate over the input lazy seq and apply the reducing function to produce the final result.
由于map
和take
的实际实现看起来很像,我鼓励您检查它们的源代码-相当容易理解.
As the actual implementation looks like for map
and take
I encourage you to check their source code - it's quite easy to follow.
如上所述,map
,take
和其他功能根据first
和rest
起作用(提醒-rest
在more
之上实现).因此,我们需要解释first
和rest
/more
如何与不同的集合类型一起使用:它们检查集合是否实现了ISeq
(然后直接实现了这些功能),或者它们尝试创建了first
和more
的实现.
As mentioned above, map
, take
and other functions work in terms of first
and rest
(reminder - rest
is implemented on top of more
). Thus we need to explain how first
and rest
/more
can work with different collection types: they check if the collection implements ISeq
(and then it implement those functions directly) or they try to create a seq
view of the collection and coll its implementation of first
and more
.
这绝对有可能,但是我不确定您想要获得什么确切的数据形状.您是说要获得一个懒惰的seq来生成另一个序列作为它的值(而不是像rep
中的n
这样的单个值),但是将其作为平坦序列返回?
It's definitely possible but I am not sure what the exact data shape you would like to get. Do you mean getting a lazy seq which generates another sequence as it's value (instead of a single value like n
in your rep
) but returns it as a flat sequence?
(defn nested-cons [n]
(lazy-seq (cons (repeat n n) (nested-cons (inc n)))))
(take 3 (nested-cons 1))
;; => ((1) (2 2) (3 3 3))
宁愿返回(1 2 2 3 3 3)
吗?
在这种情况下,您可以使用concat
而不是cons
来创建由两个或多个序列组成的惰性序列:
For such cases you might use concat
instead of cons
which creates a lazy sequence of two or more sequences:
(defn nested-concat [n]
(lazy-seq (concat (repeat n n) (nested-concat (inc n)))))
(take 6 (nested-concat 1))
;; => (1 2 2 3 3 3)
与此有什么实际区别
(defn rep [n]
(lazy-seq (cons n (rep n))))
还有这个?
(defn rep [n]
(cons n (lazy-seq (rep n))))
在这种特殊情况下并非如此.但是,在cons单元格不包装原始值而是函数调用结果以计算原始值的情况下,后一种形式不是完全惰性的.例如:
In this particular case not really. But in the case where a cons cell doesn't wrap a raw value but a result of a function call to calculate it, the latter form is not fully lazy. For example:
(defn calculate-sth [n]
(println "Calculating" n)
n)
(defn rep1 [n]
(lazy-seq (cons (calculate-sth n) (rep1 (inc n)))))
(defn rep2 [n]
(cons (calculate-sth n) (lazy-seq (rep2 (inc n)))))
(take 0 (rep1 1))
;; => ()
(take 0 (rep2 1))
;; Prints: Calculating 1
;; => ()
因此,即使您可能不需要后一种形式,它也会评估其第一个元素.
Thus the latter form will evaluate its first element even if you might not need it.
这篇关于如何理解Clojure的lazy-seq的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!