Haskell中的变形和树遍历 [英] Catamorphism and tree-traversing in Haskell

查看:164
本文介绍了Haskell中的变形和树遍历的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我很不耐烦,期待了解变态现象相关的这个SO问题



我只练习真实世界Haskell教程的开始。所以,也许我现在要问的方式太多了,如果是这样的话,请告诉我我应该学习的概念。

下面我引用变形术的维基百科代码示例



我想知道你对下面的foldTree的看法,这是遍历Tree的一种方式,与其他SO的问题和答案相比,还涉及遍历Tree n元树遍历。 (独立于二元或不是,我认为下面的变形可以写成为管理n元树)我发表了评论,我理解并且是很高兴你能纠正我,并澄清一些事情。

  { - 这是一个二叉树定义 - } 
数据树a =叶a
|分支(树a)(树a)

{ - 我不了解{}
之间的结构,但是它定义了两个态射,叶和分支
叶带一个并返回一个r,分支需要两个r并返回一个r-}
数据TreeAlgebra ar = TreeAlgebra {leaf :: a - > r
,branch :: r - > r - > r}

{ - foldTree是一种态射体,它需要:树a的Tree代数,其结果为r,树a
并返回r - }
foldTree :: TreeAlgebra ar - >树a - > (TreeAlgebra {leaf = f})(Leaf x)= fx
foldTree a @(TreeAlgebra {branch = g})(Branch lr)= g(foldTree al)(foldTree ar )

此时我有很多困难,我似乎猜测morphism leaf
将应用于任何Leaf
但为了将这些代码用于实际,foldTree需要提供一个定义的TreeAlgebra,
a TreeAlgebra,它具有定义的morphism叶,以便执行某些操作?

,但在这种情况下,我认为{f = leaf}折叠的代码,而不是相反的

任何您的澄清都会非常受欢迎。

解决方案

不完全确定你在问什么。但是,对于您要在树上执行的计算,您提供了 TreeAlgebra foldTree 。例如,要将 Int s树中的所有元素求和,您可以使用下面的代数:

  sumAlgebra :: TreeAlgebra Int Int 
sumAlgebra = TreeAlgebra {leaf = id
,branch =(+)}

这意味着,为了获得叶的总和,对叶中的值应用 id 。要获得分支的总和,请将每个孩子的总和加在一起。



我们可以说(+)用于分支而不是,比如 \xy - > sumTree x + sumTree y 是变质的基本属性。它说,要在某些递归数据结构上计算一些函数 f ,只需为它的直接赋值 f 孩子们。

Haskell是一种非常独特的语言,我们可以抽象地形式化变形的概念。让我们为您的树中的单个节点创建一个数据类型,并对其子节点进行参数化:

 数据TreeNode a child 
= Leaf a
|分行儿童

看看我们在那儿做了什么?我们用我们选择的类型替换了递归子代。这是为了让我们可以在折叠时将子树的总和放在那里。



现在是真正神奇的东西。我将在pseudohaskell中编写它 - 将它编写成真正的Haskell是可能的,但我们必须添加一些注释来帮助typechecker,这可能会引起混淆。我们采用参数化数据类型的固定点 - 也就是构造一个数据类型 T ,使得 T = TreeNode a T 。他们称这个操作符为 Mu

  type Mu f = f(Mu f)

仔细看这里。 Mu 的参数不是类型,例如 Int Foo - >酒吧。它是类型构造函数,例如 Maybe TreeNode Int - Mu 本身需要一个参数。 (对类型构造函数进行抽象化的可能性是Haskell的类型系统在其表现力上真正脱颖而出的原因之一)。

因此,类型 Mu f 定义为取 f 并用 Mu f 本身填充它的类型参数。我将定义一个同义词来减少一些噪音:

  type IntNode = TreeNode Int 

code>

展开 Mu IntNode ,我们得到:

  Mu IntNode = IntNode(Mu IntNode)
= Leaf Int |分支(Mu IntNode)(Mu IntNode)

您是否看到 Mu IntNode 相当于你的 Tree Int ?我们刚刚将递归结构分开,然后使用 Mu 将它重新组合在一起。这给我们带来的好处是,我们可以一次讨论所有 Mu 类型。这给了我们我们需要定义一个变形。



让我们定义:

  type IntTree = Mu IntNode 

我说变形的基本属性是计算一些函数 f ,只需为它的直接子元素赋值 f 即可。让我们调用我们试图计算的东西的类型 r ,以及数据结构 node IntNode 可能是这个的一个实例)。因此,要计算特定节点上的 r ,我们需要节点将其子节点替换为它们的 r s。该计算具有类型节点r - > [R 。所以一个变形说如果我们有这些计算之一,那么我们可以为整个递归结构计算 r (记住,递归在这里被明确地表示 Mu ):

  cata ::(node r  - > r) - > Mu节点 - > r 

对于我们的示例来说,这看起来像这样:

  cata ::(IntNode r  - > r) - > IntTree  - > r 

重申,如果我们可以用 r s并计算一个 r ,那么我们可以为整个树计算一个 r 。为了实际计算,我们需要 node Functor code> - 这是我们需要能够在节点的子节点上映射一个任意函数。

  fmap ::(a  - > b) - >节点a  - >节点b 

可以直接对 IntNode

  fmap f(Leaf x)= Leaf x  - 没有孩子,所以保持不变
fmap f(分支lr)=分支(fl)(fr) - 将函数应用于每个孩子

现在, finally ,我们可以定义 cata Functor节点约束只是说 node 有一个合适的 fmap ):

  cata ::(Functor node)=> (节点r→r)→> Mu节点 - > r 
cata ft = f(fmap(cata f)t)

名称 t 作为树的助记符值。这是一个抽象的,密集的定义,但它非常简单。它说:递归地执行 cata f - 我们在树上进行的计算 - 在每个 t (它们本身 Mu节点 s)来获得节点r ,然后将结果传递给 f 计算结果为 t 本身。

到一开始,你所定义的代数本质上是一种定义节点r - >的方法。 r 函数。的确,给定一个 TreeAlgebra ,我们可以很容易地得到折叠函数:

  foldFunction :: TreeAlgebra ar  - > (TreeNode ar  - > r)
foldFunction alg(Leaf a)= leaf alg a
foldFunction alg(Branch lr)=分支alg lr

因此,树的变形可以用我们的泛型定义如下:

  type Tree a = Mu(TreeNode a)

treeCata :: TreeAlgebra ar - > (Tree a - > r)
treeCata alg = cata(foldFunctionalg)

I没时间了。我知道真的很抽象,但我希望它至少能给你一个新的观点来帮助你学习。祝你好运!


I am impatient, looking forward to understanding catamorphism related to this SO question :)

I have only practiced the beginning of Real World Haskell tutorial. So, Maybe I'm gonna ask for way too much right now, if it was the case, just tell me the concepts I should learn.

Below, I quote the wikipedia code sample for catamorphism.

I would like to know your opinion about foldTree below, a way of traversing a Tree, compared to this other SO question and answer, also dealing with traversing a Tree n-ary tree traversal. (independantly from being binary or not, I think the catamorphism below can be written so as to manage n-ary tree)

I put in comment what I understand, and be glad if you could correct me, and clarify some things.

{-this is a binary tree definition-}
data Tree a = Leaf a
            | Branch (Tree a) (Tree a)

{-I dont understand the structure between{} 
however it defines two morphisms, leaf and branch 
leaf take an a and returns an r, branch takes two r and returns an r-} 
data TreeAlgebra a r = TreeAlgebra { leaf   :: a      -> r
                                   , branch :: r -> r -> r }

{- foldTree is a morphism that takes: a TreeAlgebra for Tree a with result r, a Tree a
and returns an r -} 
foldTree :: TreeAlgebra a r -> Tree a -> r
foldTree a@(TreeAlgebra {leaf   = f}) (Leaf   x  ) = f x
foldTree a@(TreeAlgebra {branch = g}) (Branch l r) = g (foldTree a l) (foldTree a r)

at this point I am having many difficulties, I seem to guess that the morphism leaf will be applied to any Leaf But so as to use this code for real, foldTree needs to be fed a defined TreeAlgebra, a TreeAlgebra that has a defined morphism leaf so as to do something ?
but in this case in the foldTree code I would expect {f = leaf} and not the contrary

Any clarification from you would be really welcome.

解决方案

Not exactly sure what you're asking. But yeah, you feed a TreeAlgebra to foldTree corresponding to the computation you want to perform on the tree. For example, to sum all the elements in a tree of Ints you would use this algebra:

sumAlgebra :: TreeAlgebra Int Int
sumAlgebra = TreeAlgebra { leaf = id
                         , branch = (+) }

Which means, to get the sum of a leaf, apply id (do nothing) to the value in the leaf. To get the sum of a branch, add together the sums of each of the children.

The fact that we can say (+) for branch instead of, say, \x y -> sumTree x + sumTree y is the essential property of the catamorphism. It says that to compute some function f on some recursive data structure it suffices to have the values of f for its immediate children.

Haskell is a pretty unique language in that we can formalize the idea of catamorphism abstractly. Let's make a data type for a single node in your tree, parameterized over its children:

data TreeNode a child
    = Leaf a
    | Branch child child

See what we did there? We just replaced the recursive children with a type of our choosing. This is so that we can put the subtrees' sums there when we are folding.

Now for the really magical thing. I'm going to write this in pseudohaskell -- writing it in real Haskell is possible, but we have to add some annotations to help the typechecker which can be kind of confusing. We take the "fixed point" of a parameterized data type -- that is, constructing a data type T such that T = TreeNode a T. They call this operator Mu.

type Mu f = f (Mu f)

Look carefully here. The argument to Mu isn't a type, like Int or Foo -> Bar. It's a type constructor like Maybe or TreeNode Int -- the argument to Mu itself takes an argument. (The possibility of abstracting over type constructors is one of the things that makes Haskell's type system really stand out in its expressive power).

So the type Mu f is defined as taking f and filling in its type parameter with Mu f itself. I'm going to define a synonym to reduce some of the noise:

type IntNode = TreeNode Int

Expanding Mu IntNode, we get:

Mu IntNode = IntNode (Mu IntNode)
           = Leaf Int | Branch (Mu IntNode) (Mu IntNode)

Do you see how Mu IntNode is equivalent to your Tree Int? We have just torn the recursive structure apart and then used Mu to put it back together again. This gives us the advantage that we can talk about all Mu types at once. This gives us what we need to define a catamorphism.

Let's define:

type IntTree = Mu IntNode

I said the essential property of the catamorphism is that to compute some function f, it suffices to have the values of f for its immediate children. Let's call the type of the thing we are trying to compute r, and the data structure node (IntNode would be a possible instantiation of this). So to compute r on a particular node, we need the node with its children replaced with their rs. This computation has type node r -> r. So a catamorphism says that if we have one of these computations, then we can compute r for the entire recursive structure (remember recursion is denoted explicitly here with Mu):

cata :: (node r -> r) -> Mu node -> r

Making this concrete for our example, this looks like:

cata :: (IntNode r -> r) -> IntTree -> r

Restating, if we can take a node with rs for its children and compute an r, then we can compute an r for an entire tree.

In order to actually compute this, we need node to be a Functor -- that is we need to be able to map an arbitrary function over the children of a node.

fmap :: (a -> b) -> node a -> node b

This can be done straightforwardly for IntNode.

fmap f (Leaf x) = Leaf x                  -- has no children, so stays the same
fmap f (Branch l r) = Branch (f l) (f r)  -- apply function to each child

Now, finally, we can give a definition for cata (the Functor node constraint just says that node has a suitable fmap):

cata :: (Functor node) => (node r -> r) -> Mu node -> r
cata f t = f (fmap (cata f) t)

I used the parameter name t for the mnemonic value of "tree". This is an abstract, dense definition, but it is really very simple. It says: recursively perform cata f -- the computation we are doing over the tree -- on each of t's children (which are themselves Mu nodes) to get a node r, and then pass that result to f compute the result for t itself.

Tying this back to the beginning, the algebra you are defining is essentially a way of defining that node r -> r function. Indeed, given a TreeAlgebra, we can easily get the fold function:

foldFunction :: TreeAlgebra a r -> (TreeNode a r -> r)
foldFunction alg (Leaf a) = leaf alg a
foldFunction alg (Branch l r) = branch alg l r

Thus the tree catamorphism can be defined in terms of our generic one as follows:

type Tree a = Mu (TreeNode a)

treeCata :: TreeAlgebra a r -> (Tree a -> r)
treeCata alg = cata (foldFunction alg)

I'm out of time. I know that got really abstract really fast, but I hope it at least gave you a new viewpoint to help your learning. Good luck!

这篇关于Haskell中的变形和树遍历的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆