使用 continuation monad 在 `Set`(和其他具有约束的容器)上构建高效的 monad 实例 [英] Constructing efficient monad instances on `Set` (and other containers with constraints) using the continuation monad

查看:30
本文介绍了使用 continuation monad 在 `Set`(和其他具有约束的容器)上构建高效的 monad 实例的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Set,类似于 [] 有一个完美定义的 monadic 操作.问题是它们要求值满足 Ord 约束,因此不可能在没有任何约束的情况下定义 return>>=.同样的问题适用于许多其他需要对可能值进行某种约束的数据结构.

Set, similarly to [] has a perfectly defined monadic operations. The problem is that they require that the values satisfy Ord constraint, and so it's impossible to define return and >>= without any constraints. The same problem applies to many other data structures that require some kind of constraints on possible values.

标准技巧(在 haskell-cafe 帖子中向我建议)) 是将 Set 包装到延续 monad 中.ContT 不关心底层类型函子是否有任何约束.仅在将 Set 包装/解开到/从延续时才需要约束:

The standard trick (suggested to me in a haskell-cafe post) is to wrap Set into the continuation monad. ContT doesn't care if the underlying type functor has any constraints. The constraints become only needed when wrapping/unwrapping Sets into/from continuations:

import Control.Monad.Cont
import Data.Foldable (foldrM)
import Data.Set

setReturn :: a -> Set a
setReturn = singleton

setBind :: (Ord b) => Set a -> (a -> Set b) -> Set b
setBind set f = foldl' (s -> union s . f) empty set

type SetM r a = ContT r Set a

fromSet :: (Ord r) => Set a -> SetM r a
fromSet = ContT . setBind

toSet :: SetM r r -> Set r
toSet c = runContT c setReturn

这可以根据需要工作.例如,我们可以模拟一个非确定性函数,要么将其参数增加 1,要么保持不变:

This works as needed. For example, we can simulate a non-deterministic function that either increases its argument by 1 or leaves it intact:

step :: (Ord r) => Int -> SetM r Int
step i = fromSet $ fromList [i, i + 1]

-- repeated application of step:
stepN :: Int -> Int -> Set Int
stepN times start = toSet $ foldrM ($) start (replicate times step)

确实,stepN 5 0 产生 fromList [0,1,2,3,4,5].如果我们使用 [] monad 代替,我们会得到

Indeed, stepN 5 0 yields fromList [0,1,2,3,4,5]. If we used [] monad instead, we would get

[0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5]

相反.

问题在于效率.如果我们调用 stepN 20 0 输出需要几秒钟,而 stepN 30 0 不会在合理的时间内完成.事实证明,所有 Set.union 操作都在最后执行,而不是在每个 monadic 计算之后执行.结果是指数级多的 Set 只在最后被构造和union,这对于大多数任务来说是不可接受的.

The problem is efficiency. If we call stepN 20 0 the output takes a few seconds and stepN 30 0 doesn't finish within a reasonable amount of time. It turns out that all Set.union operations are performed at the end, instead of performing them after each monadic computation. The result is that exponentially many Sets are constructed and unioned only at the end, which is unacceptable for most tasks.

有什么办法可以使这种构造有效?我试过了,但没有成功.

Is there any way around it, to make this construction efficient? I tried but without success.

(我什至怀疑根据 Curry-Howard 同构和 Glivenko 定理可能存在某种理论限制. Glivenko 定理说,对于任何命题重言式 φ,公式 ¬¬φ 都可以用直觉逻辑证明.但是,我怀疑证明的长度 (在正常形式中)可以是指数级的.所以,也许在某些情况下,将计算包装到延续 monad 中会使其指数级更长?)

(I even suspect that there could be some kinds of theoretical limits following from Curry-Howard isomorphism and Glivenko's theorem. Glivenko's theorem says that for any propositional tautology φ the formula ¬¬φ can be proved in intuitionistic logic. However, I suspect that the length of the proof (in normal form) can be exponentially long. So, perhaps, there could be cases when wrapping a computation into the continuation monad will make it exponentially longer?)

推荐答案

Monads 是一种特殊的结构化和排序计算方式.monad 的绑定不能神奇地重组你的计算,以便以更有效的方式发生.您构建计算的方式存在两个问题.

Monads are one particular way of structuring and sequencing computations. The bind of a monad cannot magically restructure your computation so as to happen in a more efficient way. There are two problems with the way you structure your computation.

  1. 当计算stepN 20 0时,step 0的结果将被计算20次.这是因为计算的每一步都会产生 0 作为一个替代,然后将其馈送到下一步,该步骤也产生 0 作为替代,依此类推...

  1. When evaluating stepN 20 0, the result of step 0 will be computed 20 times. This is because each step of the computation produces 0 as one alternative, which is then fed to the next step, which also produces 0 as alternative, and so on...

也许这里的一些记忆会有所帮助.

Perhaps a bit of memoization here can help.

一个更大的问题是 ContT 对计算结构的影响.通过一些等式推理,展开replicate 20 step的结果,foldrM的定义,并根据需要进行多次简化,我们可以看到stepN20 0 相当于:

A much bigger problem is the effect of ContT on the structure of your computation. With a bit of equational reasoning, expanding out the result of replicate 20 step, the definition of foldrM and simplifying as many times as necessary, we can see that stepN 20 0 is equivalent to:

(...(return 0 >>= step) >>= step) >>= step) >>= ...)

该表达式的所有括号都与左边相关.这很好,因为这意味着每次出现 (>>=) 的 RHS 是一个基本计算,即 step,而不是组合计算.但是,放大(>>=) for ContT的定义,

All parentheses of this expression associate to the left. That's great, because it means that the RHS of each occurrence of (>>=) is an elementary computation, namely step, rather than a composed one. However, zooming in on the definition of (>>=) for ContT,

m >>= k = ContT $ c -> runContT m (a -> runContT (k a) c)

我们看到,当评估与左侧关联的 (>>=) 链时,每个绑定都会将新计算推送到当前的延续 c 上.为了说明发生了什么,我们可以再次使用一些等式推理,扩展 (>>=) 的定义和 runContT 的定义,以及简化,产生:

we see that when evaluating a chain of (>>=) associating to the left, each bind will push a new computation onto the current continuation c. To illustrate what is going on, we can use again a bit of equational reasoning, expanding out this definition for (>>=) and the definition for runContT, and simplifying, yielding:

setReturn 0 `setBind`
    (x1 -> step x1 `setBind`
        (x2 -> step x2 `setBind` (x3 -> ...)...)

现在,对于 setBind 的每次出现,让我们问问自己 RHS 参数是什么.对于最左边的出现,RHS 参数是 setReturn 0 之后的整个计算过程.对于第二次出现,就是step x1之后的一切,等等.让我们放大到setBind的定义:

Now, for each occurrence of setBind, let's ask ourselves what the RHS argument is. For the leftmost occurrence, the RHS argument is the whole rest of the computation after setReturn 0. For the second occurrence, it's everything after step x1, etc. Let's zoom in to the definition of setBind:

setBind set f = foldl' (s -> union s . f) empty set

这里的 f 代表所有其余的计算,一切出现在 setBind 的右侧.这意味着在每一步,我们都将计算的其余部分捕获为 f,并且应用 f 的次数与 set.计算不再像以前那样基本,而是组合,并且这些计算将重复多次.

Here f represents all the rest of the computation, everything on the right hand side of an occurrence of setBind. That means that at each step, we are capturing the rest of the computation as f, and applying f as many times as there are elements in set. The computations are not elementary as before, but rather composed, and these computations will be duplicated many times.

问题的关键在于 ContT monad 转换器正在转换计算的初始结构,您指的是 setBind 的左关联链,转换为具有不同结构的计算,即右关联链.这毕竟完全没问题,因为单子定律之一说,对于每个 mfg 我们有

The crux of the problem is that the ContT monad transformer is transforming the initial structure of the computation, which you meant as a left associative chain of setBind's, into a computation with a different structure, ie a right associative chain. This is after all perfectly fine, because one of the monad laws says that, for every m, f and g we have

(m >>= f) >>= g = m >>= (x -> f x >>= g)

然而,单子定律并没有强加复杂性在每个定律的方程的每一侧保持相同.事实上,在这种情况下,构造这个计算的左关联方式要高效得多.setBind 的左关联链立即求值,因为只有基本子计算是重复的.

However, the monad laws do not impose that the complexity remain the same on each side of the equations of each law. And indeed, in this case, the left associative way of structuring this computation is a lot more efficient. The left associative chain of setBind's evaluates in no time, because only elementary subcomputations are duplicated.

事实证明,其他将 Set 硬塞进 monad 的解决方案也遇到同样的问题.特别是,set-monad 包产生了类似的运行时.原因是,它也将左联想表达式改写为右联想表达式.

It turns out that other solutions shoehorning Set into a monad also suffer from the same problem. In particular, the set-monad package, yields similar runtimes. The reason being, that it too, rewrites left associative expressions into right associative ones.

我认为您坚持 Set 遵循 Monad 接口,这已经把手指放在了一个非常重要但相当微妙的问题上.而且我认为它无法解决.问题是monad的绑定类型需要

I think you have put the finger on a very important yet rather subtle problem with insisting that Set obeys a Monad interface. And I don't think it can be solved. The problem is that the type of the bind of a monad needs to be

(>>=) :: m a -> (a -> m b) -> m b

ab 上都不允许有类约束.这意味着我们不能在左侧嵌套绑定,而不首先调用 monad 定律来重写为右关联链.原因如下:给定 (m >>= f) >>= g,计算的类型 (m >>= f)mb 形式.计算 (m >>= f) 的值属于 b 类型.但是因为我们不能在类型变量b上挂任何类约束,所以我们不能知道我们得到的值满足Ord约束,因此不能使用这个value 作为我们希望能够计算 union 的集合的元素.

ie no class constraint allowed on either a or b. That means that we cannot nest binds on the left, without first invoking the monad laws to rewrite into a right associative chain. Here's why: given (m >>= f) >>= g, the type of the computation (m >>= f) is of the form m b. A value of the computation (m >>= f) is of type b. But because we can't hang any class constraint onto the type variable b, we can't know that the value we got satisfies an Ord constraint, and therefore cannot use this value as the element of a set on which we want to be able to compute union's.

这篇关于使用 continuation monad 在 `Set`(和其他具有约束的容器)上构建高效的 monad 实例的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆