What is a monad?


Question

Having briefly looked at Haskell recently, what would be a brief, succinct, practical explanation of what a monad essentially is?

I have found most explanations I've come across to be fairly inaccessible and lacking in practical detail.

Recommended answer

First: The term monad is a bit vacuous if you are not a mathematician. An alternative term is computation builder, which is a bit more descriptive of what they are actually useful for.

They are a pattern for chaining operations. It looks a bit like method chaining in object-oriented languages, but the mechanism is slightly different.

The pattern is mostly used in functional languages (especially Haskell, which uses monads pervasively) but can be used in any language which supports higher-order functions (that is, functions which can take other functions as arguments).

Arrays in JavaScript support the pattern, so let's use that as the first example.

The gist of the pattern is that we have a type (Array in this case) which has a method which takes a function as an argument. The operation supplied must return an instance of the same type (i.e. return an Array).

First, an example of method chaining which does not use the monad pattern:

[1,2,3].map(x => x + 1)

The result is [2,3,4]. The code does not conform to the monad pattern, since the function we are supplying as an argument returns a number, not an Array. The same logic in monad form would be:

[1,2,3].flatMap(x => [x + 1])

Here we supply an operation which returns an Array, so now it conforms to the pattern. The flatMap method executes the provided function for every element in the array. It expects an array as the result of each invocation (rather than a single value), and merges the resulting arrays into a single array. So the end result is the same, the array [2,3,4].
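The merging step can be made visible with a small sketch. It assumes a JavaScript runtime that provides Array.prototype.flat and flatMap; the variable names are only illustrative.

```javascript
// Sketch: flatMap behaves like map followed by flattening one level.
const numbers = [1, 2, 3];
const operation = x => [x + 1]; // returns an Array, as the pattern requires

const viaFlatMap = numbers.flatMap(operation);
// map alone yields [[2], [3], [4]]; flat() merges the inner arrays.
const viaMapThenFlat = numbers.map(operation).flat();

// An operation may also return zero or several elements per input.
const expanded = numbers.flatMap(x => [x, x * 10]);

console.log(viaFlatMap);     // [2, 3, 4]
console.log(viaMapThenFlat); // [2, 3, 4]
console.log(expanded);       // [1, 10, 2, 20, 3, 30]
```

Because each invocation returns an array, an operation is free to produce no elements, one element, or many, which plain map cannot express.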

(The function argument provided to a method like map or flatMap is often called a "callback" in JavaScript. I will call it the "operation" since that is more general.)

If we chain multiple operations (in the traditional way):

[1,2,3].map(a => a + 1).filter(b => b != 3)

This results in the array [2,4].

The same chaining in monad form:

[1,2,3].flatMap(a => [a + 1]).flatMap(b => b != 3 ? [b] : [])

This yields the same result, the array [2,4].

You will immediately notice that the monad form is quite a bit uglier than the non-monad form! This just goes to show that monads are not necessarily "good". They are a pattern which is sometimes beneficial and sometimes not.

Do note that the monad pattern can be combined in a different way:

[1,2,3].flatMap(a => [a + 1].flatMap(b => b != 3 ? [b] : []))

Here the binding is nested rather than chained, but the result is the same. This is an important property of monads, as we will see later. It means two operations combined can be treated the same as a single operation.
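A short check of this property, as a sketch: the helper compose is a hypothetical name introduced here for illustration, not a built-in.

```javascript
// The two operations from the example above.
const addOne = a => [a + 1];
const dropThree = b => (b !== 3 ? [b] : []);

const input = [1, 2, 3];

// Chained: bind, then bind again on the result.
const chained = input.flatMap(addOne).flatMap(dropThree);

// Nested: the second bind happens inside the first operation.
const nested = input.flatMap(a => addOne(a).flatMap(dropThree));

// Hypothetical helper: glue two operations into a single operation.
const compose = (op1, op2) => x => op1(x).flatMap(op2);
const combined = input.flatMap(compose(addOne, dropThree));

console.log(chained, nested, combined); // each is [2, 4]
```

The composed function has the same shape as any other operation (it takes a value and returns an Array), which is what lets longer chains be built out of smaller ones.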

The operation is allowed to return an array with a different element type, for example transforming an array of numbers into an array of strings or something else, as long as it is still an Array.

This can be described a bit more formally using TypeScript notation. An array has the type Array<T>, where T is the type of the elements in the array. The method flatMap() takes a function argument of the type T => Array<U> and returns an Array<U>.

Generalized, a monad is any type Foo<Bar> which has a "bind" method which takes a function argument of type Bar => Foo<Baz> and returns a Foo<Baz>.

This answers what monads are. The rest of this answer will try to explain through examples why monads can be a useful pattern in a language like Haskell, which has good support for them.

Haskell and Do-notation

To translate the map/filter example directly to Haskell, we replace flatMap with the >>= operator:

[1,2,3] >>= \a -> [a+1] >>= \b -> if b == 3 then [] else [b]

The >>= operator is the bind function in Haskell. It does the same as flatMap in JavaScript when the operand is a list, but it is overloaded with a different meaning for other types.

But Haskell also has a dedicated syntax for monad expressions, the do-block, which hides the bind operator altogether:

 do a <- [1,2,3] 
    b <- [a+1] 
    if b == 3 then [] else [b] 

This hides the "plumbing" and lets you focus on the actual operations applied at each step.

In a do-block, each line is an operation. The constraint still holds that all operations in the block must return the same type. Since the first expression is a list, the other operations must also return a list.

The back-arrow <- looks deceptively like an assignment, but note that the name on its left is the parameter passed to the operation in the bind. So, when the expression on the right side is a list of Integers, the variable on the left side will be a single Integer, and the rest of the block will be executed once for each integer in the list.

Example: Safe navigation (the Maybe type)

Enough about lists, let's see how the monad pattern can be useful for other types.

Some functions may not always return a valid value. In Haskell this is represented by the Maybe type, which is an option that is either Just value or Nothing.

Chaining operations which always return a valid value is of course straightforward:

streetName = getStreetName (getAddress (getUser 17)) 

But what if any of the functions could return Nothing? We need to check each result individually and only pass the value to the next function if it is not Nothing:

case getUser 17 of
      Nothing -> Nothing 
      Just user ->
         case getAddress user of
            Nothing -> Nothing 
            Just address ->
              getStreetName address

Quite a lot of repetitive checks! Imagine if the chain were longer. Haskell solves this with the monad pattern for Maybe:

do
  user <- getUser 17
  addr <- getAddress user
  getStreetName addr

This do-block invokes the bind function for the Maybe type (since the result of the first expression is a Maybe). The bind function only executes the following operation if the value is Just value; otherwise it just passes the Nothing along.
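To show the mechanism of that bind function outside Haskell, here is a minimal JavaScript sketch. Everything in it (Just, Nothing, the sample data and the three lookup functions) is hypothetical and written for illustration only; it is not a library API, and it is meant to expose how bind short-circuits rather than to argue that this style is cleaner in JavaScript.

```javascript
// Hypothetical Maybe-like values: Nothing ignores the operation,
// Just passes its wrapped value to the operation.
const Nothing = { isNothing: true, bind(_operation) { return Nothing; } };
const Just = value => ({
  isNothing: false,
  value,
  bind(operation) { return operation(value); },
});

// Made-up sample data: user 18 points at an address that does not exist.
const users = new Map([
  [17, { name: "Ann", addressId: 1 }],
  [18, { name: "Bob", addressId: 99 }],
]);
const addresses = new Map([[1, { street: "Main Street" }]]);

// Each lookup returns a Maybe instead of null or undefined.
const getUser = id => (users.has(id) ? Just(users.get(id)) : Nothing);
const getAddress = user =>
  addresses.has(user.addressId) ? Just(addresses.get(user.addressId)) : Nothing;
const getStreetName = address =>
  address.street ? Just(address.street) : Nothing;

// The Nothing checks live inside bind, so the chain has none.
const streetOf = id => getUser(id).bind(getAddress).bind(getStreetName);

console.log(streetOf(17).value);     // "Main Street"
console.log(streetOf(18).isNothing); // true (address lookup failed)
console.log(streetOf(1).isNothing);  // true (no such user)
```

Once any step yields Nothing, the remaining operations are never called, which is the behaviour of the nested case expressions above.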

Here the monad pattern is used to avoid repetitive code. This is similar to how some other languages use macros to simplify syntax, although macros achieve the same goal in a very different way.

Note that it is the combination of the monad pattern and the monad-friendly syntax in Haskell which results in the cleaner code. In a language like JavaScript without any special syntax support for monads, I doubt the monad pattern would be able to simplify the code in this case.

Mutable state

Haskell does not support mutable state. All variables are constants and all values are immutable. But the State type can be used to emulate programming with mutable state:

import Control.Monad.State  -- provided by the mtl package

add2 :: State Integer Integer
add2 = do
         -- add 1 to state
         x <- get
         put (x + 1)
         -- increment in another way
         modify (+1)
         -- return state
         get


evalState add2 7
-- => 9

The add2 function builds a monad chain which is then evaluated with 7 as the initial state.
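How a chain can carry state without mutating anything can be sketched in JavaScript. This is a simplified model under my own assumptions, not how Haskell's State is actually implemented: a stateful computation is represented as a plain function from a state to a [result, newState] pair, and the names bind, get, put, modify and evalState are written here to mirror the Haskell names.

```javascript
// A computation is a function: state => [result, newState].

// bind runs one computation, then feeds its result and the updated
// state into the computation produced by the operation.
const bind = (computation, operation) => state => {
  const [result, nextState] = computation(state);
  return operation(result)(nextState);
};

const get = state => [state, state];                   // result is the state
const put = newState => _old => [undefined, newState]; // replace the state
const modify = fn => state => [undefined, fn(state)];  // transform the state

// Run a computation from an initial state and keep only the result.
const evalState = (computation, initial) => computation(initial)[0];

// Same steps as add2 above, with the binds written out by hand.
const add2 =
  bind(get, x =>
  bind(put(x + 1), () =>
  bind(modify(s => s + 1), () =>
  get)));

console.log(evalState(add2, 7)); // 9
```

No variable is ever reassigned: each step returns a new state, and bind threads it into the next step. Building add2 does nothing by itself; the work happens when evalState supplies the initial state.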

Obviously this is something which only makes sense in Haskell. Other languages support mutable state out of the box. Haskell is generally "opt-in" on language features: you enable mutable state when you need it, and the type system ensures the effect is explicit. IO is another example of this.

IO

The IO type is used for chaining and executing "impure" functions.

Like any other practical language, Haskell has a bunch of built-in functions which interface with the outside world: putStrLn, getLine and so on. These functions are called "impure" because they either cause side effects or have non-deterministic results. Even something simple like getting the time is considered impure because the result is non-deterministic: calling it twice with the same arguments may return different values.

A pure function is deterministic: its result depends purely on the arguments passed, and it has no side effects on the environment besides returning a value.

Haskell heavily encourages the use of pure functions, and this is a major selling point of the language. Unfortunately for purists, you need some impure functions to do anything useful. The Haskell compromise is to cleanly separate pure and impure, and guarantee that there is no way that pure functions can execute impure functions, directly or indirectly.

This is guaranteed by giving all impure functions the IO type. The entry point of a Haskell program is the main function, which has the IO type, so we can execute impure functions at the top level.

But how does the language prevent pure functions from executing impure functions? This is due to the lazy nature of Haskell. A function is only executed if its output is consumed by some other function. But there is no way to consume an IO value except to assign it to main. So if a function wants to execute an impure function, it has to be connected to main and have the IO type.

Using monad chaining for IO operations also ensures that they are executed in a linear and predictable order, just like statements in an imperative language.

This brings us to the first program most people will write in Haskell:

main :: IO ()
main = do 
        putStrLn "Hello World"

The do keyword is superfluous when there is only a single operation and therefore nothing to bind, but I keep it anyway for consistency.

The () type (called "unit") plays the role of "void": it carries no useful value. This return type is only useful for IO functions called for their side effects.

A longer example:

main = do
    putStrLn "What is your name?"
    name <- getLine
    putStrLn ("hello" ++ name)

This builds a chain of IO operations, and since they are assigned to the main function, they get executed.
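The idea that a chain is only a description until it is handed to main can be modelled loosely in JavaScript. This sketch is an analogy under my own assumptions and not how GHC implements IO: an action wraps a function that performs the effect, bind produces a bigger action without running anything, and an explicit run() call stands in for being assigned to main. The names IO, run and log, and the canned input "Alice", are hypothetical.

```javascript
// Hypothetical IO model: an action is a description wrapping an effect.
const IO = effect => ({
  run: effect, // performs the effect and returns its result
  // bind builds a new action; nothing is executed at this point.
  bind: operation => IO(() => operation(effect()).run()),
});

const log = []; // stands in for the console so the order is observable
const putStrLn = text => IO(() => { log.push(text); });
const getLine = IO(() => "Alice"); // canned input instead of reading stdin

// Build the chain. This only describes the program.
const main = putStrLn("What is your name?")
  .bind(() => getLine)
  .bind(name => putStrLn("hello " + name));

console.log(log.length); // 0, nothing has run yet

main.run(); // the stand-in for the runtime executing main
console.log(log); // ["What is your name?", "hello Alice"]
```

The effects happen in the order in which they were chained, and an action that is never connected to the chain that gets run is never executed.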

Comparing IO with Maybe shows the versatility of the monad pattern. For Maybe, the pattern is used to avoid repetitive code by moving conditional logic into the bind function. For IO, the pattern is used to ensure that all operations of the IO type are sequenced and that IO operations cannot "leak" into pure functions.

Summary

In my subjective opinion, the monad pattern is only really worthwhile in a language which has some built-in support for the pattern. Otherwise it just leads to overly convoluted code. But Haskell (and some other languages) have some built-in support which hides the tedious parts, and then the pattern can be used for a variety of useful things. Like:

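  • Avoiding repetitive code (Maybe)
  • Adding language features like mutable state or exceptions for delimited areas of the program.
  • Isolating icky stuff from nice stuff (IO)
  • Embedded domain-specific languages (Parser)
  • Adding GOTO to the language.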
