如何从有向无环图推导FRP? [英] How to derive FRP from Directed Acyclic Graphs?

查看:158
本文介绍了如何从有向无环图推导FRP?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究我的下一个项目。这是一个预先规划阶段,所以这个问题只是为了了解一下现有技术。

设置



<我有一个具有多个输入和输出的有向无环图(DAG),现在认为是人造神经元网络:



处理这种结构的常用方法是在每个时间)步骤。我相信这是frp库使用的方法,例如 netwire



现在我处于幸运的位置,我有一系列事件,每个事件都呈现 中的变化输入节点。我的想法是,如果我可以静态地知道给定的更改只会影响其中的一部分,我可能不需要在网络中的每个节点。



示例



在上面的图像中,7和3是输入,11和8是'隐藏'的,2,9和10是输出节点。节点5的改变只会影响节点11,实际上节点2,9和10.我不需要处理节点7,3和8。

目标



以尽可能小的延迟处理这种网络。图表的大小可能会达到10万个节点,并且每个节点都会进行适量的计算。



计划



我希望有人会加紧宣传能够完成工作的图书馆X.

否则,我现在的计划是从图形描述中得出每个输入节点的计算结果。可能我会使用 Par monad,这样我就不必自己处理数据依赖关系,并且仍然可以从多核机器中受益。



问题




  • 是否有任何库可以满足我需要?

  • 我的 Par 计划是否可行?这取决于每个节点所需处理的数量是多少?

解决方案

这通常是针对 Applicative Arrow 抽象进行编码。我只会讨论 Applicative 。在 Control.Applicative 中找到的 Applicative 类型类允许通过纯以及将函数应用于<>

  class Functor f =>适用f,其中
- |提起一个值。
pure :: a - > f a

- |顺序应用。
(*):: f(a - > b) - > f a - > fb

您的示例图可以非常简单地编码为 Applicative (用加法替换每个节点)作为

  example1 ::(Applicative f,Num a)=> f a  - > f a  - > f a  - > f(a,a,a)
示例1五七三=
let
十一=纯(+)*五*七
八=纯(+)*七*三
two =纯id *十一
九=纯(+)*十一*八
十=纯(+)*十一*
中的三个
pure(,,)* *两个* 9 *十个

可以通过遍历图形以编程方式从图表的表示形式创建相同的编码,节点将在其所有依赖关系之后被访问。



您可能希望有三种优化方式无法用仅使用编码的网络实现,应用性。一般策略是按照 Applicative 和一些额外的类来编码问题,以便优化或额外功能。然后,您提供一个或多个实施必要课程的口译员。这使您可以将问题从实现中分离出来,从而允许您编写自己的解释器或使用现有的库,如反应反应香蕉 mvc-updates 。我不打算讨论如何编写这些口译员,或者将这里给出的表述适应特定的口译员。我只想讨论为了使底层解释器能够利用这些优化而需要的程序的通用表示。我链接的所有三个库都可以避免重新计算值,并且可以适应以下优化。



可观察共享



在前面的例子中,中间节点 eleven 只定义一次,但在三个不同的地方使用。 Applicative 的实现将无法查看参考透明度,以查看这三个 eleven s是否全部一样。您可以假设实现使用一些 IO魔法来查看参考透明度或定义网络,以便实现可以看到

以下是 Applicative Functor s,其中计算结果可以在多个计算中进行划分和重用。这个类没有在我知道的库中定义。

  class Applicative f =>可分式f其中
(< />):: f a - > (f a - > f b) - > f b

infixl 2< />

您的示例可以非常简单地编码为可分割 Functor 作为

  example2 ::(Divisible f,Num a)=> f a  - > f a  - > f a  - > f(a,a,a)
example2五七三=
pure(+)*五*七< /> \eleven - >
pure(+)*七*三< /> \eight - >
纯id< *>十一< /> \two - >
pure(+)*十一*八个< /> \ nine - >
pure(+)*十一*三< /> \ten - >
pure(,,)* *两个* 9 * 10



总计和阿贝尔团体



A典型的神经元计算其输入的加权和并对该和应用响应函数。对于一个程度很大的神经元,将所有输入相加是耗时的。更新总和的简单优化是减去旧值并添加新值。这利用了三个加法属性:

逆 - a * b *b⁻¹= a 减法是与添加数字相反的数字,这个倒数允许我们从总数中删除之前添加的数字。
$ b $ p 交换性 - a * b = b * a 加法和减法无论执行的顺序如何都会达到相同的结果。这可以让我们在减去旧值时添加相同的结果并添加即使旧值不是最近添加的值,也是新值。

关联性 - a *(b * c)=(a * b)* c 加法和减法可以在任意分组中执行并且仍然达到相同的结果。这可以让我们在减去旧值并将新值添加到总值时达到相同的结果,即使它是在添加中间的某个位置添加了旧值。



具有这些属性以及关闭和身份的任何结构都是 Abelian组。下面的字典为底层库提供了足够的信息来执行相同的优化

 数据Abelian a = Abelian {
id :: a,
inv :: a - > a,
op :: a - > a - > a
}

这是可以总计阿贝尔团体的一类结构

  class总计f其中
total :: Abelian a - > [f a] - > fa

建造地图可能会有类似的优化。

阈值和相等



神经网络中的另一个典型操作是将一个值与一个阈值进行比较,并根据传递的值是否完全门槛。如果对输入的更新没有改变值落在阈值的哪一侧,则响应不会改变。如果响应没有改变,那么没有理由重新计算所有的下游节点。能够检测到阈值 Bool 没有变化或者响应是相等的。以下是可以利用平等的一类结构。 stabilize Eq 实例提供给底层结构

  class Stabilizes f where 
stabilize :: Eq a => f a - > f a


I am currently researching for my next project. This is in a pre-planning phase so this question is just to get an overview on existing technology.

Setup

I have a directed acyclic graph (DAG) with multiple inputs and output, think artificial neuronal network for now:

The common way to process such a structure is to process the whole network on every (time-)step. I believe that is the method used by frp libraries such as netwire.

Now I am in the fortunate position that I have a stream of events that each presents the change in one of the input nodes. The idea is that I probably don't have to step each node in the network if I can statically know that a given change will only affect a part of it.

Example

In the image above 5, 7 and 3 are inputs, 11 and 8 are 'hidden' and 2, 9 and 10 are output nodes. A change at node 5 will only affect node 11 and in effect nodes 2, 9 and 10. I won't need to process nodes 7, 3 and 8.

The goal

Process this kind of network with as little latency as possible. The size of the graphs will probably reach up to 100k nodes, with a moderate amount of calculation being done per node.

The plan

I hope that someone will step up and advertise library X that just gets the job done.

Otherwise my current plan is to derive a calculation per input node from the graph description. Probably I will use the Par monad so that I won't have to deal with data dependencies myself and still get to benefit from multicore machines.

The questions

  • Is there any library out there that just does what I need?
  • Is my Par plan feasible? How much does this depend on the amount of processing needed in each node?

解决方案

Problems like this are usually encoded against either an Applicative or Arrow abstraction. I'll only discuss Applicative. The Applicative type class, found in Control.Applicative, allows values and functions to be provided via pure and functions to be applied to values with <*>.

class Functor f => Applicative f where
    -- | Lift a value.
    pure :: a -> f a

    -- | Sequential application.
    (<*>) :: f (a -> b) -> f a -> f b

Your example graph could be very simply encoded for an Applicative (replacing each node with addition) as

example1 :: (Applicative f, Num a) => f a -> f a -> f a -> f (a, a, a)
example1 five seven three = 
    let
        eleven = pure (+) <*> five   <*> seven
        eight  = pure (+) <*> seven  <*> three
        two    = pure id  <*> eleven
        nine   = pure (+) <*> eleven <*> eight
        ten    = pure (+) <*> eleven <*> three
    in
        pure (,,) <*> two <*> nine <*> ten

The same encoding can be created programmatically from the representation of a graph by traversing the graph such that each node will be visited after all of its dependencies.

There are three optimizations you may desire that can't be implemented for a network encoded using only Applicative. The general strategy is to encode the problem in terms of Applicative and a few additional classes as needed for optimization or extra features. You then provide one or more interpreters that implement the necessary classes. This lets you separate the problem from the implementation, allowing you to write your own interpreter or use an existing library like reactive, reactive-banana, or mvc-updates. I am not going to discuss how to write these interpreters or adapt the representation given here to a specific interpreter. I'm only going to discuss the common representation of the program that's needed in order for the underlying interpreter to be able to exploit these optimizations. All three of the libraries I linked can avoid recomputing values and can accommodate the following optimizations.

Observable Sharing

In the previous example, the intermediate node eleven is only defined once, but is used in three different places. An implementation of Applicative won't be able to see through the referential transparency to see that these three elevens are all the same. You can either assume that the implementation uses some IO magic to peek through referential transparency or define the network so that an implementation can see that a computation is being reused.

The following is the class of Applicative Functors where the result of a computation can be divided and reused in multiple computations. This class isn't defined in a library anywhere that I know of.

class Applicative f => Divisible f where
    (</>) :: f a -> (f a -> f b) -> f b

infixl 2 </>

Your example can then be very simply encoded for a Divisible Functor as

example2 :: (Divisible f, Num a) => f a -> f a -> f a -> f (a, a, a)
example2 five seven three = 
    pure (+) <*> five   <*> seven </> \eleven ->
    pure (+) <*> seven  <*> three </> \eight  ->
    pure id  <*> eleven           </> \two    ->
    pure (+) <*> eleven <*> eight </> \nine   ->
    pure (+) <*> eleven <*> three </> \ten    ->
    pure (,,) <*> two <*> nine <*> ten

Sums and Abelian Groups

A typical neuron computes a weighted sum of its inputs and applies a response function to that sum. For a neuron with a large in degree, summing all of its inputs is time consuming. An easy optimization for updating a sum is to subtract the old value and add the new value. This exploits three properties of addition:

Inverse - a * b * b⁻¹ = a Subtraction is the inverse of adding a number, this inverse allows us to remove the previously added number from the total

Commutativity - a * b = b * a Addition and subtraction reach the same result regardless of the order they are performed in. This lets us reach the same result when we subtract the old value and add the new value to the total even if the old value wasn't the most recently added value.

Associativity - a * (b * c) = (a * b) * c Addition and subtraction can be performed in arbitrary groupings and still reach the same result. This lets us reach the same result when we subtract the old value and add the new value to the total even it the old value was added somewhere in the middle of the additions.

Any structure with these properties as well as closure and an identity is an Abelian group. The following dictionary holds enough information for an underlying library to perform the same optimization

data Abelian a = Abelian {
    id  :: a,
    inv :: a -> a,
    op  :: a -> a -> a
}

This is the class of structures that can total Abelian groups

class Total f where 
    total :: Abelian a -> [f a] -> f a

Similar optimization is possible for the construction of maps.

Thresholding and Equality

Another typical operation in neural networks is to compare a value to a threshold and determine the response entirely based on whether or not the value passed the threshold. If an update to an input doesn't change which side of the threshold the value falls on, the response doesn't change. If the response doesn't change, there's no reason to recalculate all of the downstream nodes. The ability to either detect that there was no change to the threshold Bool or the response is equality. The following is the class of structures that can exploit equality. stabilize provides the Eq instance to the underlying structure.

class Stabilizes f where
    stabilize :: Eq a => f a -> f a

这篇关于如何从有向无环图推导FRP?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆