如何在Clojure中创建循环(和不可变)数据结构,而不需要额外的间接? [英] How can one create cyclic (and immutable) data structures in Clojure without extra indirection?

查看:122
本文介绍了如何在Clojure中创建循环(和不可变)数据结构,而不需要额外的间接?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要在Clojure中表示有向图。我想将图中的每个节点表示为一个对象(可能是一个记录),它包含一个名为的字段:edges ,它是可直接连接的节点的集合从当前节点。希望不用说,但我希望这些图是不可变的。



我可以使用这种方法构造定向的非循环图我做一个拓扑排序,并从叶子上构建每个图。



但是,这种方法不适用于循环图。我可以想到的一个解决方法是有一个单独的集合(可能是一个地图或向量)的所有边的整个图。然后,每个节点中的:edges 字段将具有到图的边集合中的键(或索引)。添加这个额外的间接级别的工作原理,因为我可以创建键(或索引)之前,他们(将)引用存在,但它感觉像一个kludge。我不仅需要做一个额外的查找每当我想访问一个相邻的节点,但我也必须传递全局边缘集合,这感觉很笨拙。



我听说一些Lisps有一种方法来创建循环列表而不诉诸变异函数。有没有办法在Clojure中创建不变的循环数据结构?

解决方案

我已经玩过这几天了。



我首先尝试让每个节点保存一组对边的引用,并且每个边都保存一组对节点的引用。我在(dosync ...(ref-set ...))类型的操作中将它们设置为彼此相等。我不喜欢这样,因为更改一个节点需要大量的更新,打印出来的图有点棘手。我不得不重写 print-method multimethod,所以repl不会堆栈溢出。还有任何时候,我想添加一个边缘到现有的节点,我不得不从图中提取实际的节点,然后做各种边缘更新和那种事情,以确保每个人都坚持到最新的版本的另一件事。此外,因为事物是在一个参考,确定某事是否连接到别的东西是一个线性时间操作,这似乎不起眼。在确定使用此方法实际执行任何有用的算法之前,我没有得到很多。



然后我尝试了另一种方法,到其他地方。该图是一个clojure映射,其中键是节点(不是节点的引用),值是另一个映射,其中键是相邻节点,每个键的单个值是该节点的边,表示为



看起来像这样,排序为 1 →2,1→3,2→5,5→2

 (def graph {node-1 {node-2 edge12,node-3 edge13},
node-2 {node-5 edge25},
node-3 nil ;; node 3
node-5 {node-2 edge52});;节点2和5有一个无向边

要访问node-1的邻居, $ c>(keys(graph node-1))或调用其他地方定义的函数(neighbors graph node-1) ((图形节点-1)节点2)以从 1-> 2 p>

几个优点:


  1. 图中节点的恒定时间查找

  2. 简单灵活的边界定义。当您向地图中的节点条目添加邻居时,有向边会隐式存在,并且显式地提供其值(或更多信息的结构)。

  3. 不得不查找现有的节点来做任何事情。它是不可变的,所以你可以先定义它,然后将它添加到图形,然后你不必追逐它获得最新版本时,事情改变。如果图表中的连接发生更改,则您可以更改图表结构,而不是节点/边缘本身。

  4. 这将结合了矩阵表示的最佳特征(图形拓扑位于图形地图本身不在节点和边缘编码,恒定时间查找和非突变节点和边缘)和邻接列表(每个节点具有其相邻节点的列表,空间有效,因为您没有任何空格就像一个规范的稀疏矩阵)。

  5. 你可以在节点之间有多个边,如果你不小心定义了一个已经存在的边,地图结构会确保你

  6. 节点和边缘身份由clojure保存。我不必提出任何种类的索引方案或公​​共参考点。映射的键和值它们表示的东西,而不是其他地方或ref的查找。您的节点结构可以是所有nils,只要它是唯一的,它可以在图中表示。

ish的缺点我看到的是,对于任何给定的操作(添加,删除,任何算法),你不能只是传递一个起始节点。你必须传递整个图形图和起始节点,这可能是一个公平的价格,以支付整个事情的简单。另一个小缺点(或者可能不是),对于无向边缘,你必须在每个方向上定义边缘。这实际上是好的,因为有时边缘对每个方向有不同的值,这个方案允许你这样做。



我在这里看到的唯一其他的事情是,因为边在地图中存在键值对时是隐式的,你不能定义超边其连接多于2个节点)。我不认为这是一个大事必然,因为大多数图算法我遇到了(所有?)只处理连接2个节点的边。


I need to represent directed graphs in Clojure. I'd like to represent each node in the graph as an object (probably a record) that includes a field called :edges that is a collection of the nodes that are directly reachable from the current node. Hopefully it goes without saying, but I would like these graphs to be immutable.

I can construct directed acyclic graphs with this approach as long as I do a topological sort and build each graph "from the leaves up".

This approach doesn't work for cyclic graphs, however. The one workaround I can think of is to have a separate collection (probably a map or vector) of all of the edges for an entire graph. The :edges field in each node would then have the key (or index) into the graph's collection of edges. Adding this extra level of indirection works because I can create keys (or indexes) before the things they (will) refer to exist, but it feels like a kludge. Not only do I need to do an extra lookup whenever I want to visit a neighboring node, but I also have to pass around the global edges collection, which feels very clumsy.

I've heard that some Lisps have a way of creating cyclic lists without resorting to mutation functions. Is there a way to create immutable cyclic data structures in Clojure?

解决方案

I've been playing with this the last few days.

I first tried making each node hold a set of refs to edges, and each edge hold a set of refs to the nodes. I set them equal to each other in a (dosync... (ref-set...)) type of operation. I didn't like this because changing one node requires a large amount of updates, and printing out the graph was a bit tricky. I had to override the print-method multimethod so the repl wouldn't stack overflow. Also any time I wanted to add an edge to an existing node, I had to extract the actual node from the graph first, then do all sorts of edge updates and that sort of thing to make sure everyone was holding on to the most recent version of the other thing. Also, because things were in a ref, determining whether something was connected to something else was a linear-time operation, which seemed inelegant. I didn't get very far before determining that actually performing any useful algorithms with this method would be difficult.

Then I tried another approach which is a variation of the matrix referred to elsewhere. The graph is a clojure map, where the keys are the nodes (not refs to nodes), and the values are another map in which the keys are the neighboring nodes and single value of each key is the edge to that node, represented either as a numerical value indicating the strength of the edge, or an edge structure which I defined elsewhere.

It looks like this, sort of, for 1->2, 1->3, 2->5, 5->2

(def graph {node-1 {node-2 edge12, node-3 edge13},
            node-2 {node-5 edge25},
            node-3 nil ;;no edge leaves from node 3
            node-5 {node-2 edge52}) ;; nodes 2 and 5 have an undirected edge

To access the neighbors of node-1 you go (keys (graph node-1)) or call the function defined elsewhere (neighbors graph node-1), or you can say ((graph node-1) node-2) to get the edge from 1->2.

Several advantages:

  1. Constant time lookup of a node in the graph and of a neighboring node, or return nil if it doesn't exist.
  2. Simple and flexible edge definition. A directed edge exists implicitly when you add a neighbor to a node entry in the map, and its value (or a structure for more information) is provided explicitly, or nil.
  3. You don't have to look up the existing node to do anything to it. It's immutable, so you can define it once before adding it to the graph and then you don't have to chase it around getting the latest version when things change. If a connection in the graph changes, you change the graph structure, not the nodes/edges themselves.
  4. This combines the best features of a matrix representation (the graph topology is in the graph map itself not encoded in the nodes and edges, constant time lookup, and non-mutating nodes and edges), and the adjacency-list (each node "has" a list of its neighboring nodes, space efficient since you don't have any "blanks" like a canonical sparse matrix).
  5. You can have multiples edges between nodes, and if you accidentally define an edge which already exists exactly, the map structure takes care of making sure you are not duplicating it.
  6. Node and edge identity is kept by clojure. I don't have to come up with any sort of indexing scheme or common reference point. The keys and values of the maps are the things they represent, not a lookup elsewhere or ref. Your node structure can be all nils, and as long as it's unique, it can be represented in the graph.

The only big-ish disadvantage I see is that for any given operation (add, remove, any algorithm), you can't just pass it a starting node. You have to pass the whole graph map and a starting node, which is probably a fair price to pay for the simplicity of the whole thing. Another minor disadvantage (or maybe not) is that for an undirected edge you have to define the edge in each direction. This is actually okay because sometimes an edge has a different value for each direction and this scheme allows you to do that.

The only other thing I see here is that because an edge is implicit in the existence of a key-value pair in the map, you cannot define a hyperedge (ie one which connects more than 2 nodes). I don't think this is a big deal necessarily since most graph algorithms I've come across (all?) only deal with an edge that connects 2 nodes.

这篇关于如何在Clojure中创建循环(和不可变)数据结构,而不需要额外的间接?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆