Generate a DAG from a poset using strictly functional programming


Problem description


Here is my problem: I have a sequence S of (nonempty but possibly not distinct) sets s_i, and for each s_i need to know how many sets s_j in S (i ≠ j) are subsets of s_i.

I also need incremental performance: once I have all my counts, I may replace one set s_i by some subset of s_i and update the counts incrementally.

Performing all this using purely functional code would be a huge plus (I code in Scala).

As set inclusion is a partial ordering, I thought the best way to solve my problem would be to build a DAG representing the Hasse diagram of the sets, with edges representing inclusion, and to attach to each node an integer equal to the size of the sub-DAG below it plus 1. However, I have been stuck for several days trying to develop the algorithm that builds the Hasse diagram from the partial ordering (let's not even talk about incrementality!), even though I thought it would be standard undergraduate material.

Here is my data structure:

case class HNode[A] (
  val v: A,
  val child: List[HNode[A]]) {
  val rank = 1 + child.map(_.rank).sum
}
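As a quick illustration of how this rank definition behaves on a DAG (rather than a tree), consider a diamond where two parents share one child; the node names below are made up for the example. Because rank sums over children, a shared node is counted once per path:

```scala
// Diamond-shaped DAG: top -> {left, right} -> bottom (bottom is shared).
case class HNode[A](v: A, child: List[HNode[A]]) {
  val rank: Int = 1 + child.map(_.rank).sum
}

val bottom = HNode("bottom", Nil)            // rank 1
val left   = HNode("left", List(bottom))     // rank 2
val right  = HNode("right", List(bottom))    // rank 2
val top    = HNode("top", List(left, right)) // rank 1 + 2 + 2 = 5
// Only 4 distinct nodes lie at or below top, so 5 over-counts bottom.
```

The accepted answer below rewrites the rank evaluation with an explicit visited set precisely to avoid this double counting.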

My DAG is defined by a list of roots and some partial ordering:

class Hasse[A](val po: PartialOrdering[A], val roots: List[HNode[A]]) {
  def +(v: A): Hasse[A] = new Hasse[A](po, add(v, roots))

  private def collect(v: A, roots: List[HNode[A]], collected: List[HNode[A]]): List[HNode[A]] =
    if (roots == Nil) collected
    else {
      val (subsets, remaining) = roots.partition(r => po.lteq(r.v, v))
      collect(v, remaining.map(_.child).flatten, subsets.filter(r => !collected.exists(c => po.lteq(r.v, c.v))) ::: collected)
    }
}

I am pretty stuck here. The latest approach I came up with to add a new value v to the DAG is:

  1. find all "root subsets" rs_i of v in the DAG, i.e., subsets of v such that no superset of rs_i is a subset of v. This can be done quite easily by performing a search (BFS or DFS) on the graph (collect function, possibly non-optimal or even flawed).
  2. build the new node n_v, the children of which are the previously found rs_i.
  3. Now, let's find out where n_v should be attached: for a given list of roots, find out the supersets of v. If none are found, add n_v to the roots and remove subsets of n_v from the roots. Otherwise, perform step 3 recursively on the supersets' children.
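To make the three steps concrete, here is a tiny hand-run on plain Set[Int] values (the sets and the subsetOf checks are illustrative, not part of the DAG code):

```scala
// Existing sets in the structure, and the new value to insert.
val existing = List(Set(1), Set(2), Set(1, 2, 3))
val v = Set(1, 2)

// Step 1: root subsets of v — strict subsets of v among the existing sets,
// keeping only those not contained in another collected subset of v.
val subsetsOfV = existing.filter(s => s.subsetOf(v) && s != v)
// Set(1) and Set(2) both qualify and neither contains the other,
// so both are root subsets.

// Step 2: the new node n_v for Set(1, 2) gets Set(1) and Set(2) as children.
// Step 3: Set(1, 2, 3) is a superset of v, so n_v is attached below its node.
```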

I have not yet fully implemented this algorithm, but it already seems unnecessarily convoluted and non-optimal for my apparently simple problem. Is there some simpler algorithm available (Google was clueless on this)?

Solution

After some work, I finally ended up solving my problem, following my initial intuition. The collect method and rank evaluation were flawed; I rewrote them, with tail recursion as a bonus. Here is the code I obtained:

import scala.annotation.tailrec

final case class HNode[A](
  val v: A,
  val child: List[HNode[A]]) {
  val rank: Int = 1 + count(child, Set.empty)

  @tailrec
  private def count(stack: List[HNode[A]], c: Set[HNode[A]]): Int =
    if (stack == Nil) c.size
    else {
      val head :: rem = stack
      if (c(head)) count(rem, c)
      else count(head.child ::: rem, c + head)
    }
}
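A minimal check of the corrected rank: below is a self-contained copy of the class above (with a pattern match standing in for the deconstructing val, same behavior), applied to a diamond where two parents share one child. The visited set ensures the shared node is counted once:

```scala
import scala.annotation.tailrec

final case class HNode[A](v: A, child: List[HNode[A]]) {
  val rank: Int = 1 + count(child, Set.empty)

  // Iterative DFS over the sub-DAG, collecting distinct nodes in c.
  @tailrec
  private def count(stack: List[HNode[A]], c: Set[HNode[A]]): Int =
    stack match {
      case Nil         => c.size
      case head :: rem =>
        if (c(head)) count(rem, c)
        else count(head.child ::: rem, c + head)
    }
}

// Diamond: top -> {left, right} -> bottom; bottom is shared by both parents.
val bottom = HNode("bottom", Nil)
val left   = HNode("left", List(bottom))
val right  = HNode("right", List(bottom))
val top    = HNode("top", List(left, right))
// top.rank == 4 (top, left, right, bottom): bottom is no longer double-counted.
```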

// ...

  private def add(v: A, roots: List[HNode[A]]): List[HNode[A]] = {
    val newNode = HNode(v, collect(v, roots, Nil))
    attach(newNode, roots)
  }

  private def attach(n: HNode[A], roots: List[HNode[A]]): List[HNode[A]] =
    if (roots.contains(n)) roots
    else {
      val (supersets, remaining) = roots.partition { r =>
        // Strict superset to avoid creating cycles in case of equal elements
        po.tryCompare(n.v, r.v) == Some(-1)
      }
      if (supersets.isEmpty) n :: remaining.filter(r => !po.lteq(r.v, n.v))
      else {
        supersets.map(s => HNode(s.v, attach(n, s.child))) ::: remaining
      }
    }

  @tailrec
  private def collect(v: A, stack: List[HNode[A]], collected: List[HNode[A]]): List[HNode[A]] =
    if (stack == Nil) collected
    else {
      val head :: tail = stack

      if (collected.exists(c => po.lteq(head.v, c.v))) collect(v, tail, collected)
      else if (po.lteq(head.v, v)) collect(v, tail, head :: (collected.filter(c => !po.lteq(c.v, head.v))))
      else collect(v, head.child ::: tail, collected)
    }
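The fragments above can be assembled into a runnable whole. The class wrapper, the HNode definition, and the Set-inclusion PartialOrdering below are filled in as assumptions, since the original only shows the methods; the method bodies themselves follow the answer's code:

```scala
import scala.annotation.tailrec

// The corrected HNode from the answer (pattern match replaces the deconstructing val).
final case class HNode[A](v: A, child: List[HNode[A]]) {
  val rank: Int = 1 + count(child, Set.empty)

  @tailrec
  private def count(stack: List[HNode[A]], c: Set[HNode[A]]): Int =
    stack match {
      case Nil         => c.size
      case head :: rem =>
        if (c(head)) count(rem, c)
        else count(head.child ::: rem, c + head)
    }
}

class Hasse[A](val po: PartialOrdering[A], val roots: List[HNode[A]]) {
  def +(v: A): Hasse[A] = new Hasse[A](po, add(v, roots))

  private def add(v: A, roots: List[HNode[A]]): List[HNode[A]] = {
    val newNode = HNode(v, collect(v, roots, Nil))
    attach(newNode, roots)
  }

  private def attach(n: HNode[A], roots: List[HNode[A]]): List[HNode[A]] =
    if (roots.contains(n)) roots
    else {
      val (supersets, remaining) = roots.partition { r =>
        // Strict superset to avoid creating cycles in case of equal elements
        po.tryCompare(n.v, r.v) == Some(-1)
      }
      if (supersets.isEmpty) n :: remaining.filter(r => !po.lteq(r.v, n.v))
      else supersets.map(s => HNode(s.v, attach(n, s.child))) ::: remaining
    }

  @tailrec
  private def collect(v: A, stack: List[HNode[A]], collected: List[HNode[A]]): List[HNode[A]] =
    stack match {
      case Nil => collected
      case head :: tail =>
        if (collected.exists(c => po.lteq(head.v, c.v))) collect(v, tail, collected)
        else if (po.lteq(head.v, v)) collect(v, tail, head :: collected.filter(c => !po.lteq(c.v, head.v)))
        else collect(v, head.child ::: tail, collected)
    }
}

// Subset inclusion as a PartialOrdering (assumed helper, not in the original).
object SetInclusion extends PartialOrdering[Set[Int]] {
  def tryCompare(x: Set[Int], y: Set[Int]): Option[Int] =
    if (x == y) Some(0)
    else if (x.subsetOf(y)) Some(-1)
    else if (y.subsetOf(x)) Some(1)
    else None
  def lteq(x: Set[Int], y: Set[Int]): Boolean = x.subsetOf(y)
}

val h = List(Set(1), Set(2), Set(1, 2), Set(1, 2, 3))
  .foldLeft(new Hasse[Set[Int]](SetInclusion, Nil))(_ + _)
// h.roots holds the single maximal set Set(1, 2, 3), whose rank is 4.
```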

I now must check some optimizations:

  - cut off branches with totally distinct sets when collecting subsets (as Rex Kerr suggested)
  - see if sorting the sets by size improves the process (as mitchus suggested)

The next problem is to work out the worst-case complexity of the add() operation. With n the number of sets and d the size of the largest set, the complexity will probably be O(n²d), but I hope it can be refined. Here is my reasoning: if all sets are pairwise incomparable (no set contains another), the DAG reduces to a flat sequence of roots/leaves. Thus, every time I try to add a node to the data structure, I still have to check for inclusion against each node already present (in both the collect and attach procedures). This leads to 1 + 2 + … + n = n(n+1)/2 ∈ O(n²) inclusion checks.

Each set inclusion test is O(d), hence the result.
