scala.xml.RuleTransformer 的复杂性真的是指数级的吗? [英] Is complexity of scala.xml.RuleTransformer really exponential?

查看:47
本文介绍了scala.xml.RuleTransformer 的复杂性真的是指数级的吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是我之前的一个的后续帖子.

This is a follow-up to one of my previous posts.

我试图理解为什么 RuleTransformer 性能太差了.现在我相信它这么慢是因为它的复杂度是 O(2n),其中 n 是输入 XML 树的高度.

I tried to understand why the RuleTransformer performance is so poor. Now I believe that it is so slow because its complexity is O(2n), where n is the height of the input XML tree.

假设我需要将所有元素的所有标签重命名为标签b":

Suppose I need to rename all labels of all elements to label "b":

import scala.xml._, scala.xml.transform._

val rule: RewriteRule = new RewriteRule() {
  override def transform(node: Node): Seq[Node] = node match {
    case e: Elem => e.copy(label = "b")
    case other => other
  }
}

def trans(node: Node): Node = new RuleTransformer(rule).apply(node)

让我们计算transform访问输入中每个节点的次数.
为了统计访问次数,我们添加了一个缓冲区visited,在开始时初始化它,存储访问过的节点,最后打印出来.

Let's count how many times the transform visits each node in input <a3><a2><a1/></a2></a3>.
In order to count the visits we add a buffer visited, init it in the beginning, store visited nodes, and print it in the end.

import scala.collection.mutable.ListBuffer

// buffer to store visited nodes
var visited: ListBuffer[Node] = ListBuffer[Node]()

val rule: RewriteRule = new RewriteRule() {
  override def transform(n: Node): Seq[Node] = {
    visited append (n) // count this visit
    n match {
      case e: Elem => e.copy(label = "b")
      case other => other
    }
  }
}

def trans(node: Node): Node = {
  visited = ListBuffer[Node]() // init the buffer
  val r = new RuleTransformer(rule).apply(node)
  // print visited nodes and numbers of visits 
  println(visited.groupBy(identity).mapValues(_.size).toSeq.sortBy(_._2))
  r
}

现在让我们在 REPL 中运行它并查看 visited

Now let's run it in REPL and see the visited

scala> val a3 = <a3><a2><a1/></a2></a3>
a3: scala.xml.Elem = <a3><a2><a1/></a2></a3>

scala> trans(a3)
ArrayBuffer((<a3><b><b/></b></a3>,2), (<a2><b/></a2>,4), (<a1/>,8))
res1: scala.xml.Node = <b><b><b/></b></b>

所以 a1 被访问了 8 次.

So a1 is visited eight times.

如果我们变换 那么a1 将被访问 16 次,对于 -- 32,等等.所以复杂度看起来指数.

If we transform <a4><a3><a2><a1/></a2></a3></a4> then a1 will be visited 16 times, for <a5><a4><a3><a2><a1/></a2></a3></a4></a5> -- 32, etc. So the complexity looks exponential.

有意义吗?你将如何通过分析 代码?

Does it make sense ? How would you prove it by analysis of the code ?

推荐答案

这不是一个非常严谨的分析,但问题似乎出在 BasicTransformer 的 transform(Seq[Node]) 方法[1].

This will be not a very rigours analysis, but the problem seems to be with the BasicTransformer's transform(Seq[Node]) method[1].

子变换方法将针对更改的节点调用两次.特别是在您的示例中,因此所有节点都将被调用两次.你是对的,高度为 h 的每个节点将被称为 2^(h-1) 次.还要注意,一个节点的一个子节点最多会被调用两次,因为使用了span和code,在具体的例子中,节点的所有子节点都会被调用两次.

The child transform method will be called twice for a node which is changed. Specifically in your example all the nodes will be called twice for this reason. And you are right in that each node at height h will be called 2^(h-1) times. Also note that at most one child of a node will be called twice because use of span and code, in the specific example all the child of the nodes will be called twice.

为了验证这一点,我们为修改后的 RulesTransformer 编写了这些代码示例.(我也可以覆盖 RulesTransformer.但无论如何)

Just to verify this wrote these code samples for modified RulesTransformer. ( I could have overriden RulesTransformer too. But anyway)

// This is same as library RuleTransformer added with print statement
class CopiedRuleTransformer(rules: RewriteRule*) extends BasicTransformer {
    override def transform(n: Node): Seq[Node] = {
        println(n)
        rules.foldLeft(super.transform(n)) { (res, rule) => rule transform res }
    }
}

我修改后的 RuleTransformer

My modified RuleTransformer

class MyRuleTransformer(rules: RewriteRule*) extends BasicTransformer {
    override def transform(n: Node): Seq[Node] = {
        println(n)
        rules.foldLeft(super.transform(n)) { (res, rule) => rule transform res }
    }  
    override def transform(ns:Seq[Node]): Seq[Node] = {
        ns.flatMap(n => transform(n))
    } 
}

这些代码仅用于演示目的.您可以将这些称为

These codes are only for demonstrating purpose. You can call these as

new CopiedRuleTransformer(rule).apply(node)

new MyRuleTransformer(rule).apply(node)

[1] 行:34 https://github.com/scala/scala-xml/blob/master/src/main/scala/scala/xml/transform/BasicTransformer.scala

这篇关于scala.xml.RuleTransformer 的复杂性真的是指数级的吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆