Scala HTML parser object usage


Problem description

I am using the HTML parser to parse an HTML string:

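// Assumption: Source is scala.xml.Source, whose fromString returns the
// org.xml.sax.InputSource that loadXML is expected to take.
import scala.xml.Source
import nu.validator.htmlparser.{sax,common}
import sax.HtmlParser
import common.XmlViolationPolicy

// response holds the HTML text to parse
val source = Source.fromString(response)
val html = new models.HTML5Parser
val htmlObject = html.loadXML(source)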

How do I pull values for specific elements in the object? I can get the child and the label using this:

val child = htmlObject.child(1).label

But I don't know how to get the content of the child. Also, I don't know how to iterate through the child objects.

Solution

It's unclear where your HTML5Parser class comes from, but I'm going to assume it's the one in this example (or something similar). In that case your htmlObject is just a scala.xml.Node. First for some setup:

val source = Source.fromString(
  "<html><head/><body><div class='main'><span>test</span></div></body></html>"
)

val htmlObject = html.loadXML(source)

Now you can do the following, for example:

scala> htmlObject.child(1).label
res0: String = body

scala> htmlObject.child(1).child(0).child(0).text
res1: String = test

scala> (htmlObject \\ "span").text
res2: String = test

scala> (htmlObject \ "body" \ "div" \ "span").text
res3: String = test

scala> (htmlObject \\ "div").head.attributes.asAttrMap
res4: Map[String,String] = Map(class -> main)

Etcetera.
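
The question also asks how to iterate over the child objects. Here is a minimal sketch, assuming htmlObject really is a scala.xml.Node as described above. To stay self-contained it parses the same well-formed markup with scala.xml.XML.loadString instead of the HTML5Parser, and the names ChildIteration, childLabels and allLabels are made up for illustration:

```scala
import scala.xml.{Elem, Node, XML}

object ChildIteration {
  // Stand-in for htmlObject (assumption: the markup is well-formed,
  // so the plain XML loader can parse it).
  val htmlObject: Elem = XML.loadString(
    "<html><head/><body><div class='main'><span>test</span></div></body></html>"
  )

  // Labels of the direct element children; text nodes are skipped.
  def childLabels(node: Node): List[String] =
    node.child.collect { case e: Elem => e.label }.toList

  // Labels of the node itself and every element below it, in document order.
  def allLabels(node: Node): List[String] =
    node.descendant_or_self.collect { case e: Elem => e.label }

  def main(args: Array[String]): Unit = {
    // node.child is an ordinary sequence, so foreach, map and for all work.
    for (c <- htmlObject.child) println(s"${c.label}: '${c.text}'")
    println(childLabels(htmlObject))
    println(allLabels(htmlObject))
  }
}
```

Because child and the results of \ and \\ are sequences of nodes, the usual collection methods (map, filter, collect, foreach) apply to them directly; text gives the concatenated text content of a node and its descendants.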
