Groovy XMLSlurper issue

Views: 204 | Published: 2018/5/30 10:15:16 | Tags: xhtml groovy dtd xmlslurper


Question


I want to use XmlSlurper to parse an HTML document that I read using HTTPBuilder. Initially I tried to do it this way:

def response = http.get(path: "index.php", contentType: TEXT)
def slurper = new XmlSlurper()
def xml = slurper.parse(response)

But it produces an exception:

java.io.IOException: Server returned HTTP response code: 503 for URL: http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd

I found a workaround: providing cached DTD files. I also found a simple implementation of a class that should help here:

class CachedDTD {
/**
 * Return DTD 'systemId' as InputSource.
 * @param publicId
 * @param systemId
 * @return InputSource for locally cached DTD.
 */
  def static entityResolver = [
          resolveEntity: { publicId, systemId ->
            try {
              String dtd = "dtd/" + systemId.split("/").last()
              Logger.getRootLogger().debug "DTD path: ${dtd}"
              new org.xml.sax.InputSource(CachedDTD.class.getResourceAsStream(dtd))
            } catch (e) {
              //e.printStackTrace()
              Logger.getRootLogger().fatal "Fatal error", e
              null
            }
          }
  ] as org.xml.sax.EntityResolver

}

My package tree looks as shown below:

[Image: package tree, https://i.stack.imgur.com/1gqF9.jpg]

I also slightly modified the code that parses the response, so it now looks like this:

def response = http.get(path: "index.php", contentType: TEXT)
def slurper = new XmlSlurper()
slurper.setEntityResolver(org.yuri.CachedDTD.entityResolver)
def xml = slurper.parse(response)

But now I'm getting a java.net.MalformedURLException. The DTD path logged by the CachedDTD entityResolver is org/yuri/dtd/xhtml1-transitional.dtd, and I can't get it working...
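
The post does not say what causes the MalformedURLException. One thing worth ruling out (an assumption, not something the post confirms) is that getResourceAsStream returns null because the DTD is not found on the classpath, which leaves the InputSource without a usable stream. Below is a minimal sketch of a resolver that fails loudly in that case. The class name ClasspathDtdResolver and the basePath value are hypothetical; only the org/yuri/dtd location is taken from the logged path above.

```groovy
import org.xml.sax.EntityResolver
import org.xml.sax.InputSource

// Hypothetical helper, not part of the original post.
class ClasspathDtdResolver implements EntityResolver {
    // Assumed absolute classpath location of the cached DTD files.
    String basePath = '/org/yuri/dtd/'

    @Override
    InputSource resolveEntity(String publicId, String systemId) {
        // Keep only the file name, e.g. "xhtml1-transitional.dtd".
        String name = systemId.tokenize('/').last()
        InputStream stream = getClass().getResourceAsStream(basePath + name)
        if (stream == null) {
            // Report the missing resource instead of returning an empty InputSource.
            throw new FileNotFoundException("No cached DTD on classpath: ${basePath}${name}")
        }
        InputSource source = new InputSource(stream)
        // Keep the original system id so relative references inside the DTD can be resolved.
        source.systemId = systemId
        return source
    }
}
```

It would be plugged in with slurper.setEntityResolver(new ClasspathDtdResolver()). Note that the XHTML 1.0 Transitional DTD references further entity files (xhtml-lat1.ent, xhtml-symbol.ent, xhtml-special.ent), so those would probably need to be cached in the same directory as well.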

Solution

There is an HTML parser that you can use in conjunction with XmlSlurper to address these problems:

http://sourceforge.net/projects/nekohtml/

Sample usage here:

http://groovy.codehaus.org/Testing+Web+Applications
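
For reference, the combination suggested above can be sketched as follows. This is a minimal example, not taken from the linked page: it assumes Groovy 3 or later (for groovy.xml.XmlSlurper), the Maven coordinates net.sourceforge.nekohtml:nekohtml:1.9.22 fetched through @Grab, and a hard-coded HTML string in place of the HTTPBuilder response.

```groovy
@Grab('net.sourceforge.nekohtml:nekohtml:1.9.22')
import org.cyberneko.html.parsers.SAXParser
import groovy.xml.XmlSlurper  // on Groovy 2.x, drop this import (groovy.util.XmlSlurper is a default import)

// NekoHTML's SAX parser tolerates unbalanced HTML; it is used here without the DTD handling shown earlier.
def parser = new SAXParser()
// NekoHTML upper-cases element names by default; ask for lower case instead.
parser.setProperty('http://cyberneko.org/html/properties/names/elems', 'lower')

// Hand the lenient parser to XmlSlurper instead of the default XML parser.
def slurper = new XmlSlurper(parser)

// Sample input (assumption); in the question this would be the HTTPBuilder response.
def html = slurper.parseText('<html><body><p>Hello<br>world</body></html>')

println html.body.p.text()
```

With the response from the question, slurper.parse(response) would take the place of parseText.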
