How to use JAXB with HTML?


Question

I would like to unmarshal some nasty HTML to a Java object using JAXB. (I'm on Java 7.)

TagSoup is a SAX-compliant parser that can handle nasty HTML.

How can I set up JAXB to use TagSoup for unmarshalling HTML?

I tried setting System.setProperty("org.xml.sax.driver", "org.ccil.cowan.tagsoup.Parser");

If I create an XMLReader myself, it uses TagSoup, but JAXB does not pick TagSoup up when it does the parsing.


  1. Does com.sun.xml.internal.bind.v2.runtime.JAXBContextImpl use DOM or SAX for parsing XML?
  2. How can I tell JAXB to use SAX?
  3. How can I tell JAXB to use TagSoup as its SAX implementation?

As per Blaise's suggestion, I tried the code below, but I get a SAXParseException on the last line. The parse works fine when done with the XMLReader alone:

    JAXBContext jaxbContext = JAXBContext.newInstance(Thing.class);
    Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();

    XMLReader xmlReader = new org.ccil.cowan.tagsoup.Parser();

    xmlReader.parse("file:///c:/test.xml");
    System.out.println("parse ok");

    xmlReader.setContentHandler(unmarshaller.getUnmarshallerHandler());

    //SAXParseException; systemId: file:/c:/test.xml; lineNumber: 5; columnNumber: 3; The element type "br" must be terminated by the matching end-tag "</br>".
    Thing thing = (Thing) unmarshaller.unmarshal(new File("c:/test.xml"));


Recommended answer

You can get an UnmarshallerHandler from an Unmarshaller and set it as the ContentHandler on your SAX parser. After the SAX parse completes, obtain the object from the UnmarshallerHandler rather than calling unmarshaller.unmarshal(...) on the file, since that call runs its own parse and does not go through your XMLReader.

    UnmarshallerHandler unmarshallerHandler = unmarshaller.getUnmarshallerHandler();
    xmlReader.setContentHandler(unmarshallerHandler);
    xmlReader.parse(...);
    Thing thing = (Thing) unmarshallerHandler.getResult();
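
The steps above can be put together as a small helper. This is only a minimal sketch, not a definitive implementation: the class name SaxUnmarshalSketch, the method unmarshalWith, and the Thing mapping (a root element named "thing" with a single "name" child) are hypothetical, since the question does not show how Thing is mapped. It assumes the javax.xml.bind API is available, as it is on Java 7.

```java
import javax.xml.bind.JAXBContext;
import javax.xml.bind.UnmarshallerHandler;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class SaxUnmarshalSketch {

    // Hypothetical mapping; the real Thing class is not shown in the question.
    @XmlRootElement(name = "thing")
    public static class Thing {
        @XmlElement
        public String name;
    }

    // Drives the parse with the supplied XMLReader and lets JAXB build the
    // object from the SAX events it receives through the UnmarshallerHandler.
    public static <T> T unmarshalWith(XMLReader reader, InputSource input, Class<T> type)
            throws Exception {
        JAXBContext context = JAXBContext.newInstance(type);
        UnmarshallerHandler handler = context.createUnmarshaller().getUnmarshallerHandler();

        // Register the handler BEFORE parsing, otherwise JAXB sees no events.
        reader.setContentHandler(handler);
        reader.parse(input);

        // The result comes from the handler, not from unmarshaller.unmarshal(...).
        return type.cast(handler.getResult());
    }
}
```

With TagSoup on the classpath, the call would look like unmarshalWith(new org.ccil.cowan.tagsoup.Parser(), new InputSource("file:///c:/test.xml"), Thing.class). As far as I know, TagSoup by default reports elements in the XHTML namespace and wraps the content in html and body elements, so the mapped class would likely need a matching root element and namespace; treat that as something to verify against your own input.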

There is an example of this on my blog:

  • http://blog.bdoughan.com/2011/05/jaxb-and-dtd.html
