Which is the best library for XML parsing in Java?


Question

I'm looking for a Java library for parsing XML (complex configuration and data files). I googled a bit but couldn't find anything other than dom4j (it seems they are working on V2). I have taken a look at Commons Configuration but didn't like it, and the other Apache XML projects seem to be in hibernation. I haven't evaluated dom4j myself, but I wanted to know: does Java have other (good) open source XML parsing libraries? And how has your experience with dom4j been?

After @Voo's answer, let me ask a follow-up: should I use Java's built-in classes or a third-party library like dom4j? What are the advantages?

Recommended answer

Java actually supports four ways to parse XML out of the box:

DOM Parser/Builder: The whole XML structure is loaded into memory and you can use the well-known DOM methods to work with it. DOM also allows you to write the document out with XSLT transformations. Example:

public static void parse() throws ParserConfigurationException, IOException, SAXException {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(true);
    factory.setIgnoringElementContentWhitespace(true);
    DocumentBuilder builder = factory.newDocumentBuilder();
    File file = new File("test.xml");
    Document doc = builder.parse(file);
    // Do something with the document here.
}
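
As an illustration of what "do something with the document" might look like, here is a minimal sketch that reads attribute and text values from the parsed tree. The `<config>`/`<item>` structure and the `DomExample` class are made up for this example, and checked exceptions are wrapped only to keep the sketch short to call:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class DomExample {
    /** Returns "name=text" for every <item> element (hypothetical element name). */
    static List<String> readItems(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            List<String> result = new ArrayList<>();
            // getElementsByTagName searches the whole in-memory tree.
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                result.add(item.getAttribute("name") + "=" + item.getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<config><item name=\"a\">1</item><item name=\"b\">2</item></config>";
        System.out.println(readItems(xml)); // prints [a=1, b=2]
    }
}
```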

SAX Parser: Only for reading an XML document. The SAX parser runs through the document and calls the user's callback methods. There are methods for the start/end of a document, of an element and so on. They're defined in org.xml.sax.ContentHandler, and there's a helper class, DefaultHandler, that implements them all as no-ops.

public static void parse() throws ParserConfigurationException, SAXException, IOException {
    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setValidating(true);
    SAXParser saxParser = factory.newSAXParser();
    File file = new File("test.xml");
    // ElementHandler is your own subclass of DefaultHandler
    saxParser.parse(file, new ElementHandler());    // specify handler
}
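
The snippet above does not show the handler itself. A minimal sketch of what such a handler could look like follows; this `ElementHandler` simply counts element names and is an assumption for illustration, not the handler from the original slides:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class SaxExample {
    /** Hypothetical handler: counts how often each element name occurs. */
    static class ElementHandler extends DefaultHandler {
        final Map<String, Integer> counts = new LinkedHashMap<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            // Called once per opening tag, in document order.
            counts.merge(qName, 1, Integer::sum);
        }
    }

    static Map<String, Integer> countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementHandler handler = new ElementHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.counts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countElements("<config><item/><item/></config>")); // prints {config=1, item=2}
    }
}
```

Only the callbacks you care about need to be overridden; everything else falls through to the no-op implementations in DefaultHandler.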

StAX Reader/Writer: This works with a stream-oriented interface. The program asks for the next element when it is ready for it, just like with a cursor or iterator. You can also create documents with it. Read document:

public static void parse() throws XMLStreamException, IOException {
    try (FileInputStream fis = new FileInputStream("test.xml")) {
        XMLInputFactory xmlInFact = XMLInputFactory.newInstance();
        XMLStreamReader reader = xmlInFact.createXMLStreamReader(fis);
        while(reader.hasNext()) {
            reader.next(); // do something here
        }
    }
}
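
To show what "do something here" could be, the following sketch checks the event type returned by `next()` and collects the text of every `<item>` element. The element name and the `StaxReadExample` class are assumptions made for this example:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxReadExample {
    /** Collects the text content of every <item> element (hypothetical element name). */
    static List<String> readItemTexts(String xml) {
        List<String> texts = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next(); // pull the next event when we are ready for it
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText reads up to the matching end tag of a text-only element.
                    texts.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not read XML", e);
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(readItemTexts("<config><item>1</item><item>2</item></config>")); // prints [1, 2]
    }
}
```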

Write document:

public static void write() throws XMLStreamException, IOException {
    try (FileOutputStream fos = new FileOutputStream("test.xml")) {
        XMLOutputFactory xmlOutFact = XMLOutputFactory.newInstance();
        XMLStreamWriter writer = xmlOutFact.createXMLStreamWriter(fos);
        writer.writeStartDocument();
        writer.writeStartElement("test");
        // write stuff
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.flush(); // push any buffered output to the stream
        writer.close(); // frees the writer; the stream is closed by try-with-resources
    }
}
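
For the "write stuff" part, a minimal sketch that writes a nested element with an attribute and text into a string might look like this. The `<config>`/`<item>` names and the `StaxWriteExample` class are invented for illustration:

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class StaxWriteExample {
    /** Builds <config><item name="...">...</item></config> as a string. */
    static String buildXml(String name, String value) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            writer.writeStartDocument();
            writer.writeStartElement("config");
            writer.writeStartElement("item");
            writer.writeAttribute("name", name); // attributes must follow the start element directly
            writer.writeCharacters(value);       // special characters such as '<' are escaped
            writer.writeEndElement();            // closes <item>
            writer.writeEndElement();            // closes <config>
            writer.writeEndDocument();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildXml("a", "1 < 2"));
    }
}
```

The exact form of the XML declaration in the output depends on the StAX implementation in use.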

JAXB: The newest of the four APIs. Version 2 was bundled with the JDK from Java 6 on (it was removed from the JDK again in Java 11 and has been a separate dependency since then). It lets you deserialize (unmarshal) Java objects from a document. You read the document with an implementation of the javax.xml.bind.Unmarshaller interface, which you get from a JAXBContext created via JAXBContext.newInstance. The context has to be initialized with the classes it will use, but you only have to specify the root classes and don't have to worry about statically referenced classes. You use annotations to specify which classes should be elements (@XmlRootElement) and which fields are elements (@XmlElement) or attributes (@XmlAttribute, what a surprise!).

public static void parse() throws JAXBException, IOException {
    try (FileInputStream adrFile = new FileInputStream("test")) {
        JAXBContext ctx = JAXBContext.newInstance(RootElementClass.class);
        Unmarshaller um = ctx.createUnmarshaller();
        RootElementClass rootElement = (RootElementClass) um.unmarshal(adrFile);
    }
}

Write document:

public static void write(RootElementClass out) throws IOException, JAXBException {
    try (FileOutputStream adrFile = new FileOutputStream("test.xml")) {
        JAXBContext ctx = JAXBContext.newInstance(RootElementClass.class);
        Marshaller ma = ctx.createMarshaller();
        ma.marshal(out, adrFile);
    }
}

Examples shamelessly copied from some old lecture slides ;-)

Edit: About "which API should I use?". Well, it depends. Not all APIs have the same capabilities, as you can see, but if you have control over the classes you use to map the XML document, JAXB is my personal favorite: a really elegant and simple solution (though I haven't used it for really large documents, where it could get a bit complex). SAX is pretty easy to use too. Stay away from DOM unless you have a really good reason to use it; it's an old, clunky API in my opinion. I don't think there are any modern third-party libraries that offer anything especially useful that's missing from the standard library, and the standard libraries have the usual advantages of being extremely well tested, documented and stable.
