Parsing XML file contents without knowing the XML file structure


Question

I've been learning some new techniques for parsing files with Java, and for the most part it's going well. However, I'm at a loss as to how to parse an XML file whose structure is not known upon receipt. There are lots of examples of how to do so if you know the structure (getElementsByTagName seems to be the way to go), but no dynamic options, at least none that I've found.

So, the tl;dr version of this question: how can I parse an XML file when I cannot rely on knowing its structure?

Recommended answer

The parsing part is easy. As helderdarocha stated in the comments, the parser only requires well-formed XML; it does not care about the structure. You can use Java's standard DocumentBuilder to obtain a Document:

InputStream in = new FileInputStream(...);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);

(If you're parsing multiple documents, you can keep reusing the same DocumentBuilder.)
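
As a minimal sketch of reusing one DocumentBuilder for several documents parsed one after another: the class name ReuseBuilder, the helper rootNames, and the sample XML strings are made up for illustration and are not part of the original answer. Checked exceptions are wrapped in an unchecked one only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ReuseBuilder {

    // Hypothetical helper: parses every XML string with the same DocumentBuilder
    // and returns the tag name of each document's root element.
    static List<String> rootNames(List<String> xmlDocs) {
        try {
            // Created once, then used for every parse() call below.
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            List<String> names = new ArrayList<>();
            for (String xml : xmlDocs) {
                Document doc = builder.parse(
                        new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
                names.add(doc.getDocumentElement().getTagName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Two unrelated documents, neither structure known in advance.
        System.out.println(rootNames(Arrays.asList("<objects/>", "<settings><a/></settings>")));
        // prints [objects, settings]
    }
}
```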

Then you can start with the root document element and use the familiar DOM methods from there on:

Element root = doc.getDocumentElement(); // perform DOM operations starting here.

As for processing it, that really depends on what you want to do with it, but you can use Node methods such as getFirstChild() and getNextSibling() to iterate through the children and process them as you see fit, based on structure, tags, and attributes.

Consider the following example:

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;   
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;


public class XML {

    public static void main (String[] args) throws Exception {

        String xml = "<objects><circle color='red'/><circle color='green'/><rectangle>hello</rectangle><glumble/></objects>";

        // parse
        InputStream in = new ByteArrayInputStream(xml.getBytes("utf-8"));
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);

        // process
        Node objects = doc.getDocumentElement();
        for (Node object = objects.getFirstChild(); object != null; object = object.getNextSibling()) {
            if (object instanceof Element) {
                Element e = (Element)object;
                if (e.getTagName().equalsIgnoreCase("circle")) {
                    String color = e.getAttribute("color");
                    System.out.println("It's a " + color + " circle!");
                } else if (e.getTagName().equalsIgnoreCase("rectangle")) {
                    String text = e.getTextContent();
                    System.out.println("It's a rectangle that says \"" + text + "\".");
                } else {
                    System.out.println("I don't know what a " + e.getTagName() + " is for.");
                }
            }
        }

    }

}

The input XML document (hard-coded for the sake of the example) is:

<objects>
    <circle color='red'/>
    <circle color='green'/>
    <rectangle>hello</rectangle>
    <glumble/>
</objects>

The output is:


It's a red circle!
It's a green circle!
It's a rectangle that says "hello".
I don't know what a glumble is for.
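
If even the tag names are unknown, the same Node methods can be applied recursively to discover whatever structure is there. The following is only a sketch under that assumption: the class name XmlWalker, the helpers walk and describe, and the output format are invented for illustration. It visits every element and reports its tag name, its attributes, and any text sitting directly inside it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

class XmlWalker {

    // Appends one indented line per element: tag name, attributes, direct text (if any).
    static void walk(Element element, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(element.getTagName());

        // Attributes are discovered at runtime; no names are assumed.
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            out.append(" @").append(attr.getNodeName()).append('=').append(attr.getNodeValue());
        }

        // Collect only the text nodes that are direct children of this element.
        StringBuilder text = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        String trimmed = text.toString().trim();
        if (!trimmed.isEmpty()) {
            out.append(" \"").append(trimmed).append('"');
        }
        out.append('\n');

        // Recurse into child elements, whatever they happen to be called.
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, depth + 1, out);
            }
        }
    }

    // Hypothetical convenience wrapper: parses the string and returns the outline.
    static String describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(describe("<objects><circle color='red'/><circle color='green'/>"
                + "<rectangle>hello</rectangle><glumble/></objects>"));
    }
}
```

For the sample document above this prints an outline with objects at the top and circle @color=red, circle @color=green, rectangle "hello", and glumble indented beneath it. When an element carries several attributes, the order in which NamedNodeMap returns them is not guaranteed, so code that depends on ordering should sort them first.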
