用DOM(Java)解析XML文件 [英] Parsing XML file with DOM (Java)

查看:109
本文介绍了用DOM(Java)解析XML文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想解析以下url: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=224589801

I want to parse the following url: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=224589801

因此,我想出了以下方法:

As a result I came up with the following method:

public void parseXml2(String URL) {
    DOMParser parser = new DOMParser();

    try {
        parser.parse(new InputSource(new URL(URL).openStream()));
        Document doc = parser.getDocument();

        NodeList nodeList = doc.getElementsByTagName("Item");
        for (int i = 0; i < nodeList.getLength(); i++) {
            Node n = nodeList.item(i);
            Node actualNode = n.getFirstChild();
            if (actualNode != null) {
                System.out.println(actualNode.getNodeValue());
            }
        }

    } catch (SAXException ex) {
        Logger.getLogger(TaxMapperXml.class.getName()).log(Level.SEVERE, null, ex);
    } catch (IOException ex) {
        Logger.getLogger(TaxMapperXml.class.getName()).log(Level.SEVERE, null, ex);
    }
}

使用此方法,我可以使用项目的值节点,但我不能承担任何属性。我尝试用NamedNodeMap来尝试getAttribute(),但仍然没有用。

With this method I can take the values of the Item nodes but I can't take any of their attributes. I tried experimenting with getAttribute() with NamedNodeMap but still to no avail.


  1. 为什么我必须做 n.getFirstChild()。getNodeValue(); 获取实际值? n.getNodeValue()只返回null?是不是这个反直觉 - 显然在我的情况下,节点没有子节点?

  1. Why do I have to do n.getFirstChild().getNodeValue(); to get the actual value? n.getNodeValue() returns just null? Isn't this counter-intuitive - obviously in my case node's doesn't have subnodes?

有没有更强大和广泛接受的解析XML文件的方法使用DOM?我的文件最多不会是15-20行,所以SAX不是必需的(或者是吗?)

Is there some more robust and widely accepted way of parsing XML files using DOM? My files aren't gonna be big 15-20 lines at most, so SAX isn't necessary (or is it?)


推荐答案

import java.io.IOException;
import java.net.URL;
import org.apache.xerces.parsers.DOMParser;

import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XMLParser {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub
        parseXml2("http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=nucleotide&id=224589801");
    }

    public static void parseXml2(String URL) {
        DOMParser parser = new DOMParser();

        try {
            parser.parse(new InputSource(new URL(URL).openStream()));
            Document doc = parser.getDocument();

            NodeList nodeList = doc.getElementsByTagName("Item");
            for (int i = 0; i < nodeList.getLength(); i++) {
                System.out.print("Item "+(i+1));
                Node n = nodeList.item(i);
                NamedNodeMap m = n.getAttributes();
                System.out.print(" Name: "+m.getNamedItem("Name").getTextContent());
                System.out.print(" Type: "+m.getNamedItem("Type").getTextContent());
                Node actualNode = n.getFirstChild();
                if (actualNode != null) {
                    System.out.println(" "+actualNode.getNodeValue());
                } else {
                    System.out.println(" ");                    
                }
            }

        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

完成示例代码并添加几条线来获取属性。

Completed the sample code and added a few lines to get the attributes.

这应该让你开始,虽然我觉得你需要了解DOM的基本概念。 网站(和许多其他网站)可以帮助您。最重要的是理解不同种类的节点。

This should get you started, although I feel that you need to get yourself up to date with the basic notions of DOM. This site (and many others) can help you with that. Most importantly is understanding the different kinds of nodes there are.

这篇关于用DOM(Java)解析XML文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆