Why am I getting extra text nodes as child nodes of the root node?
Question
I want to print the child elements of the root node. This is my XML file:
<?xml version="1.0"?>
<!-- Hi -->
<company>
<staff id="1001">
<firstname>yong</firstname>
<lastname>mook kim</lastname>
<nickname>mkyong</nickname>
<salary>100000</salary>
</staff>
<staff id="2001">
<firstname>low</firstname>
<lastname>yin fong</lastname>
<nickname>fong fong</nickname>
<salary>200000</salary>
</staff>
</company>
As I understand it, the root node is 'company' and its child nodes should be 'staff' and 'staff' (since there are two 'staff' elements). But when I try to get them through my Java code, I get 5 child nodes. Where are the 3 extra text nodes coming from?
Java code:
package com.training.xml;

import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadingXML {
    public static void main(String[] args) {
        try {
            File file = new File("D:\\TestFile.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document document = dBuilder.parse(file);
            document.getDocumentElement().normalize();
            System.out.println("root element: " + document.getDocumentElement().getNodeName());
            Node rootNode = document.getDocumentElement(); // save the root node in a variable
            System.out.println("root: " + rootNode.getNodeName());
            NodeList nList = rootNode.getChildNodes(); // all child nodes, not only elements
            for (int i = 0; i < nList.getLength(); i++) {
                System.out.println("node name: " + nList.item(i).getNodeName());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Output:
root element: company
root: company
node name: #text
node name: staff
node name: #text
node name: staff
node name: #text
Why are these three text nodes showing up here?
Recommended answer
"Why are these three text nodes showing up here?"
They're the whitespace between the child elements. If you only want the child elements, you should just ignore nodes of other types:
for (int i = 0; i < nList.getLength(); i++) {
    Node node = nList.item(i);
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println("node name: " + node.getNodeName());
    }
}
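To see that the extra nodes really are just whitespace, the following minimal sketch may help. It parses a trimmed, inlined copy of the question's XML (an assumption, so that no file on disk is needed) and reports, for each child of the root, whether a text node contains anything other than whitespace. The class name WhitespaceNodesDemo and the helper describeRootChildren are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class WhitespaceNodesDemo {
    // Trimmed, inlined copy of the question's XML (assumption: avoids reading a file).
    static final String XML =
        "<company>\n"
      + "  <staff id=\"1001\"><firstname>yong</firstname></staff>\n"
      + "  <staff id=\"2001\"><firstname>low</firstname></staff>\n"
      + "</company>";

    // Returns one description per child node of the root element.
    static List<String> describeRootChildren(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node.getNodeType() == Node.TEXT_NODE) {
                // Make the invisible content visible: only spaces and line breaks?
                boolean blank = node.getNodeValue().trim().isEmpty();
                out.add("#text (whitespace only: " + blank + ")");
            } else {
                out.add(node.getNodeName());
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        for (String line : describeRootChildren(XML)) {
            System.out.println(line);
        }
    }
}
```

Each line break and indentation run between the tags becomes its own text node, which is why two staff elements are surrounded by three text nodes.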
Or you could change your document so that it doesn't contain that whitespace.
Or you could use a different XML API that lets you easily ask for just the elements. (The DOM API is a pain in various ways.)
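The answer does not name a specific alternative API. One option that ships with the JDK is javax.xml.xpath, which still runs over a DOM document but lets you select only elements. The sketch below assumes the root element is called company, as in the question; the class name XPathChildElements and the helper childElementNames are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class XPathChildElements {
    // Returns the names of the element children of <company>, skipping text nodes.
    static List<String> childElementNames(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // In XPath, "*" on the child axis matches elements only, never text or comments.
        NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("/company/*", doc, XPathConstants.NODESET);
        List<String> names = new ArrayList<>();
        for (int i = 0; i < matches.getLength(); i++) {
            names.add(matches.item(i).getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Inlined sample input (assumption: a shortened version of the question's file).
        String xml = "<company>\n  <staff id=\"1001\"/>\n  <staff id=\"2001\"/>\n</company>";
        System.out.println(childElementNames(xml));
    }
}
```

Third-party libraries offer similar element-only accessors, but those are outside what the answer spells out.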
If you only want to ignore element content whitespace, you can use Text.isElementContentWhitespace.
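A caveat worth illustrating: the parser can only tell that whitespace is element content when it knows the content model, which in practice usually means a DTD (or schema) plus validation. Without one, as in the question's file, the method will typically return false. The sketch below therefore adds an internal DTD and enables validation; both are assumptions that are not part of the original document, and the class and helper names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class ElementContentWhitespaceDemo {
    // Assumption: an internal DTD is added so the parser knows <company> holds only elements.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE company [\n"
      + "  <!ELEMENT company (staff*)>\n"
      + "  <!ELEMENT staff (firstname)>\n"
      + "  <!ATTLIST staff id CDATA #REQUIRED>\n"
      + "  <!ELEMENT firstname (#PCDATA)>\n"
      + "]>\n"
      + "<company>\n"
      + "  <staff id=\"1001\"><firstname>yong</firstname></staff>\n"
      + "  <staff id=\"2001\"><firstname>low</firstname></staff>\n"
      + "</company>";

    // Returns the names of the root's children, skipping element content whitespace.
    static List<String> childrenIgnoringWhitespace(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD so the content model is known
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Fail loudly instead of letting validation problems pass silently.
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) throws SAXParseException { throw e; }
            @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        NodeList children = doc.getDocumentElement().getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Text && ((Text) node).isElementContentWhitespace()) {
                continue; // whitespace that exists only for layout
            }
            names.add(node.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(childrenIgnoringWhitespace(XML));
    }
}
```

Unlike the node-type filter, this check keeps text nodes that carry real content and drops only whitespace the content model marks as ignorable.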