DOM normalization: same result with or without normalize()
Question
After reading the answer here:
Normalization in DOM parsing with java - how does it work?
I understand that normalization will remove empty and adjacent text nodes, so I tried the following XML:
<company>hello
wor
ld
</company>
with the following code:
try {
    DocumentBuilder dBuilder = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder();
    Document doc = dBuilder.parse(file);
    doc.getDocumentElement().normalize();
    System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
    System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    System.out.println(doc.getDocumentElement().getChildNodes().item(0).getTextContent());
} catch (Exception e) {
    e.printStackTrace();
}
I always get 1 child node for the element "company", even without calling normalize(). The result is:
Root element :company
1
hello
wor
ld
So what is wrong here? Can anyone explain? Shouldn't I get "hello world" on one line?
Recommended answer
The parser is already creating a normalized DOM tree.
The normalize() method is useful when you're building or modifying the DOM yourself, which might leave the tree un-normalized; in that case the method will normalize it for you. Note that normalization only merges adjacent text nodes and removes empty ones. It never changes the characters inside a text node, so the line breaks in your element's text are preserved.
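To check this against the XML from the question, here is a minimal sketch. The class name ParsedTextDemo and the helpers parseRoot and collapseWhitespace are made up for illustration and are not part of the DOM API. It counts the root's children before and after normalize(), then collapses whitespace by hand, which is one possible way to get the text onto a single line.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ParsedTextDemo {

    // Hypothetical helper: parses an XML string and returns the root element.
    // Checked exceptions are wrapped so callers do not have to declare them.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: trims the text and collapses each run of
    // whitespace (including line breaks) into a single space.
    static String collapseWhitespace(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Same content as the XML in the question.
        Element root = parseRoot("<company>hello\nwor\nld\n</company>");

        int before = root.getChildNodes().getLength();
        root.normalize(); // no visible effect: the parsed tree is already normalized
        int after = root.getChildNodes().getLength();

        System.out.println("children before normalize(): " + before);
        System.out.println("children after normalize():  " + after);
        System.out.println("collapsed text: " + collapseWhitespace(root.getTextContent()));
    }
}
```

Collapsing whitespace is an application-level decision, which is why the sketch does it explicitly on the string returned by getTextContent() instead of expecting the DOM to do it.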
Common helper
// Prints a node and, recursively, all of its descendants, one level deeper per indent step
private static void printDom(String indent, Node node) {
    System.out.println(indent + node);
    for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling())
        printDom(indent + " ", child);
}
Example 1
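// Needs StringReader (java.io), InputSource (org.xml.sax), and the javax.xml.parsers and org.w3c.dom classes used below
public static void main(String[] args) throws Exception {
    String xml = "<Root>text 1<!-- test -->text 2</Root>";
    DocumentBuilder domBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = domBuilder.parse(new InputSource(new StringReader(xml)));
    printDom("", doc);        // as parsed: text, comment, text
    deleteComments(doc);
    printDom("", doc);        // two adjacent text nodes remain
    doc.normalizeDocument();
    printDom("", doc);        // adjacent text nodes merged into one
}
private static void deleteComments(Node node) {
    if (node.getNodeType() == Node.COMMENT_NODE)
        node.getParentNode().removeChild(node);
    else {
        NodeList children = node.getChildNodes();
        // The NodeList is live, so walk it backwards to avoid skipping a sibling after a removal
        for (int i = children.getLength() - 1; i >= 0; i--)
            deleteComments(children.item(i));
    }
}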
Output
[#document: null]
 [Root: null]
  [#text: text 1]
  [#comment:  test ]
  [#text: text 2]
[#document: null]
 [Root: null]
  [#text: text 1]
  [#text: text 2]
[#document: null]
 [Root: null]
  [#text: text 1text 2]
Example 2
public static void main(String[] args) throws Exception {
    DocumentBuilder domBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document doc = domBuilder.newDocument();
    Element root = doc.createElement("Root");
    doc.appendChild(root);
    // Three separate text nodes appended one after another
    root.appendChild(doc.createTextNode("Hello"));
    root.appendChild(doc.createTextNode(" "));
    root.appendChild(doc.createTextNode("World"));
    printDom("", doc);
    doc.normalizeDocument();
    printDom("", doc);
}
Output
[#document: null]
 [Root: null]
  [#text: Hello]
  [#text:  ]
  [#text: World]
[#document: null]
 [Root: null]
  [#text: Hello World]
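The examples above call Document.normalizeDocument(), while the question calls normalize() on the root element. As a minimal sketch (the class name ElementNormalizeDemo and the helper buildFragmentedRoot are hypothetical), the following builds the same kind of fragmented tree by hand, adds one empty text node, and shows that Node.normalize() on the element merges the adjacent text nodes and drops the empty one.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ElementNormalizeDemo {

    // Hypothetical helper: builds <Root> with three adjacent text nodes
    // plus one empty text node, as hand-written DOM code might.
    static Element buildFragmentedRoot() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("Root");
            doc.appendChild(root);
            root.appendChild(doc.createTextNode("Hello"));
            root.appendChild(doc.createTextNode(" "));
            root.appendChild(doc.createTextNode("World"));
            root.appendChild(doc.createTextNode("")); // empty text node
            return root;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    public static void main(String[] args) {
        Element root = buildFragmentedRoot();
        System.out.println("children before: " + root.getChildNodes().getLength());

        // Node.normalize() works on this element's subtree only.
        root.normalize();

        System.out.println("children after:  " + root.getChildNodes().getLength());
        System.out.println("text: " + root.getFirstChild().getNodeValue());
    }
}
```

Calling normalize() on an element is enough when only that subtree was modified; normalizeDocument() covers the whole document.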