How to strip whitespace-only text nodes from a DOM before serialization?

Views: 25  Published: 2021/12/18 14:02:14  Tags: java xml dom whitespace

Question

I have some Java (5.0) code that constructs a DOM from various (cached) data sources, removes certain element nodes that are not required, and then serializes the result into an XML string using:

// Serialize DOM back into a string
Writer out = new StringWriter();
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "no");
tf.transform(new DOMSource(doc), new StreamResult(out));
return out.toString();

However, since I'm removing several element nodes, I end up with a lot of extra whitespace in the final serialized document.

Is there a simple way to remove or collapse the extraneous whitespace in the DOM before (or while) it is serialized into a String?

Recommended answer

You can find empty (or whitespace-only) text nodes using XPath, then remove them programmatically like so:

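import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Precaution: merge adjacent text nodes left behind by earlier element
// removals, so that each run of whitespace is a single text node.
doc.normalize();

XPathFactory xpathFactory = XPathFactory.newInstance();
// XPath to find text nodes that are empty or contain only whitespace.
XPathExpression xpathExp = xpathFactory.newXPath().compile(
        "//text()[normalize-space(.) = '']");
NodeList emptyTextNodes = (NodeList)
        xpathExp.evaluate(doc, XPathConstants.NODESET);

// Remove each empty text node from the document.
for (int i = 0; i < emptyTextNodes.getLength(); i++) {
    Node emptyTextNode = emptyTextNodes.item(i);
    emptyTextNode.getParentNode().removeChild(emptyTextNode);
}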
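
To see the whole round trip in one place, here is a minimal, self-contained sketch that parses a small document, removes an element (as in the question), strips the whitespace-only text nodes with the XPath shown above, and serializes the result with the question's Transformer settings. The class name `WhitespaceStripper`, its method names, and the sample XML are illustrative assumptions, not part of the original code. The call to `doc.normalize()` is an added precaution: removing elements can leave several text nodes side by side, and an XPath engine may treat such a run as one logical node.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
class WhitespaceStripper {

    /** Parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Removes every text node that is empty or contains only whitespace. */
    public static void stripWhitespaceNodes(Document doc) throws Exception {
        // Assumption/precaution: merge adjacent text nodes first.
        doc.normalize();
        XPathExpression expr = XPathFactory.newInstance().newXPath()
                .compile("//text()[normalize-space(.) = '']");
        NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            node.getParentNode().removeChild(node);
        }
    }

    /** Serializes the DOM with the same output properties as the question. */
    public static String serialize(Document doc) throws Exception {
        StringWriter out = new StringWriter();
        Transformer tf = TransformerFactory.newInstance().newTransformer();
        tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        tf.setOutputProperty(OutputKeys.INDENT, "no");
        tf.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>\n  <keep>text</keep>\n  <drop/>\n</root>");

        // Simulate the question's scenario: remove an unwanted element,
        // which leaves its surrounding whitespace behind.
        Node drop = doc.getElementsByTagName("drop").item(0);
        drop.getParentNode().removeChild(drop);

        stripWhitespaceNodes(doc);
        System.out.println(serialize(doc)); // prints <root><keep>text</keep></root>
    }
}
```

Text nodes that contain any non-whitespace character, such as the content of `<keep>`, are left untouched, including their own leading or trailing spaces.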

This approach might be useful if you want more control over node removal than is easily achieved with an XSL template.
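
For comparison, the XSL template route mentioned above could look roughly like the following sketch, which drops whitespace-only text nodes while the DOM is being serialized instead of modifying the DOM beforehand. It uses an identity transform plus an empty template that matches the same XPath predicate. The class name `XsltWhitespaceStripper`, the method names, the inline stylesheet, and the sample XML are all assumptions made for illustration; they do not come from the original answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
class XsltWhitespaceStripper {

    // Identity transform, plus an empty template that swallows text nodes
    // consisting only of whitespace. The predicate template has a higher
    // default priority than the generic node() template, so it wins.
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' "
        + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"text()[normalize-space(.) = '']\"/>"
        + "</xsl:stylesheet>";

    /** Parses an XML string into a DOM Document. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Serializes the DOM, skipping whitespace-only text nodes on the way out. */
    public static String serializeWithoutWhitespace(Document doc) throws Exception {
        Transformer tf = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        tf.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>\n  <keep>text</keep>\n  \n</root>");
        System.out.println(serializeWithoutWhitespace(doc));
    }
}
```

The trade-off is that the stylesheet leaves the in-memory DOM unchanged and only affects the serialized output, whereas the XPath loop lets you decide node by node (in ordinary Java code) what to remove.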
