Producing valid XML with Java and UTF-8 encoding

Views: 654 Posted: 2017/8/16 19:36:05 Tags: java xml encoding utf-8

Problem description

I am using JAXP to generate and parse an XML document, some of whose fields are loaded from a database.

Code to serialize the XML:

DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.newDocument();
Element root = doc.createElement("test");
root.setAttribute("version", text);
doc.appendChild(root);

DOMSource domSource = new DOMSource(doc);
TransformerFactory tFactory = TransformerFactory.newInstance();

FileWriter out = new FileWriter("test.xml");
Transformer transformer = tFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.transform(domSource, new StreamResult(out));

Code to parse the XML:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("test.xml");

I get the following exception:

[Fatal Error] test.xml:1:4: Invalid byte 1 of 1-byte UTF-8 sequence.
Exception in thread "main" org.xml.sax.SAXParseException: Invalid byte 1 of 1-byte UTF-8 sequence.
    at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
    at com.test.Test.xml(Test.java:27)
    at com.test.Test.main(Test.java:55)

The String text includes u-umlaut and o-umlaut (character codes 0xFC and 0xF6). These are the characters that cause the error. When I escape the String myself to use &#xFC; and &#xF6;, the problem goes away. Other entities are encoded automatically when I write out the XML.

How do I get my output to be written and read properly without substituting these characters myself?

(I've read the following questions already:

How to encode characters from Oracle to XML?

Repairing wrong encoding in XML files)
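
A likely cause, though the question itself does not confirm it: FileWriter encodes characters with the platform default charset. On a JVM whose default is a single-byte charset such as ISO-8859-1 or windows-1252, the umlauts are written as the single bytes 0xFC and 0xF6 while the XML declaration claims UTF-8, and the parser then rejects those bytes. The following is a minimal sketch of one possible fix under that assumption: give the transformer an OutputStream so that it performs the UTF-8 encoding itself. The class name Utf8XmlRoundTrip and the helper methods writeXml and readVersion are hypothetical names introduced for illustration; they do not appear in the original code.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, for illustration only.
class Utf8XmlRoundTrip {

    // Builds <test version="..."/> and serializes it to the given file.
    static void writeXml(String text, File file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.newDocument();
        Element root = doc.createElement("test");
        root.setAttribute("version", text);
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        // Pass a byte stream rather than a FileWriter: the transformer then
        // encodes the characters itself, using the ENCODING property above,
        // instead of relying on the platform default charset.
        try (OutputStream out = new FileOutputStream(file)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    // Parses the file and returns the root element's "version" attribute.
    static String readVersion(File file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(file);
        return doc.getDocumentElement().getAttribute("version");
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("test", ".xml");
        file.deleteOnExit();
        String text = "\u00FC\u00F6"; // u-umlaut, o-umlaut
        writeXml(text, file);
        // Compare what was written with what is read back.
        System.out.println(text.equals(readVersion(file)));
    }
}
```

If a Writer is required, wrapping the FileOutputStream in an OutputStreamWriter constructed with StandardCharsets.UTF_8 should have the same effect, since the charset is then stated explicitly rather than inherited from the platform.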
