XML: Invalid byte 1 of 1-byte UTF-8 sequence
Question
I have a program that takes two XML files and merges them into one. While doing this, I convert "and " to "and ’". Leaving aside why I'm doing this, here is the code snippet. If I remove the "’" replacement, the error no longer occurs, which is why I'm pasting it here.
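convertedString = replace(convertedString, "and ", "and ’");
// The source shows a bare double quote as the search string here; the original
// literal was probably an HTML entity that was decoded when the page was extracted.
convertedString = replace(convertedString, "\"", "\\\"");
convertedString = StringEscapeUtils.unescapeHtml(convertedString);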
The document is written out with this printDocument method:
private static void printDocument(Document doc, OutputStream out)
        throws IOException, TransformerException
{
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
    transformer.setOutputProperty(OutputKeys.METHOD, "xml");
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
    transformer.transform(new DOMSource(doc),
            new StreamResult(new OutputStreamWriter(out, "UTF-8")));
}
When I run the program, I get:
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1416)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2793)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
Although this might have to do with the UTF-8 setting in the printDocument() method, changing it to ISO-8859-1 did not help.
Could anyone help me figure out what the issue is? Much appreciated.
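The stack trace in the question ends in DocumentBuilderImpl.parse, so the exception is raised while XML is being read, not while printDocument() writes it. A minimal sketch of how this kind of failure can arise, assuming the bytes handed to the parser were produced with a non-UTF-8 charset: in windows-1252 the character ’ becomes the single byte 0x92, which cannot start a UTF-8 sequence. The class name, helper method, and sample document below are hypothetical and not taken from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

class MalformedUtf8Demo {
    // Hypothetical sample: declares UTF-8 and contains U+2019 (’).
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>rock \u2019n\u2019 roll</a>";

    /** Returns true if the bytes parse, false if the parser rejects them. */
    static boolean parses(byte[] bytes) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (IOException | SAXException e) {
            // Depending on the parser implementation, the malformed byte is
            // reported as an IOException subclass or as a SAXException.
            System.out.println("Parse failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Bytes that match the declared encoding.
        byte[] utf8 = XML.getBytes(StandardCharsets.UTF_8);
        // Bytes in a different charset: ’ is encoded as the lone byte 0x92.
        byte[] cp1252 = XML.getBytes(Charset.forName("windows-1252"));

        System.out.println("UTF-8 bytes parse: " + parses(utf8));
        System.out.println("windows-1252 bytes parse: " + parses(cp1252));
    }
}
```

The second call is expected to fail, because the parser trusts the UTF-8 declaration while the bytes say otherwise. Whether this is exactly what happens in the asker's program cannot be confirmed from the snippet alone.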
Recommended answer
If you are using Eclipse, try navigating to Preferences/General/Workspace, then change the "Text file encoding" setting to UTF-8.
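This setting presumably helps because it changes the default encoding used when the program is compiled and launched from Eclipse. A way to avoid depending on that default at all is to name the charset explicitly wherever text becomes bytes. The sketch below illustrates this under the assumption that the merged XML is held in a String before it is parsed; the class and method names are hypothetical, and the question's actual reading code is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class ExplicitUtf8RoundTrip {
    /** Parses XML held in a String, encoding it explicitly as UTF-8 first. */
    static Document parseUtf8(String xml) throws Exception {
        // Explicit charset: xml.getBytes() without an argument would use the
        // platform default, which may not be UTF-8.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    /** Serializes the document to UTF-8 bytes; the transformer does the encoding. */
    static byte[] toUtf8Bytes(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Passing the stream (not a Writer) keeps the declared encoding and
        // the bytes actually written in agreement.
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseUtf8("<a>rock \u2019n\u2019 roll</a>");
        byte[] written = toUtf8Bytes(doc);
        Document again = parseUtf8(new String(written, StandardCharsets.UTF_8));
        System.out.println(again.getDocumentElement().getTextContent());
    }
}
```

With the charset fixed in code, the result no longer depends on the IDE or operating system settings the program happens to run under.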