XML Invalid byte 1 of 1-byte UTF-8 sequence

Views: 303 Published: 2020/7/13 6:21:12 Tags: xml utf-8 byte


Problem description

I have a program that takes two XML files and merges them into one. While doing this, I convert "and " to "and ’". Leaving aside why I'm doing this, here is the code snippet. If I remove the "’", the error no longer occurs, which is why I'm pasting it here.

convertedString = replace(convertedString, (String)"and ", 
                (String)"and &#8217;");
convertedString = replace(convertedString, (String)"&quot;", 
                (String)"\\\"");
convertedString = StringEscapeUtils.unescapeHtml(convertedString);

with the printDocument method:

private static void printDocument(Document doc, OutputStream out)
        throws IOException, TransformerException {
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
    transformer.setOutputProperty(OutputKeys.METHOD, "xml");
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
    transformer.transform(new DOMSource(doc),
            new StreamResult(new OutputStreamWriter(out, "UTF-8")));
}

Running the program, I get:

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1416)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2793)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:235)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)

Even though it might have to do with UTF-8 in the printDocument() method, changing it to ISO-8859-1 did not help.

So could anyone help me figure out what the issue is? Much appreciated.
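
The stack trace shows the exception being raised inside DocumentBuilderImpl.parse, that is, while a file is being read rather than while printDocument() writes one. One possible cause, which the question does not confirm, is that an input file contains "’" stored as a single byte (for example 0x92, as in Windows-1252) while the parser decodes the file as UTF-8. The sketch below reproduces that situation under this assumption; the class name Utf8ParseSketch, the helper names, and the sample element <a> are hypothetical and do not come from the original program.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf8ParseSketch {

    // Parses raw bytes the way DocumentBuilder.parse(InputStream) reads a file.
    // With no XML declaration, the parser assumes UTF-8.
    static Document parse(byte[] xmlBytes) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
    }

    // Returns true if the parser accepts the bytes, false if it rejects them.
    static boolean parses(byte[] xmlBytes) {
        try {
            parse(xmlBytes);
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    // Builds "<a>and ?</a>" where the quote is the single byte 0x92.
    // 0x92 is a UTF-8 continuation byte with no lead byte, so it is not valid UTF-8.
    static byte[] singleByteQuoteXml() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] head = "<a>and ".getBytes(StandardCharsets.US_ASCII);
        byte[] tail = "</a>".getBytes(StandardCharsets.US_ASCII);
        out.write(head, 0, head.length);
        out.write(0x92);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // \u2019 is the right single quotation mark; encoded as UTF-8 it takes three bytes.
        byte[] utf8 = "<a>and \u2019</a>".getBytes(StandardCharsets.UTF_8);

        System.out.println("UTF-8 bytes parse:       " + parses(utf8));
        System.out.println("single-byte quote parse: " + parses(singleByteQuoteXml()));
    }
}
```

If this is what is happening, changing the output encoding in printDocument() would not be expected to help, because the failure occurs on the reading side. Checking how the merged string is turned into bytes before it is parsed (for instance, a getBytes() call without an explicit charset) may be a reasonable next step.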
