Non-breaking space in XML making it unparseable

Problem description

I have a value in my database that contains a non-breaking space. A legacy service reads this string from the database and builds an XML document from it. The problem is that the XML returned for this message cannot be parsed. When I open it in Notepad++, I see the byte xA0 where the non-breaking space should be, and once I remove that byte the XML parses. I also have older revisions of this XML file, produced by the same service, that have the characters "Â " in place of the non-breaking space. I recently changed the Tomcat server the service runs on, and the problem appeared after that change. I found this post, according to which my XML is encoded as ISO-8859-1, but the code I use to convert the XML to a string does not specify ISO-8859-1. Below is my code:

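import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;

private String nodeToString(Node node) {
    // The transformer writes characters into this in-memory writer; no bytes are produced here.
    StringWriter sw = new StringWriter();
    try {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // "no" keeps the XML declaration in the output.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        t.transform(new DOMSource(node), new StreamResult(sw));
    } catch (TransformerException te) {
        LOG.error("Exception during XML to String transformation", te);
    }
    return sw.toString();
}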

I want to know why my XML is unparseable, and why there is a "Â " in the older revisions of the XML file.

[Image: the problematic character as shown in Notepad++]

Also, when I open my XML in Notepad and try to save it, the encoding shown is ANSI. When I change it to UTF-8 and save, the XML becomes parseable.

New info: enforcing UTF-8 with transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8"); did not work. I am still getting the xA0 in my XML.
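
A note on why the property may have had no visible effect (an interpretation, not something confirmed in the question): nodeToString above hands the transformer a StringWriter, so the transformer emits characters, and OutputKeys.ENCODING can only influence the encoding named in the XML declaration. The actual bytes are chosen later, by whatever code turns the String into a file. The sketch below gives the transformer an OutputStream instead, so that it performs the UTF-8 encoding itself. The class name, the helper names, and the sample msg document are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class StreamResultSketch {

    // Hypothetical sample: a <msg> element whose text contains U+00A0.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("msg");
            root.setTextContent("a\u00A0b");
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes to a byte stream, so the declared encoding and the
    // bytes actually written are both decided by the transformer.
    static byte[] toUtf8Bytes(Document doc) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Parses the bytes back and returns the root element's text.
    static String rootText(byte[] xml) {
        try {
            Document parsed = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(new ByteArrayInputStream(xml));
            return parsed.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = toUtf8Bytes(sampleDocument());
        // The non-breaking space should survive the round trip.
        System.out.println("a\u00A0b".equals(rootText(xml)));
    }
}
```

The same idea applies when writing to disk: passing a FileOutputStream to StreamResult avoids any intermediate String whose later encoding depends on the platform default.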

Recommended answer

The issue was that my version of Java was somehow saving my file in the ANSI encoding. I noticed this when I opened the file in Notepad and tried to save it. The older files were in UTF-8. So all I did was specify the UTF-8 encoding explicitly when writing the file:

// Name the charset explicitly so the bytes do not depend on the platform default encoding.
try (Writer out = new BufferedWriter(new OutputStreamWriter(
        new FileOutputStream(fileName.trim()), StandardCharsets.UTF_8))) {
    out.write(data);
}
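
To see why the explicit charset matters, the following sketch shows how the single character U+00A0 turns into different bytes depending on the charset, and what those bytes look like when decoded with the wrong one. It is an illustration only: the class and method names are made up, and ISO-8859-1 stands in for the Windows "ANSI" code page as an assumption (the two agree on the bytes used here).

```java
import java.nio.charset.StandardCharsets;

class NbspEncodingDemo {

    // Formats bytes as space-separated uppercase hex, e.g. "C2 A0".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes text as UTF-8, then decodes the bytes as ISO-8859-1,
    // which is what a viewer assuming a single-byte charset would show.
    static String misreadUtf8AsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
    }

    // Encodes text as ISO-8859-1, then decodes the bytes as UTF-8,
    // which is what a reader trusting a UTF-8 declaration would attempt.
    static String misreadLatin1AsUtf8(String text) {
        return new String(text.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";

        // One byte in a single-byte charset: the xA0 seen in Notepad++.
        System.out.println(hex(nbsp.getBytes(StandardCharsets.ISO_8859_1))); // A0

        // Two bytes in UTF-8.
        System.out.println(hex(nbsp.getBytes(StandardCharsets.UTF_8))); // C2 A0

        // UTF-8 bytes viewed as a single-byte charset: "Â" followed by a non-breaking space.
        System.out.println(misreadUtf8AsLatin1(nbsp).equals("\u00C2\u00A0")); // true

        // A lone 0xA0 is not valid UTF-8; Java substitutes U+FFFD here,
        // whereas a strict XML parser would typically reject the input.
        System.out.println(misreadLatin1AsUtf8(nbsp).equals("\uFFFD")); // true
    }
}
```

Read this way, the "Â " in the older revisions is consistent with UTF-8 files being viewed as ANSI, and the bare xA0 is consistent with the newer files being written in a single-byte default charset while being parsed as UTF-8.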
