How does Apache Commons IO convert my XML header from UTF-8 to UTF-16?


Question

I'm using Java 6. I have an XML template, which begins like so:

<?xml version="1.0" encoding="UTF-8"?>

However, I notice that when I parse and output it with the following code (using Apache Commons IO 2.4) ...

    Document doc = null;
    InputStream in = this.getClass().getClassLoader().getResourceAsStream("my-template.xml");

    try
    {
        byte[] data = org.apache.commons.io.IOUtils.toByteArray( in );
        InputSource src = new InputSource(new StringReader(new String(data)));

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        doc = builder.parse(src);
    }
    finally
    {
        in.close();
    }

the first line is output as

<?xml version="1.0" encoding="UTF-16"?>

What do I need to do when parsing/outputting the file so that the header encoding remains "UTF-8"?

Per the suggestion given, I changed my code to

    Document doc = null;
    InputStream in = this.getClass().getClassLoader().getResourceAsStream(name);

    try
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        doc = builder.parse(in);
    }
    finally
    {
        in.close();
    }

But despite the fact that the first line of my input template file is

<?xml version="1.0" encoding="UTF-8"?>

when I output the document as a String, it produces

<?xml version="1.0" encoding="UTF-16"?>

as the first line. Here's what I use to output the "doc" object as a string:

private String getDocumentString(Document doc)
{
    DOMImplementationLS domImplementation = (DOMImplementationLS)doc.getImplementation();
    LSSerializer lsSerializer = domImplementation.createLSSerializer();
    return lsSerializer.writeToString(doc);  
}

Recommended answer

It turns out that when I changed my Document -> String method to

private String getDocumentString(Document doc)
{
    String ret = null;
    DOMSource domSource = new DOMSource(doc);
    StringWriter writer = new StringWriter();
    StreamResult result = new StreamResult(writer);
    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer;
    try
    {
        transformer = tf.newTransformer();
        transformer.transform(domSource, result);
        ret = writer.toString();
    }
    catch (TransformerConfigurationException e)
    {
        e.printStackTrace();
    }
    catch (TransformerException e)
    {
        e.printStackTrace();
    }
    return ret;
}

the 'encoding="UTF-8"' header was no longer output as 'encoding="UTF-16"'.
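A likely explanation: Commons IO is not involved, since IOUtils.toByteArray only copies bytes. The DOM Level 3 Load and Save specification says LSSerializer.writeToString returns a DOMString, whose encoding is UTF-16, so the serializer declares UTF-16 in the header. If you would rather keep LSSerializer than switch to a Transformer, one option is to serialize through an LSOutput with an explicit encoding. The following is a minimal sketch under that assumption; the class name XmlHeaderDemo and the helpers parse and toStringWithLs are made up for illustration, and it is written to compile on Java 6.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;

public class XmlHeaderDemo {

    // Hypothetical helper: parses an XML string from its UTF-8 bytes,
    // so the parser reads the encoding from the declaration itself.
    static Document parse(String xml) {
        try {
            byte[] bytes = xml.getBytes("UTF-8");
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: serializes with LSSerializer, but asks for a
    // UTF-8 declaration through LSOutput instead of calling writeToString.
    static String toStringWithLs(Document doc) {
        DOMImplementationLS ls = (DOMImplementationLS) doc.getImplementation();
        LSSerializer serializer = ls.createLSSerializer();

        LSOutput output = ls.createLSOutput();
        output.setEncoding("UTF-8");        // encoding named in the XML declaration
        StringWriter writer = new StringWriter();
        output.setCharacterStream(writer);  // characters go to the writer as-is

        serializer.write(doc, output);
        return writer.toString();
    }

    public static void main(String[] args) {
        Document doc = parse("<?xml version=\"1.0\" encoding=\"UTF-8\"?><root/>");

        // writeToString: the declaration is expected to name UTF-16.
        DOMImplementationLS ls = (DOMImplementationLS) doc.getImplementation();
        System.out.println(ls.createLSSerializer().writeToString(doc));

        // LSOutput with an explicit encoding: the declaration should name UTF-8.
        System.out.println(toStringWithLs(doc));
    }
}
```

Note that the returned String is still UTF-16 in memory either way; only the declaration changes. If the string is later written to a file or socket, the bytes should be encoded as UTF-8 so that they match the header. With the Transformer approach from the answer, the declared encoding comes from the output properties (UTF-8 unless overridden), and it can be made explicit with transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8").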
