Unmarshalling XML with JAXB without unescaping characters


Question

Imagine the following situation: we receive an XML file from an external tool. Lately this XML can contain escaped characters in node names or within their richcontent tags, as in the following (simplified) example:

<map>
<node TEXT="Project">
<node TEXT="&#xe4;&#xe4;">
<richcontent TYPE="NOTE"><html>
  <head>

  </head>
  <body>
    <p>
      I am a Note for Node &#228;&#228;!
    </p>
  </body>
</html>
</richcontent>
</node>
</node>
</map>

After unmarshalling the file with JAXB, those escaped characters get unescaped. Unfortunately, I need them to stay the way they are, that is, escaped. Is there any way to avoid unescaping those characters while unmarshalling?

While researching I found many questions about marshalling XML files, where the opposite problem occurs, but those didn't help me either:

  • Question 1
  • Question 2

Is it even possible to achieve this with JAXB, or do we have to consider switching to a different XML reader API?

Thank you in advance, ymene

Recommended answer

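You only need to replace &# with &amp;# before the parser sees the input, so that each numeric character reference reaches JAXB as literal text. Wrap the input stream and call: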

unmarshaller.unmarshal(new AmpersandingStream(new FileInputStream(...)));

import java.io.IOException;
import java.io.InputStream;

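/**
 * Escapes the ampersand of numeric character references ("&#" becomes
 * "&amp;#") so that the XML parser hands them over as literal text.
 * Assumes an ASCII-compatible encoding such as UTF-8 or ISO-8859-1.
 */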
public class AmpersandingStream extends InputStream {

    private InputStream in;
    private boolean justReadAmpersand;
    private String lookAhead = "";

    public AmpersandingStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (!lookAhead.isEmpty()) {
            int c = lookAhead.codePointAt(0);
            lookAhead = lookAhead.substring(Character.charCount(c));
            return c;
        }
        int c = in.read();
        if (c == (int)'#' && justReadAmpersand) {
            c = (int)'a';
            lookAhead = "mp;#";
        }
        justReadAmpersand = c == (int)'&';
        return c;
    }

    @Override
    public int available() throws IOException {
        return in.available();
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // mark/reset are not supported: this wrapper keeps its own look-ahead
    // state, which a reset of the underlying stream would not restore.
    @Override
    public boolean markSupported() {
        return false;
    }

    // read(byte[]), read(byte[], int, int) and skip(long) are deliberately
    // not overridden. The InputStream defaults go through read(), so the
    // replacement also applies to bulk reads, which is what XML parsers
    // normally use. Delegating them to the wrapped stream would bypass it.

}
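
As a rough check of the idea, the sketch below parses the same attribute twice, once directly and once through a wrapping stream. It uses the JDK's built-in DOM parser instead of JAXB only so that it runs without extra dependencies; the escaping happens before any parser sees the bytes, so the same effect should carry over to an Unmarshaller. The names EscapePreservingDemo, AmpEscapingStream and parseText are hypothetical and not part of the answer above, and the wrapper is a compact variant that relies on the InputStream default bulk reads going through read().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class EscapePreservingDemo {

    /** Hypothetical compact wrapper: rewrites every "&#" to "&amp;#". */
    static class AmpEscapingStream extends InputStream {
        private final InputStream in;
        private String pending = "";   // bytes still to be emitted after a rewrite
        private boolean lastWasAmp;    // true if the previous byte was '&'

        AmpEscapingStream(InputStream in) {
            this.in = in;
        }

        @Override
        public int read() throws IOException {
            if (!pending.isEmpty()) {
                int c = pending.charAt(0);
                pending = pending.substring(1);
                return c;
            }
            int c = in.read();
            if (c == '#' && lastWasAmp) {
                // "&" was already emitted; continue with "amp;#"
                c = 'a';
                pending = "mp;#";
            }
            lastWasAmp = (c == '&');
            return c;
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
        // No other overrides: the inherited bulk reads call read().
    }

    /** Parses the stream and returns the TEXT attribute of the root element. */
    static String parseText(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(in);
        return doc.getDocumentElement().getAttribute("TEXT");
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<node TEXT=\"&#xe4;&#xe4;\"/>".getBytes(StandardCharsets.UTF_8);

        // Plain parse: the references are resolved to the character U+00E4.
        String plain = parseText(new ByteArrayInputStream(xml));
        // Wrapped parse: the references survive as literal text.
        String kept = parseText(new AmpEscapingStream(new ByteArrayInputStream(xml)));

        System.out.println(plain.equals("\u00e4\u00e4"));
        System.out.println(kept);
    }
}
```

Note that the rewrite is purely byte-based, so it also affects "&#" sequences inside comments and CDATA sections, and reading one byte at a time is slow for large files unless the source stream is buffered.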
