Java special chars replace

Question

I have a text: "Csuklási roham gyötörheti a svédeket, annyit emlegetik mostanság ismét a svéd modellt Magyarországon."

In that original text there are no line breaks at all.

When I email this text (with gmail), I get it encoded as the following:

Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable

Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon. 

In HTML:

Content-Type: text/html; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable


<span class=3D"Apple-style-span" style=3D"font-family: Helvetica, Verdana, = sans-serif; font-size: 15px; ">Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket= , annyit emlegetik mostans=E1g ism=E9t a sv=E9d modellt Magyarorsz=E1gon.

....

When I try to parse the email body as text/plain, I cannot get rid of the = sign in "mostans=E1g = ism=E9t" between the two words. Note that the same character is missing from the HTML encoded message. I don't have any idea what that special character might be, but I need to eliminate it to get back the original text.

I tried to replace '\n', but it's not that one: if I hit 'Enter' in the text, I can correctly replace it with whatever character I want. I also tried '\r' and '\t'.

So the question is, what am I missing? Where does that special character come from? Is it because of the charset and/or the transfer encoding? If so, what do I have to do to solve the problem and get the original text back?

Any help is welcome.

Cheers, Balázs

Recommended answer

You need to use MimeUtility. Here is an example.

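import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;

import javax.mail.MessagingException;
import javax.mail.internet.MimeUtility;

public class Mime {
    public static void main(String[] args) throws MessagingException,
            IOException {
        // "mime" is a file holding the quoted-printable body.
        InputStream stringStream = new FileInputStream("mime");
        // Wrap the raw stream so that =XX escapes are turned back into bytes
        // and "=" soft line breaks are dropped.
        InputStream output = MimeUtility.decode(stringStream,
                "quoted-printable");
        System.out.println(convertStreamToString(output));
    }

    public static String convertStreamToString(InputStream is)
            throws IOException {
        /*
         * To convert the InputStream to String we use the Reader.read(char[]
         * buffer) method. We iterate until the Reader returns -1, which means
         * there's no more data to read. We use the StringWriter class to
         * produce the string.
         */
        if (is != null) {
            Writer writer = new StringWriter();

            char[] buffer = new char[1024];
            try {
                // The message header declares charset=ISO-8859-2, so the
                // decoded bytes are read with that charset.
                Reader reader = new BufferedReader(new InputStreamReader(is,
                        "ISO-8859-2"));
                int n;
                while ((n = reader.read(buffer)) != -1) {
                    writer.write(buffer, 0, n);
                }
            } finally {
                is.close();
            }
            return writer.toString();
        } else {
            return "";
        }
    }
}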

The file "mime" contains the encoded text:

Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon.
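As for where the stray "=" comes from: in quoted-printable, a "=" at the very end of a line is a soft line break. The encoder inserts it to keep encoded lines within the 76-character limit, and a decoder must remove it together with the line break that follows. That is why replacing '\n' on its own leaves the "=" behind. The following is only a minimal sketch of what a decoder does, using nothing but the JDK; the class name QuotedPrintableSketch and its decode method are hypothetical names made up for this illustration, and it is not how MimeUtility is implemented. For real mail handling, MimeUtility remains the safer choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of JavaMail.
class QuotedPrintableSketch {

    /** Decodes a quoted-printable body into text using the given charset. */
    public static String decode(String encoded, Charset charset) {
        // Quoted-printable bodies are plain ASCII on the wire.
        byte[] in = encoded.getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            if (b != '=') {
                out.write(b);                       // ordinary character
                i++;
            } else if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;                             // soft line break "=\n"
            } else if (i + 2 < in.length && in[i + 1] == '\r'
                    && in[i + 2] == '\n') {
                i += 3;                             // soft line break "=\r\n"
            } else if (i + 2 < in.length && hex(in[i + 1]) >= 0
                    && hex(in[i + 2]) >= 0) {
                // "=XX" escape: two hex digits give one raw byte.
                out.write((hex(in[i + 1]) << 4) | hex(in[i + 2]));
                i += 3;
            } else {
                out.write(b);                       // malformed, keep as-is
                i++;
            }
        }
        // Only now are the raw bytes interpreted in the message charset.
        return new String(out.toByteArray(), charset);
    }

    /** Returns the value of a hex digit, or -1 if the byte is not one. */
    private static int hex(byte b) {
        return Character.digit((char) b, 16);
    }

    public static void main(String[] args) {
        String body = "mostans=E1g =\nism=E9t a sv=E9d modellt";
        System.out.println(decode(body, Charset.forName("ISO-8859-2")));
    }
}
```

With the sample body from the question, the "=" after "mostans=E1g " and the line break after it should disappear, so the two words end up separated by a single space again. Note that the charset is applied after the escapes are decoded: the transfer encoding and the character set are two separate layers.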

Update:

Using the Guava library (the InputSupplier API used here exists only in older Guava releases):

    InputSupplier<InputStream> supplier = new InputSupplier<InputStream>() {
        @Override
        public InputStream getInput() throws IOException {
            InputStream inStream = new FileInputStream("mime");
            InputStream decodedStream=null;
            try {
                decodedStream = MimeUtility.decode(inStream,
                "quoted-printable");
            } catch (MessagingException e) {
                e.printStackTrace();
            }
            return decodedStream;
        }
    };
    InputSupplier<InputStreamReader> result = CharStreams
    .newReaderSupplier(supplier, Charsets.ISO_8859_1);
    String ans = CharStreams.toString(result);
    System.out.println(ans);
