Decoding quoted-printable correctly

Question

I have the following string:

=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt=...

The decoded text should read:

[proconact-Verbesserung #279] (Neu) Stellvertretungen Benutzerrecht - andere können für andere Stellvertretungen erstellen ändern usw. dadurch ist der Schutz der Aktiviäten Mails nicht gewährt.

I am looking for a way to decode the quoted string.

I have tried:

private static string DecodeQuotedPrintables(string input, string charSet) {
    Encoding enc = new ASCIIEncoding();
    try {
        enc = Encoding.GetEncoding(charSet);
    } catch {
        enc = new UTF8Encoding();
    }

    var occurences = new Regex(@"(=[0-9A-Z]{2}){1,}", RegexOptions.Multiline);
    var matches = occurences.Matches(input);

    foreach (Match match in matches) {
        try {
            byte[] b = new byte[match.Groups[0].Value.Length / 3];
            for (int i = 0; i < match.Groups[0].Value.Length / 3; i++) {
                b[i] = byte.Parse(match.Groups[0].Value.Substring(i * 3 + 1, 2), System.Globalization.NumberStyles.AllowHexSpecifier);
            }
            char[] hexChar = enc.GetChars(b);
            input = input.Replace(match.Groups[0].Value, hexChar[0].ToString());
        } catch { ;}
    }
    input = input.Replace("?=", "").Replace("=\r\n", "");

    return input;
}

When I call it (where s is my string):

var x = DecodeQuotedPrintables(s, "utf-8");

this returns:

=?utf-8?Q?[proconact_-_Verbesserung_#_(Neu)_Stellvertretungen_Benutzerrecht_-_andere_können_für_andere_Stellvertretungen_erstellen_ändern_usw._dadurch_ist_der_Schutz_der_Aktiviäten_Mails_nicht_gewährt=...

What can I do so that the underscores, the leading =?utf-8?Q? and the trailing =.. are removed as well?

Answer

The text you’re trying to decode is typically found in MIME headers, and is encoded according to the specification defined in the following Internet standard: RFC 2047: MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text.

There is a sample implementation for such a decoder on GitHub; maybe you can draw some ideas from it: RFC2047 decoder in C#.

You can also use this online tool for comparing your results: Online MIME Headers Decoder.

Note that your sample text is incorrect. The specification declares:

encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

Per the specification, any encoded word must end in ?=. Thus, your sample must be corrected from:

=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt=

…to (scroll to the far right):

=?utf-8?Q?=5Bproconact_=2D_Verbesserung_=23=32=37=39=5D_=28Neu=29_Stellvertretungen_Benutzerrecht_=2D_andere_k=C3=B6nnen_f=C3=BCr_andere_Stellvertretungen_erstellen_=C3=A4ndern_usw=2E_dadurch_ist_der_Schutz_der_Aktivi=C3=A4ten_Mails_nicht_gew=C3=A4hrt?=
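Once the terminator is in place, the sample matches the encoded-word grammar quoted above and can be decoded in one pass. The following is only a minimal sketch, not the GitHub implementation mentioned earlier: the class name EncodedWordSketch and its methods are made up for illustration. It assumes the charset name is one that Encoding.GetEncoding recognizes on your runtime (it throws otherwise), and it handles the two encodings RFC 2047 defines, Q and B. In the Q encoding an underscore stands for a space and =XX stands for one raw byte, so the bytes have to be collected first and converted to characters afterwards. That is what lets multi-byte UTF-8 sequences such as =C3=B6 come out as a single character.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class EncodedWordSketch
{
    // encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
    private static readonly Regex EncodedWord = new Regex(
        @"=\?(?<charset>[^?]+)\?(?<encoding>[QqBb])\?(?<text>[^?]*)\?=");

    // Replaces every well-formed encoded word in the input with its decoded text.
    // Anything that does not match the grammar is left untouched.
    public static string Decode(string input)
    {
        return EncodedWord.Replace(input, m =>
        {
            // Assumption: the charset is known to the runtime; GetEncoding throws if not.
            Encoding enc = Encoding.GetEncoding(m.Groups["charset"].Value);
            string text = m.Groups["text"].Value;
            byte[] bytes = m.Groups["encoding"].Value.ToUpperInvariant() == "B"
                ? Convert.FromBase64String(text)
                : DecodeQ(text);
            return enc.GetString(bytes);
        });
    }

    // Q encoding: "_" is a space, "=XX" is one byte in hex, everything else is literal ASCII.
    private static byte[] DecodeQ(string text)
    {
        var bytes = new List<byte>(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            byte value;
            if (c == '_')
            {
                bytes.Add(0x20);
            }
            else if (c == '=' && i + 2 < text.Length &&
                     byte.TryParse(text.Substring(i + 1, 2), NumberStyles.AllowHexSpecifier,
                                   CultureInfo.InvariantCulture, out value))
            {
                bytes.Add(value);
                i += 2; // skip the two hex digits just consumed
            }
            else
            {
                bytes.Add((byte)c); // literal character, ASCII per the specification
            }
        }
        return bytes.ToArray();
    }
}
```

For example, EncodedWordSketch.Decode("=?utf-8?Q?k=C3=B6nnen_f=C3=BCr?=") returns "können für". Applied to the corrected sample, the result begins with "[proconact - Verbesserung #279] (Neu)". The uncorrected sample, which lacks the closing ?=, does not match the pattern and is returned unchanged.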

Strictly speaking, your sample is also invalid because it exceeds the 75-character limit imposed on any encoded word; however, most decoders tend to be tolerant of this non-conformity.
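If you want to detect this case rather than silently accept it, a length check is enough, because the 75-character limit covers the whole encoded word including the charset, the encoding and the delimiters. The sketch below is illustrative only; the class and member names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class EncodedWordLengthSketch
{
    // RFC 2047 limit on a single encoded word, delimiters included.
    public const int MaxEncodedWordLength = 75;

    // True when the encoded word is longer than the specification allows.
    public static bool ExceedsLimit(string encodedWord)
    {
        return encodedWord.Length > MaxEncodedWordLength;
    }
}
```

A tolerant decoder could log a warning when ExceedsLimit returns true and decode the word anyway.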
