What is "=C2=A0" in MIME encoded, quoted-printable text?
Question
This is an example of a raw email I am trying to parse:
MIME-version: 1.0
Content-type: text/html; charset=UTF-8
Content-transfer-encoding: quoted-printable
X-Mailer: Verizon Webmail
X-Originating-IP: [x.x.x.x]
=C2=A0test testing testing 123
What is =C2=A0? I have tried half a dozen quoted-printable parsers, but none of them handles this correctly. How would one properly parse this in C#?
Honestly, for now, I'm coding:
//TODO WTF
encoded = encoded.Replace("=C2=A0", "");
That is because I can't figure out why that text appears seemingly at random within the MIME content, and it doesn't seem to be rendered into anything. By just removing it, I'm getting the desired effect, but why?
To be clear, I know that (=[0-9A-F]{2}) is an encoded character. But in this case, it seemingly represents nothing.
Recommended answer
=C2=A0 represents the bytes C2 A0. Since the charset is UTF-8, those two bytes decode to U+00A0, the Unicode code point for a non-breaking space. So it does not represent nothing: it renders as a space, which is easy to overlook at the start of a line.
See UTF-8 (Wikipedia).
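To make the two steps concrete (quoted-printable gives bytes, then the charset gives characters), here is a minimal sketch in C#. The class and method names (QuotedPrintableSketch, Decode) are hypothetical and not taken from any library, and the sketch assumes the caller has already read the charset from the Content-type header. It is not a full RFC 2045 implementation. A parser that turns each =XX into a character on its own, instead of collecting bytes first, would likely produce two characters here rather than one non-breaking space, though that is only a guess about the parsers mentioned in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class QuotedPrintableSketch
{
    // Hypothetical helper: decode quoted-printable text to raw bytes first,
    // then interpret those bytes with the charset named in the Content-type header.
    public static string Decode(string encoded, Encoding charset)
    {
        var bytes = new List<byte>(encoded.Length);
        for (int i = 0; i < encoded.Length; i++)
        {
            char c = encoded[i];
            if (c != '=')
            {
                // Literal characters in quoted-printable are expected to be ASCII.
                bytes.Add((byte)c);
                continue;
            }

            // Soft line break: "=" followed by CRLF (or a bare LF) is dropped.
            if (i + 2 < encoded.Length && encoded[i + 1] == '\r' && encoded[i + 2] == '\n')
            {
                i += 2;
                continue;
            }
            if (i + 1 < encoded.Length && encoded[i + 1] == '\n')
            {
                i += 1;
                continue;
            }

            // "=XX": two hex digits encode exactly one byte.
            if (i + 2 < encoded.Length &&
                byte.TryParse(encoded.Substring(i + 1, 2), NumberStyles.AllowHexSpecifier,
                              CultureInfo.InvariantCulture, out byte b))
            {
                bytes.Add(b);
                i += 2;
                continue;
            }

            // Malformed escape: keep the "=" unchanged.
            bytes.Add((byte)'=');
        }

        // Only now are the bytes turned into characters, so C2 A0 becomes one code point.
        return charset.GetString(bytes.ToArray());
    }
}
```

Calling QuotedPrintableSketch.Decode("=C2=A0test testing testing 123", Encoding.UTF8) should return a string whose first character is U+00A0, followed by "test testing testing 123". If the non-breaking space is unwanted, replacing '\u00A0' with an ordinary space after decoding is a more targeted fix than deleting the literal text "=C2=A0" before decoding.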