Decode a UTF-8 email header
Question
I have an email subject of the form:
=?utf-8?B?T3.....?=
The body of the email is UTF-8, base64 encoded, and has decoded fine. I am currently using Perl's Email::MIME module to decode the email.
What is the meaning of the =?utf-8 delimiter, and how do I extract information from this string?
Recommended answer
Encoded-word tokens (as per RFC 2047) can occur in the values of some headers. They are parsed as follows:
=?<charset>?<encoding>?<data>?=
The charset is UTF-8 in this case, and the encoding is B, which means base64 (the other option is Q, which is similar to quoted-printable).
To read it, first decode the base64 data, then interpret the resulting bytes as UTF-8 characters.
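The two steps above can be sketched by hand in Perl. This is only a minimal illustration, not a full RFC 2047 parser: the helper names parse_encoded_word and decode_b_word are hypothetical, the sample subject is made up (its payload is the base64 form of the UTF-8 bytes for "Olá"), and only a single B-encoded word is handled.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Split one encoded-word of the form =?<charset>?<encoding>?<data>?=
# into its three parts. Returns an empty list if the input does not match.
# (Hypothetical helper, for illustration only.)
sub parse_encoded_word {
    my ($word) = @_;
    my ($charset, $encoding, $data) =
        $word =~ /\A=\?([^?]+)\?([BbQq])\?([^?]*)\?=\z/
        or return;
    return ($charset, uc $encoding, $data);
}

# Decode a single B-encoded word into a Perl character string.
# (Hypothetical helper; Q encoding is deliberately not handled here.)
sub decode_b_word {
    my ($word) = @_;
    my ($charset, $encoding, $data) = parse_encoded_word($word)
        or die "not an encoded-word: $word";
    die "only B encoding is handled in this sketch\n" unless $encoding eq 'B';

    my $bytes = decode_base64($data);   # step 1: base64 text -> raw bytes
    return decode($charset, $bytes);    # step 2: raw bytes -> characters
}

# Made-up sample subject, not taken from the question.
my $text = decode_b_word('=?utf-8?B?T2zDoQ==?=');

binmode(STDOUT, ':encoding(UTF-8)');
print "$text\n";   # prints "Olá"
```

A real header may contain several encoded-words mixed with plain text, and may use the Q encoding, so in practice a library that implements RFC 2047 is the safer choice.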
Also read the various Internet Mail RFCs for more detail, mainly RFC 2047.
Since you are using Perl, Encode::MIME::Header could be of use:
SYNOPSIS
use Encode qw/encode decode/;
$utf8 = decode('MIME-Header', $header);
$header = encode('MIME-Header', $utf8);
ABSTRACT
This module implements RFC 2047 MIME header encoding. There are three variant encoding names: MIME-Header, MIME-B, and MIME-Q. The differences are described below.
              decode()           encode()
MIME-Header   Both B and Q       =?UTF-8?B?....?=
MIME-B        B only; Q croaks   =?UTF-8?B?....?=
MIME-Q        Q only; B croaks   =?UTF-8?Q?....?=
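As a concrete illustration of the MIME-Header variant, the sketch below decodes a B-encoded and a Q-encoded form of the same text and then round-trips the result through encode(). The sample header values are made up for this example (both carry the UTF-8 bytes for "Olá"); the exact string that encode() produces is printed rather than assumed.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up sample headers, not taken from the question.
my $b_header = '=?utf-8?B?T2zDoQ==?=';   # B (base64) form
my $q_header = '=?utf-8?Q?Ol=C3=A1?=';   # Q form of the same text

# 'MIME-Header' accepts both B and Q on decode and returns
# a Perl character string.
my $text   = decode('MIME-Header', $b_header);
my $from_q = decode('MIME-Header', $q_header);

# Encode the characters back into a header value, then decode again.
my $encoded    = encode('MIME-Header', $text);
my $round_trip = decode('MIME-Header', $encoded);

binmode(STDOUT, ':encoding(UTF-8)');
print "decoded (B): $text\n";
print "decoded (Q): $from_q\n";
print "re-encoded:  $encoded\n";
```

Since the asker is already using Email::MIME, it may be worth checking whether that module exposes an already-decoded header value before decoding by hand.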