Python decoding from ISO-8859-5


Question

When I parse my email messages with Python's email.parser.Parser, I get a lot of strings like this:

=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?=

How can I decode this to UTF-8 using Python?

Recommended answer

Your input is quoted-printable encoded text. You can use the quopri module to handle that:

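import quopri

incode = '=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
inencoding = incode[2:12]  # charset name: 'ISO-8859-5'
intext = incode[15:-2]     # payload between '?Q?' and the closing '?='
# header=True decodes '_' as a space, as the header "Q" encoding requires
result = quopri.decodestring(intext, header=True).decode(inencoding)

The result will be the following (the final underscore decodes to a trailing space):

'Реестр Платежей '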
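The fixed slice indices above only fit a charset name that is exactly ten characters long. As a rough sketch, the same manual approach can split the token on its `?` separators instead. The helper name `decode_encoded_word` is hypothetical, and the sketch assumes Python 3 and a single `=?charset?encoding?payload?=` token with nothing around it:

```python
import base64
import quopri


def decode_encoded_word(word):
    """Decode one '=?charset?encoding?payload?=' token to a str (illustrative helper)."""
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not an encoded word: %r" % word)
    # Strip the '=?' and '?=' delimiters, then split into the three fields.
    charset, encoding, payload = word[2:-2].split("?", 2)
    if encoding.upper() == "Q":
        # header=True maps '_' to a space, as the header "Q" encoding requires.
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    elif encoding.upper() == "B":
        raw = base64.b64decode(payload)
    else:
        raise ValueError("unknown encoding: %r" % encoding)
    return raw.decode(charset)


incode = '=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
print(repr(decode_encoded_word(incode)))
```

This is only meant to show how the token is put together. It does not handle headers that mix plain text with several encoded words.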

Around the quoted-printable encoding there is also an email-header format, which specifies the character encoding in which the string should be interpreted once the quoted-printable decoding has been applied. The example code above extracts those parts by slicing the string "manually", but you can also do all of that in one step:

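from email.header import decode_header

# decode_header returns a list of (decoded_bytes, charset) pairs;
# this header contains exactly one encoded word, so the list has one pair
[(text, encoding)] = decode_header(incode)
result = text.decode(encoding)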

result will now again be the string given above.
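The one-pair unpacking above only works when the header consists of a single encoded word. Below is a minimal sketch, assuming Python 3, of a more tolerant variant that also produces the UTF-8 bytes the question asks for. The name `decode_mime_header` and the `default_charset` fallback are illustrative choices, not part of the original answer:

```python
from email.header import decode_header


def decode_mime_header(value, default_charset="ascii"):
    """Join every part returned by decode_header into one str (illustrative helper)."""
    parts = []
    for text, charset in decode_header(value):
        if isinstance(text, bytes):
            # Unencoded fragments can come back as bytes with charset None.
            text = text.decode(charset or default_charset, errors="replace")
        parts.append(text)
    return "".join(parts)


incode = '=?ISO-8859-5?Q?=C0=D5=D5=E1=E2=E0_=BF=DB=D0=E2=D5=D6=D5=D9_?='
subject = decode_mime_header(incode)  # a Python str (Unicode text)
utf8_bytes = subject.encode("utf-8")  # the same text as UTF-8 encoded bytes
print(repr(subject))
```

In Python 3 the decoded `str` is already Unicode, so the explicit `.encode("utf-8")` step is only needed when the bytes are written to a file or sent over a socket.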
