Python imaplib: Display non-ASCII characters correctly

Question

I am using Python 3.5 and imaplib to fetch an e-mail from GMail and print its body. The body contains non-ASCII characters, which come out 'encoded' in a strange way, and I cannot find out how to fix this.

import email
import imaplib

c = imaplib.IMAP4_SSL('imap.gmail.com')
c.login('example@gmail.com', 'password')

c.select('Inbox')
_, data = c.fetch(b'12345', '(RFC822)')

mail = data[0][1]
message = email.message_from_bytes(mail)
payload = message.get_payload()

body = payload[0].as_string()  # first part of the (multipart) message
print(body)

gives

>> ... Mit freundlichen Gr=C3=BC=C3=9Fen ...

instead of the expected

>> ... Mit freundlichen Grüßen ...

It looks to me like this is not an issue of encoding but one of conversion. But how do I tell Python to convert the characters correctly? Is there a more convenient library?

Recommended answer

The text is encoded with quoted-printable encoding, a way to represent non-ASCII bytes using only ASCII characters: each such byte is written as = followed by two hex digits, so ü (UTF-8 bytes C3 BC) becomes =C3=BC. You can decode it using Python's quopri module.

>>> import quopri
>>> bs = b'Gr=C3=BC=C3=9Fen'

>>> # Decode quoted-printable to raw bytes.
>>> utf8 = quopri.decodestring(bs)

>>> # Decode bytes to text.
>>> s = utf8.decode('utf-8')
>>> print(s)
Grüßen

You may find that quoted-printable is the value of the e-mail's Content-Transfer-Encoding header.
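
Because the transfer encoding is declared in that header, the email package can also undo it for you. The following is a minimal sketch, not part of the original answer: the raw message bytes are a hypothetical stand-in for what c.fetch() returns, decode_text_parts is an illustrative helper name, and the UTF-8 fallback for parts without a declared charset is an assumption.

```python
import email

# Hypothetical raw message, standing in for data[0][1] from c.fetch().
raw = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: text/plain; charset="utf-8"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Mit freundlichen Gr=C3=BC=C3=9Fen\r\n"
)


def decode_text_parts(message):
    """Return the decoded text of every text/* part in the message."""
    texts = []
    # walk() yields the message itself and, for multipart mail, each subpart.
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes the Content-Transfer-Encoding
        # (quoted-printable or base64) and returns raw bytes.
        payload = part.get_payload(decode=True)
        # Assumption: fall back to UTF-8 when the part declares no charset.
        charset = part.get_content_charset() or "utf-8"
        texts.append(payload.decode(charset, errors="replace"))
    return texts


message = email.message_from_bytes(raw)
bodies = decode_text_parts(message)
print(bodies[0].strip())
```

With the sample message above this should print "Mit freundlichen Grüßen". Calling get_payload(decode=True) per part, rather than as_string(), is what avoids the =C3=BC sequences, since as_string() reproduces the part in its transfer-encoded form.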
