Python: What is this encoding and how to decode?
Question
I have a lot of strings from mail bodies that print like this:
=C3=A9
which, for example, should be é.
What exactly is this encoding, and how do I decode it?
I'm using Python 3.5.
Edit:
I managed to get the body of the mail properly decoded by applying:
quopri.decodestring(sometext).decode('utf-8')
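For reference, `=C3=A9` is quoted-printable encoding: each `=XX` is one byte written in hexadecimal, and the bytes C3 A9 are the UTF-8 encoding of é. A minimal sketch of that two-step decode, using the sample string from the question (the variable names are only illustrative):

```python
import quopri

# Sample taken from the question: two quoted-printable escapes.
raw = b"=C3=A9"

# Step 1: quoted-printable -> raw bytes (b'\xc3\xa9').
decoded_bytes = quopri.decodestring(raw)

# Step 2: interpret those bytes as UTF-8 text.
decoded_text = decoded_bytes.decode("utf-8")
# decoded_text is now 'é'
```

This assumes the body really is UTF-8; a mail that declares another charset would need that charset in the second step.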
However, I still struggle to get the From, To, Subject, etc. headers right.
This is how I construct the e-mails:
import imaplib
import email
import quopri

mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login('mail@gmail.com', '*******')
mail.list()
mail.select('"[Gmail]/All Mail"')

typ, data = mail.search(None, 'SUBJECT', '"{}"'.format('123456'))
data[0].split()
print(data[0].split())

for e_mail in data[0].split():
    typ, data = mail.fetch('{}'.format(e_mail.decode()), '(RFC822)')
    raw_mail = data[0][1]
    email_message = email.message_from_bytes(raw_mail)
    if email_message.is_multipart():
        for part in email_message.walk():
            if part.get_content_type() == 'text/plain':
                body = part.get_payload()
    to = email_message['To']
    utf = quopri.decodestring(to)
    text = utf.decode('utf-8')
    print(text)
.
.
.
I still get this: =?UTF-8?B?UMOpdGVyIFBldMWRY3o=?=
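That value has the shape of an RFC 2047 "encoded word", which mail headers use for non-ASCII text: `=?charset?encoding?payload?=`, where `B` means Base64 and `Q` means a quoted-printable variant. This explains why `quopri` leaves it untouched. A rough sketch that takes the sample apart by hand, purely to show the structure (it handles only a single Base64 encoded word, and the variable names are illustrative):

```python
import base64

encoded_word = "=?UTF-8?B?UMOpdGVyIFBldMWRY3o=?="

# Drop the leading "=?" and trailing "?=", then split into the three fields.
charset, encoding, payload = encoded_word[2:-2].split("?")

# Only the Base64 ("B") form is handled in this sketch.
if encoding.upper() != "B":
    raise ValueError("this sketch only handles Base64 encoded words")

# Base64 -> bytes, then bytes -> text using the declared charset.
decoded_name = base64.b64decode(payload).decode(charset)
# decoded_name is now 'Péter Petőcz'
```

In real code the standard library's `email.header.decode_header` does this parsing and also copes with the `Q` form and with headers that mix encoded and plain parts.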
Recommended answer
This solved it:
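from email.header import decode_header

def mail_header_decoder(self, header):
    # Written as a method (hence self); returns None when the header is missing.
    if header is not None:
        # decode_header() returns a list of (value, charset) pairs.
        mail_header_decoded = decode_header(header)
        charsets = [charset for _, charset in mail_header_decoded]
        if all(charset is None for charset in charsets):
            # No encoded words in the header: return it unchanged.
            return header
        else:
            header_new = []
            for value, charset in mail_header_decoded:
                # Decode each part with its declared charset; parts without
                # one are plain ASCII, which UTF-8 decodes as well.
                header_new.append(value.decode(charset or 'utf-8'))
            return ''.join(header_new)  # convert list to string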
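A shorter variant is possible with `make_header`, also from `email.header`, which rebuilds a `Header` object from the pairs that `decode_header` returns; calling `str()` on it yields the readable text. The sketch below is an alternative, not the answer's own code, and `decode_mail_header` is a hypothetical name chosen for illustration:

```python
from email.header import decode_header, make_header

def decode_mail_header(header):
    """Hypothetical helper: return a mail header value as readable text."""
    if header is None:
        # Missing header (e.g. message['To'] returned None).
        return None
    # decode_header() splits the value into (value, charset) pairs;
    # make_header() reassembles them, and str() gives the Unicode text.
    return str(make_header(decode_header(header)))

# Sample header from the question.
decoded_to = decode_mail_header("=?UTF-8?B?UMOpdGVyIFBldMWRY3o=?=")
# decoded_to is now 'Péter Petőcz'

# A header without encoded words comes back unchanged.
plain_subject = decode_mail_header("Plain subject")
```

With the loop from the question, this would be called as `decode_mail_header(email_message['To'])` instead of passing the header through `quopri`.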