Reading attributes of .msg file
Question
I am trying to read a .msg file to get the sender, recipients, and subject.
I'm writing this script for my workplace, where I'm only allowed to use the libraries that ship with Python, so I want to use the email module to do this.
On the Python website I found some examples of using the email module: https://docs.python.org/3/library/email.examples.html
Near the end of the page it talks about getting the sender, subject and recipient. I've tried using that code like this:
# Import the email modules we'll need
from email import policy
from email.parser import BytesParser
with open('test_email.msg', 'rb') as fp:
    msg = BytesParser(policy=policy.default).parse(fp)
# Now the header items can be accessed as a dictionary, and any non-ASCII will
# be converted to unicode:
print('To:', msg['to'])
print('From:', msg['from'])
print('Subject:', msg['subject'])
This results in the output:
To: None
From: None
Subject: None
I checked the file test_email.msg; it is a valid email.
When I add the line
print(msg)
I get garbled output, the same as if I had opened the .msg file in Notepad.
Can anybody suggest why the email module isn't finding the sender/recipient/subject correctly?
Recommended answer
You are apparently attempting to read some sort of proprietary binary format. The Python email library does not support this; it only handles the traditional (basically text) RFC822 / RFC5322 format.
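For contrast, here is a minimal sketch of the kind of input the email module does handle. The message bytes and the addresses in them are made up for illustration; they are not taken from the question's file.

# A made-up RFC 5322 message: plain-text headers, a blank line, then the body.
from email import policy
from email.parser import BytesParser

raw = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Test message\r\n"
    b"\r\n"
    b"Hello Bob.\r\n"
)

# parsebytes() is the in-memory counterpart of parse(fp).
msg = BytesParser(policy=policy.default).parsebytes(raw)
print('To:', msg['to'])
print('From:', msg['from'])
print('Subject:', msg['subject'])

With text input like this the header lookups return values instead of None, which suggests the parsing code in the question is fine and the file format is the problem.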
To read Microsoft's OLE formats, you will need a third-party module, and some patience, voodoo, and luck.
Also, for the record, there is no unambiguous definition of .msg. Outlook uses this file extension for its files, but it is also used for files in other formats, including traditional RFC822 files.
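Since the extension alone is ambiguous, one way to tell the two cases apart with only the standard library is to look at the first bytes of the file. The sketch below assumes the file is either an OLE compound file (which starts with the signature D0 CF 11 E0 A1 B1 1A E1) or RFC822 text; the helper names are hypothetical, and this does not decode Outlook's format at all.

from email import policy
from email.parser import BytesParser

# Signature found at the start of an OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def is_ole_compound_file(data):
    """Return True if the bytes start with the OLE compound file signature."""
    return data.startswith(OLE_MAGIC)

def read_headers(data):
    """Hypothetical helper: return (to, from, subject), or None for OLE data.

    Anything that is not OLE is handed to the email parser; for input that
    is neither OLE nor RFC822 text the header values will simply be None.
    """
    if is_ole_compound_file(data):
        return None
    msg = BytesParser(policy=policy.default).parsebytes(data)
    return msg['to'], msg['from'], msg['subject']

To apply this to the file from the question, read it in binary mode and pass the bytes to read_headers. A None result means the file is in the binary Outlook format and the email module cannot parse it.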
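(The second link attempts to point to the MS-OXMSG specification on MSDN; but Microsoft has in the past treated URLs as some sort of consumable resource which gets used up when you use it, so the link might stop working if enough people click on it.)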