Parsing email with Python


Problem description

I'm writing a Python script to process emails returned from Procmail. As suggested in this question, I'm using the following Procmail config:

:0:
|$HOME/process_mail.py
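
For reference, a script invoked by a recipe like this gets the raw message on standard input. Below is a minimal sketch of reading it, assuming Python 3. `read_message` and `summarize` are hypothetical helper names, not part of the original script, and the `io.BytesIO` sample only stands in for what Procmail would pipe in:

```python
import io
from email import message_from_binary_file, policy


def read_message(stream):
    """Parse one message from a binary file-like object (hypothetical helper)."""
    return message_from_binary_file(stream, policy=policy.default)


def summarize(msg):
    """Return the three headers of interest; a missing header maps to None."""
    return {name: msg[name] for name in ("From", "To", "Subject")}


# Stand-in for the piped message (an assumption for illustration only).
# A real script would call read_message(sys.stdin.buffer) instead.
sample = io.BytesIO(
    b"From: Full Name <username@sender.com>\n"
    b"To: username@domain.com\n"
    b"Subject: TEST 12\n"
    b"\n"
    b"ONE\nTWO\nTHREE\n"
)
fields = summarize(read_message(sample))
print(fields["Subject"])
```

Reading bytes rather than text leaves charset decoding to the `email` package, which seems the safer choice when the incoming encoding is not known in advance.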

My process_mail.py script is receiving an email via stdin like this:

From hostname Tue Jun 15 21:43:30 2010
Received: (qmail 8580 invoked from network); 15 Jun 2010 21:43:22 -0400
Received: from mail-fx0-f44.google.com (209.85.161.44)
by ip-73-187-35-131.ip.secureserver.net with SMTP; 15 Jun 2010 21:43:22 -0400
Received: by fxm19 with SMTP id 19so170709fxm.3
for <username@domain.com>; Tue, 15 Jun 2010 18:47:33 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.103.84.1 with SMTP id m1mr2774225mul.26.1276652853684; Tue, 15
Jun 2010 18:47:33 -0700 (PDT)
Received: by 10.123.143.4 with HTTP; Tue, 15 Jun 2010 18:47:33 -0700 (PDT)
Date: Tue, 15 Jun 2010 20:47:33 -0500
Message-ID: <AANLkTikFsIjJ3KYW1HJWcAqQlGXNiXE2YMzrj39I0tdB@mail.gmail.com>
Subject: TEST 12
From: Full Name <username@sender.com>
To: username@domain.com
Content-Type: text/plain; charset=ISO-8859-1

ONE
TWO
THREE

I'm trying to parse the message in this way:

>>> import email
>>> msg = email.message_from_string(full_message)

I want to get message fields like 'From', 'To' and 'Subject'. However, the message object does not contain any of these fields.

What am I doing wrong?

Solution

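You must ensure that the header lines are not accidentally broken (as they are above, though it's hard to say whether that was a copy-paste problem). A header that continues onto a second line is only valid if the continuation line starts with whitespace; when the parser hits a continuation line that starts at column 0, it stops reading headers and treats everything after it as the body. With an intact message such as: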

Received: (qmail 8580 invoked from network); 15 Jun 2010 21:43:22 -0400
Received: from mail-fx0-f44.google.com (209.85.161.44) by ip-73-187-35-131.ip.secureserver.net with SMTP; 15 Jun 2010 21:43:22 -0400
Received: by fxm19 with SMTP id 19so170709fxm.3 for <username@domain.com>; Tue, 15 Jun 2010 18:47:33 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.103.84.1 with SMTP id m1mr2774225mul.26.1276652853684; Tue, 15 Jun 2010 18:47:33 -0700 (PDT)
Received: by 10.123.143.4 with HTTP; Tue, 15 Jun 2010 18:47:33 -0700 (PDT)
Date: Tue, 15 Jun 2010 20:47:33 -0500
Message-ID: <AANLkTikFsIjJ3KYW1HJWcAqQlGXNiXE2YMzrj39I0tdB@mail.gmail.com>
Subject: TEST 12
From: Full Name <username@sender.com>
To: username@domain.com
Content-Type: text/plain; charset=ISO-8859-1

ONE
TWO
THREE

then

import email
msg = email.message_from_string(msgtxt)
print(msg['Subject'])

prints TEST 12 as desired.
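
To see the effect of the broken lines directly, here is a small sketch that parses the same headers twice: once with the continuation line starting at column 0, and once with it indented. It is trimmed to a single Received header for brevity, and the names `HEADERS_TAIL`, `broken` and `folded` are made up for this illustration:

```python
import email

# Headers and body shared by both variants (shortened from the message above).
HEADERS_TAIL = (
    "Subject: TEST 12\n"
    "From: Full Name <username@sender.com>\n"
    "To: username@domain.com\n"
    "\n"
    "ONE\nTWO\nTHREE\n"
)

# Continuation line starts at column 0: the parser stops reading headers here.
broken = email.message_from_string(
    "Received: from mail-fx0-f44.google.com (209.85.161.44)\n"
    "by ip-73-187-35-131.ip.secureserver.net with SMTP; 15 Jun 2010 21:43:22 -0400\n"
    + HEADERS_TAIL
)

# Same header, but the continuation line is indented, so it is a folded header.
folded = email.message_from_string(
    "Received: from mail-fx0-f44.google.com (209.85.161.44)\n"
    "\tby ip-73-187-35-131.ip.secureserver.net with SMTP; 15 Jun 2010 21:43:22 -0400\n"
    + HEADERS_TAIL
)

print(broken["Subject"])  # None: "Subject" ended up inside the body
print(folded["Subject"])  # TEST 12
print(folded.keys())      # header names the parser actually recognized
```

Printing `msg.keys()` on the real message is a quick way to check where the parser stopped recognizing headers.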
