Reliable way to only get the email text, excluding previous emails


Question

I'm creating a basic system that allows users to reply to a thread on the website via email. However, most email clients include the text of the previous emails in their replies. This text is unwanted on the website.

Is there a reliable way to extract only the new message, without prior knowledge of the earlier emails? I'm using Python's email package.

Content-Type: text/plain; charset=ISO-8859-1

test message! This is the part I want.

On Thu, Mar 24, 2011 at 3:51 PM, <test@test.com> wrote:

> Hi!
>
> Herman just posted a comment on the website:
>
>
> From: Herman
> "Hi there! I might be interested"
>
>
> Regards,
> The Website Team
> http://www.test.com
>

This is a reply message from Gmail; I'm sure other clients format replies differently. A good start would probably be to ignore the lines that start with >, but such lines could also appear in between parts of the new message, in which case they should probably be kept. I would also still be left with the Content-Type line and the date line.

Recommended answer

The formatting of email replies depends on the client. There is no reliable way to extract the newest message without the risk of removing too much or not enough.

However, a common way to mark quotes is to prefix them with >, so lines starting with that character are likely to be quotes, especially if there are several of them at the very end or beginning of the email.
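The heuristic above can be sketched as follows. This is only an illustration under stated assumptions: the function name strip_trailing_quote is made up for this example, the input is assumed to be the already-decoded text/plain body, and only a quoted block at the very end is removed, so that > lines sitting in between new text are kept.

```python
def strip_trailing_quote(body: str) -> str:
    """Drop the run of '>'-prefixed (and blank) lines at the end of a reply.

    Hypothetical helper: quoted lines that appear between new text are
    deliberately left alone, since they may be part of the new message.
    """
    lines = body.splitlines()
    end = len(lines)
    # Walk backwards while the last remaining line is quoted or blank.
    while end > 0:
        last = lines[end - 1]
        if last.strip() and not last.lstrip().startswith(">"):
            break
        end -= 1
    return "\n".join(lines[:end])
```

A quoted block at the beginning of the email could be handled symmetrically by walking forwards from the first line. Either way, this can still remove too much, for example when the new message itself ends with a line the sender wrote starting with >.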

But the line "On Thu, Mar 24, 2011 at 3:51 PM, <test@test.com> wrote:" from your example is hard to handle. A line ending with a : right before a quote might indicate that it belongs to the quote, but you cannot know that for sure: it could also be part of the new message, with the colon being a mistyped period (on German keyboards, : is SHIFT+.).
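One way to reduce that ambiguity is to treat such a line as an attribution only when it both looks like the one in the example and is directly followed by a quoted block. The sketch below assumes the English Gmail-style wording "On ... wrote:" on a single line; the pattern and the name cut_at_attribution are illustrative, and other clients, languages, or wrapped attribution lines will not match.

```python
import re

# Assumed pattern: matches only single-line, English "On ... wrote:" headers.
ATTRIBUTION = re.compile(r"^On .+ wrote:\s*$")


def cut_at_attribution(body: str) -> str:
    """Cut the body at an attribution line that introduces a '>' quote."""
    lines = body.splitlines()
    for i, line in enumerate(lines):
        if not ATTRIBUTION.match(line):
            continue
        # Skip blank lines between the attribution and the quoted block.
        j = i + 1
        while j < len(lines) and not lines[j].strip():
            j += 1
        # Only cut if a quoted line actually follows.
        if j < len(lines) and lines[j].lstrip().startswith(">"):
            return "\n".join(lines[:i]).rstrip()
    # No attribution followed by a quote was found: leave the body untouched.
    return body
```

Requiring the quoted block to follow keeps a sentence such as "On Monday I wrote:" in the new message when it is followed by ordinary text, but it remains a guess rather than a guarantee.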
