Is it possible to programmatically 'clean' emails?
Question
Does anyone have any suggestions as to how I can clean the body of incoming emails? I want to strip out disclaimers, images and maybe any previous email text that may also be present, so that I am left with just the body text content. My guess is that it isn't going to be possible in any reliable way, but has anyone tried it? Are there any libraries geared towards this sort of thing?
Email has a couple of agreed-upon markers for the things you want to strip, and you can look for these lines using regular expressions. I doubt you can "sanitize" your emails really well, but here are some things you can look for:
- A line starting with "> " (a greater-than sign followed by whitespace) marks quoted text
- A line consisting of "-- " (two hyphens, then a space, then a line feed) marks the beginning of a signature; see "Signature block" on Wikipedia
- In multipart messages, boundary lines start with "--"; beyond that, you need to do some searching to separate the message body parts from unwanted parts (like base64-encoded images)
As for an actual C# implementation, I leave that to you or other SO users.
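For illustration only, here is a minimal sketch of the three heuristics above. It is written in Python rather than C#, and it uses the standard-library `email` package to split multipart messages instead of searching for boundaries by hand. The function names (`extract_plain_text`, `strip_quotes_and_signature`, `clean_email`) are made up for this example, and the quote pattern is slightly broader than "> " so that nested quotes like ">>" are caught too. Disclaimers have no marker in the list above, so they are not handled.

```python
import re
from email import message_from_string

SIGNATURE_DELIMITER = "-- "      # two hyphens and a space, alone on a line
QUOTE_LINE = re.compile(r"^>")   # assumption: any line starting with ">" is a quote


def extract_plain_text(raw_email: str) -> str:
    """Return the first text/plain part; images and attachments are skipped."""
    msg = message_from_string(raw_email)
    for part in msg.walk():  # walk() also yields the multipart containers
        if part.get_content_type() != "text/plain":
            continue
        if part.get_content_disposition() == "attachment":
            continue
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        charset = part.get_content_charset() or "utf-8"
        return payload.decode(charset, errors="replace")
    return ""


def strip_quotes_and_signature(body: str) -> str:
    """Drop quoted lines and everything from the signature delimiter onward."""
    kept = []
    for line in body.splitlines():
        if line == SIGNATURE_DELIMITER:
            break  # the rest of the message is the signature block
        if QUOTE_LINE.match(line):
            continue  # quoted text from a previous email
        kept.append(line)
    return "\n".join(kept).strip()


def clean_email(raw_email: str) -> str:
    """Hypothetical helper combining the two steps above."""
    return strip_quotes_and_signature(extract_plain_text(raw_email))
```

Calling `clean_email(raw)` on the full source of a message (headers included) returns whatever text survives these filters. Real mail is messier than this: some clients drop the trailing space in the signature delimiter, and many quote previous messages without any ">" prefix, so treat the result as a best effort.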