Bounced email parsing


Problem description

I'm currently wrestling with catching, parsing, and sorting bounced emails. I have the basics set up nicely and they do what I want. The problem is that there seems to be no standard format for the messages returned in a bounced email.

For example, some servers return an error code as specified by RFC 1893, and nine times out of ten I can pick that up with a simple regex. But sometimes a server just says that the email has bounced, with either no reason given or a reason worded entirely differently from any standard.
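For reference, here is a minimal sketch of that kind of regex in Python. The function name `find_status_code` is hypothetical, and the sketch assumes the bounce is already available as a plain string. It only looks for codes of the class.subject.detail form (for example 5.1.1), preferring a `Status:` line when one is present.

```python
import re

# A "Status:" line, as found in a machine-readable delivery report.
STATUS_LINE_RE = re.compile(
    r"^Status:\s*([245]\.\d{1,3}\.\d{1,3})",
    re.MULTILINE | re.IGNORECASE,
)

# Fallback: a code anywhere in the text. The lookarounds avoid matching
# a fragment of a longer dotted number such as an IP address.
ANY_CODE_RE = re.compile(r"(?<![\d.])([245]\.\d{1,3}\.\d{1,3})(?!\.?\d)")


def find_status_code(raw_bounce):
    """Return the first status code found in the bounce text, or None."""
    match = STATUS_LINE_RE.search(raw_bounce) or ANY_CODE_RE.search(raw_bounce)
    return match.group(1) if match else None
```

When this returns `None`, the bounce is one of the non-standard cases described above.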

So I guess my question is: has anyone got a solution to this? To be honest, I don't want to search the returned email for a billion and one possible strings. Still, it would be nice not to have to resort to 'reason unknown' or something similar.

Has anyone else had any luck with this, or any ideas? Cheers

Solution

This is not a definitive answer either, but in a similar spirit to Kyle's response, you could use a Bayesian/token-based spam filter to "learn" what bounce messages look like and then automatically route them to whatever you use to handle bounced mail.

In other words, you set up an account where you train SpamAssassin, spamprobe, or a similar tool that a bunch of different bounce messages (and only bounce messages) are "junk", then let that spam system act as a second line of filtering after whatever you've developed.

So, let's say your solution, the first filter, finds 90% of bounced messages. Your system does whatever it normally does with those bounces, then saves them to a bounce-messages mailbox, which SpamAssassin/spamprobe periodically scans to learn those messages as "junk".

You then also have SpamAssassin, spamprobe, or whatever run as a second filter (on anything yours doesn't flag as a bounce) to make its own estimate of bounced-ness. Whatever it considers "junk" (because you've trained it to think bounce = junk), you also route to your program.

This still requires a little manual review, but in theory it should get better over time as you rely on the spam system's learning to account for the edge cases.
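To make the two-stage idea concrete, here is a rough Python sketch. It is not how SpamAssassin or spamprobe work internally: the `TokenFilter` class is a tiny, simplified naive Bayes stand-in for the spam filter, and `route`, `rule_based_check`, the labels, and the 0.9 threshold are all hypothetical names and values chosen for illustration. It also assumes the filter has been shown some non-bounce mail, since it needs both kinds of examples to compare against.

```python
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Return the set of distinct lowercase tokens in the text."""
    return set(TOKEN_RE.findall(text.lower()))


class TokenFilter:
    """Simplified naive Bayes over token presence (stand-in for a spam filter)."""

    def __init__(self):
        self.token_counts = {"bounce": Counter(), "other": Counter()}
        self.message_counts = {"bounce": 0, "other": 0}

    def learn(self, text, label):
        """Record one example message under 'bounce' or 'other'."""
        self.message_counts[label] += 1
        self.token_counts[label].update(tokenize(text))

    def bounce_probability(self, text):
        """Estimate the probability that the text is a bounce message."""
        total = sum(self.message_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label in ("bounce", "other"):
            n = self.message_counts[label]
            # Laplace-smoothed prior and per-token likelihoods, in log space.
            score = math.log((n + 1) / (total + 2))
            for token in tokens:
                score += math.log((self.token_counts[label][token] + 1) / (n + 2))
            scores[label] = score
        # Clamp the difference so math.exp cannot overflow.
        diff = max(min(scores["other"] - scores["bounce"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


def route(raw_message, rule_based_check, token_filter, threshold=0.9):
    """First filter: your own rules. Second filter: the learned estimate."""
    if rule_based_check(raw_message):
        # Confirmed bounces double as training data for the second filter.
        token_filter.learn(raw_message, "bounce")
        return "bounce"
    if token_filter.bounce_probability(raw_message) >= threshold:
        return "probable-bounce"
    return "inbox"
```

Anything routed as "probable-bounce" is a natural candidate for the manual review mentioned above, and reviewed messages can be fed back through `learn` with the correct label.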
