Currently best spam filter algorithm
Question
What is currently the best method for detecting spam, especially in mobile text messages? Are there any resources or comparative analyses?
Recommended answer
It's worth looking into supervised learning techniques. There have been a number of studies in which the Multinomial Naive Bayes classifier was used for spam email filtering with a lot of success. If it works for spam email filtering, it should work for SMS filtering too. What you need is a large dataset of labeled example SMS texts (spam and non-spam) to train the classifier on.
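To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written in plain Python. The names `TOY_SMS`, `train_multinomial_nb` and `nb_predict` are made up for this illustration, the whitespace tokenizer is a simplifying assumption, and the four toy messages only stand in for the large labeled dataset you would actually need.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; a real filter needs thousands of labeled messages.
TOY_SMS = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("are we still on for lunch", "ham"),
    ("call me when you get home", "ham"),
]

def tokenize(text):
    # Assumption: lowercase + whitespace split is good enough for a sketch.
    return text.lower().split()

def train_multinomial_nb(examples):
    """Count messages per class and word occurrences per class."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"doc_counts": doc_counts, "word_counts": word_counts, "vocab": vocab}

def nb_predict(model, text):
    """Return the label with the highest log posterior for the message."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Training is a single pass that only counts words, which is why this kind of classifier is quick to train. Calling `nb_predict(train_multinomial_nb(TOY_SMS), "claim your free prize")` classifies a new message.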
It may also be helpful to look into the Support Vector Machine, which, although less widely used in spam filtering, is a much more powerful technique.
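For comparison, a linear SVM on bag-of-words counts can be sketched as sub-gradient descent on the L2-regularized hinge loss. This is only an illustration under simplifying assumptions: there is no bias term, the learning rate and regularization strength are arbitrary, and `SVM_TOY_SMS`, `train_linear_svm` and `svm_predict` are hypothetical names. In practice you would more likely reach for an existing SVM library.

```python
from collections import Counter

# Hypothetical toy data; labels are +1 for spam and -1 for ham.
SVM_TOY_SMS = [
    ("win a free prize now", 1),
    ("free cash claim your prize", 1),
    ("are we still on for lunch", -1),
    ("call me when you get home", -1),
]

def tokenize(text):
    # Assumption: lowercase + whitespace split is good enough for a sketch.
    return text.lower().split()

def train_linear_svm(examples, epochs=50, learning_rate=0.1, reg=0.01):
    """Learn one weight per word by minimizing the regularized hinge loss."""
    weights = {}
    for _ in range(epochs):
        for text, y in examples:
            counts = Counter(tokenize(text))
            margin = y * sum(weights.get(tok, 0.0) * c for tok, c in counts.items())
            # Gradient of the L2 penalty: shrink every weight a little.
            for tok in weights:
                weights[tok] *= 1.0 - learning_rate * reg
            # Hinge loss is active only when the example is inside the margin.
            if margin < 1.0:
                for tok, c in counts.items():
                    weights[tok] = weights.get(tok, 0.0) + learning_rate * y * c
    return weights

def svm_predict(weights, text):
    """Positive score means spam; zero or negative means ham."""
    counts = Counter(tokenize(text))
    score = sum(weights.get(tok, 0.0) * c for tok, c in counts.items())
    return "spam" if score > 0 else "ham"
```

Unlike Naive Bayes, which just counts words, this trainer makes repeated passes over the data, so training is typically slower.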
Also, training the algorithms on raw text alone may not be the best way forward. A 1998 study by Mehran Sahami and colleagues found that they achieved superior performance when they also took other heuristics into consideration (e.g. was the email sent to a mailing list? was it sent from a domain name ending in ".edu", ".com" or ".org"? did it contain multiple punctuation marks such as "!!!"? and so forth).
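As a rough illustration of what such heuristics could look like in code, the sketch below derives three hand-crafted features from an email and turns them into pseudo-tokens that can be appended to the word list fed to a classifier. The function names, feature names and the `mailing_lists` parameter are hypothetical and are not taken from the study itself.

```python
import re

def heuristic_features(sender, recipients, body, mailing_lists=()):
    """Derive a few non-textual features from an email (illustrative only)."""
    lists = {addr.lower() for addr in mailing_lists}
    domain = sender.rsplit("@", 1)[-1].lower()
    tld = domain.rsplit(".", 1)[-1] if "." in domain else ""
    return {
        # Was the message addressed to a known mailing list?
        "to_mailing_list": any(r.lower() in lists for r in recipients),
        # Top-level domain of the sender, bucketed into a few known values.
        "sender_tld": tld if tld in ("edu", "com", "org") else "other",
        # Runs of two or more '!' or '?' characters, e.g. "!!!".
        "repeated_punctuation": re.search(r"[!?]{2,}", body) is not None,
    }

def feature_tokens(features):
    """Encode features as pseudo-words so a bag-of-words model can use them."""
    return [f"__{name}={value}__" for name, value in sorted(features.items())]
```

For example, `feature_tokens(heuristic_features("promo@deals.com", ["me@example.org"], "You WON!!! Claim now"))` yields tokens that can be concatenated with the ordinary word tokens before training. For SMS you would need different heuristics, since text messages have no sender domain or mailing list.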
But start off with the Multinomial Naive Bayes classifier. It's very simple to implement and easy to use, and, from personal experience, it also has a very short training time.