Detecting language of email body
Question
I need to implement an automated email reply system.
For this system, I need to check incoming emails and reply to each one in the same language in which it was received.
How can I do this? Please suggest some ideas. Thanks in advance.
One more follow-up question:
The email headers also include a header of this kind:
Content-Type: text/plain; charset=ISO-8859-1
How reliable is this header for determining the language of the email body?
e.g. (all headers taken from Gmail):
for Chinese subject and body: Content-Type: text/plain; charset=GB2312
for Korean subject and body: Content-Type: text/plain; charset=EUC-KR
for French/Italian subject and body: Content-Type: text/html; charset=ISO-8859-1
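The headers listed above can be read programmatically. Here is a minimal Python sketch using the standard library email module. The CHARSET_HINTS table and the charset_language_hints function are hypothetical names introduced only for illustration, and the table is an assumption, not an authoritative mapping. It shows that the charset only narrows the candidates: GB2312 and EUC-KR each suggest one language, while ISO-8859-1 is shared by French, Italian, and other Western European languages.

```python
from email import message_from_string

# Hypothetical, incomplete mapping from charset to candidate languages.
# This table is an assumption made for illustration only.
CHARSET_HINTS = {
    "gb2312": ["zh"],
    "euc-kr": ["ko"],
    "iso-8859-1": ["fr", "it", "en", "de", "es"],
}


def charset_language_hints(raw_email):
    """Return candidate language codes suggested by the Content-Type charset."""
    msg = message_from_string(raw_email)
    # get_content_charset() returns the charset in lower case, or None
    # when the message has no Content-Type header or no charset parameter.
    charset = msg.get_content_charset()
    if charset is None:
        return []
    return CHARSET_HINTS.get(charset, [])
```

A result with more than one entry means the header alone cannot decide the language, so the body text itself still has to be examined.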
Can someone guide me on this as well? Thanks in advance.
Recommended answer
Google Translate can guess the language of a sample text. Have a look at its API; it could solve your problem, provided you are connected to the internet anyway and don't mind sending fragments of emails to Google's servers.
For offline evaluation, I found the Java Text Categorizing Library.
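To illustrate what offline detection can look like, here is a minimal Python sketch that counts common stopwords per language. This is a deliberately simplified stand-in, not the method used by the Java Text Categorizing Library or by Google Translate. The STOPWORDS table and the guess_language function are hypothetical names, and the word lists are small assumptions chosen for the example.

```python
import re

# Tiny, hand-picked stopword lists; an assumption for illustration only.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "for", "you", "with"},
    "fr": {"le", "la", "les", "et", "est", "de", "des", "un", "une", "pour", "vous", "dans"},
    "it": {"il", "la", "che", "di", "e", "è", "un", "una", "per", "non", "sono", "con"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often, or None."""
    # Extract runs of letters (Unicode-aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in stopwords)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No stopword matched at all: report that no guess is possible.
    return best if scores[best] > 0 else None
```

For example, guess_language("Merci pour votre message, je vous réponds dans les plus brefs délais.") returns "fr". A heuristic like this only works for languages with a word list and can fail on very short texts, so a dedicated library is the safer choice for a production reply system.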