Detect language of text
Question
Is there any C# library that can detect the language of a piece of text? For example, for the input "This is a sentence" it should detect the language as "English", and for "Esto es una sentencia" it should detect the language as "Spanish".
I understand that language detection from text is not a deterministic problem. But both Google Translate and Bing Translator have an "Auto detect" option that makes a best guess at the input language. Is there something similar publicly available, preferably in C#?
Recommended answer
Yes, TextCat is very good for language identification, and it has implementations in many programming languages.
There was no .NET port, so I wrote one: NTextCat.codeplex.com.
It is a pure .NET Framework DLL plus a command-line interface to it. It is fully compatible with the 74 language models from TextCat, so it can detect languages out of the box.
Any feedback is much appreciated! New ideas and feature requests are welcome too :)
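For readers curious how this kind of detector works: TextCat-style identifiers are usually described as comparing ranked character n-gram profiles (the Cavnar and Trenkle "out-of-place" measure). The following is only a minimal sketch of that idea, written in Python to keep it self-contained. It is not NTextCat's API; the function names, the `top_k` cutoff, and the two tiny training samples are assumptions made for illustration, whereas real language models are built from far larger corpora.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k character n-grams (lengths 1..max_n), most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # underscores mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending count; break ties alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def detect(text, profiles):
    """Return the language whose profile is closest to the profile of the text."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training samples (assumed, far too small for real use).
SAMPLES = {
    "English": "the quick brown fox jumps over the lazy dog and this is "
               "a simple sentence that is written in the english language",
    "Spanish": "el rapido zorro marron salta sobre el perro perezoso y esta es "
               "una frase sencilla que esta escrita en el idioma espanol",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    for sentence in ("This is a sentence", "Esto es una sentencia"):
        print(sentence, "->", detect(sentence, PROFILES))
```

The result is always a best guess: the profile with the smallest distance wins, so very short inputs or closely related languages can be misidentified, which matches the non-deterministic nature of the problem noted in the question.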