Detect language of text

Question

Is there a C# library that can detect the language of a piece of text? For example, given the input "This is a sentence", it should report the language as "English", and given "Esto es una sentencia" it should report "Spanish".

I understand that detecting the language of a text is not a deterministic problem. Still, both Google Translate and Bing Translator offer an "Auto detect" option that makes a best guess at the input language. Is anything similar publicly available, preferably in C#?

Recommended answer

Yes, TextCat works very well for language identification, and it has implementations in many programming languages.

There was no .NET port, so I wrote one: NTextCat (NTextCat.codeplex.com).

It is a pure .NET Framework DLL with a command-line interface on top of it. It is fully compatible with the 74 language models from TextCat, so it can detect languages out of the box.
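For readers curious what a "language model" means here: TextCat is based on the character n-gram ranking approach of Cavnar and Trenkle, where each language is represented by a list of its most frequent n-grams and a text is assigned to the language whose list is closest in rank order. The following is only a minimal sketch of that idea, written in Python for brevity. It is not NTextCat's API; the function names, the `top_k` cutoff, and the toy training sentences are all assumptions made for illustration.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the text's character n-grams (n = 1..max_n), most frequent first."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # underscores mark word boundaries
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Descending count, then alphabetical order so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles.

    An n-gram missing from the language profile costs the maximum
    penalty, taken here to be the length of that profile.
    """
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def detect_language(text, profiles):
    """Return the name of the profile with the smallest out-of-place distance."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Hypothetical toy training data; real models are built from large corpora.
SAMPLES = {
    "English": "this is a sentence and the quick brown fox jumps over the lazy dog with the others",
    "Spanish": "esto es una frase y el rápido zorro marrón salta sobre el perro perezoso con los otros",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    for sentence in ("This is a sentence", "Esto es una sentencia"):
        print(sentence, "->", detect_language(sentence, PROFILES))
```

With training samples this small, guesses on unseen sentences are not reliable. Profiles built from much larger text collections, such as the prebuilt TextCat models mentioned above, are what make the approach usable in practice.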

Any feedback is much appreciated! New ideas and feature requests are welcome too :)
