如何检测一个字符串的语言? [英] How to detect the language of a string?

查看:160
本文介绍了如何检测一个字符串的语言?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

什么是检测一个字符串的语言的最佳方式?

What's the best way to detect the language of a string?

推荐答案

如果您code的情况下能上网,你可以尝试使用谷歌API的语言检测。
<一href=\"http://$c$c.google.com/apis/ajaxlanguage/documentation/\">http://$c$c.google.com/apis/ajaxlanguage/documentation/

If the context of your code have internet access, you can try to use the Google API for language detection. http://code.google.com/apis/ajaxlanguage/documentation/

var text = "¿Dónde está el baño?";
google.language.detect(text, function(result) {
  if (!result.error) {
    var language = 'unknown';
    for (l in google.language.Languages) {
      if (google.language.Languages[l] == result.language) {
        language = l;
        break;
      }
    }
    var container = document.getElementById("detection");
    container.innerHTML = text + " is: " + language + "";
  }
});

此外,由于使用的是C#,看看<一个href=\"http://www.esotericdelights.com/post/2008/11/Calling-Google-Ajax-Language-API-for-Translation-and-Language-Detection-from-C.aspx\">this如何从C#调用API文章。

更新:
那C#链接走了,这里是它的核心的缓存副本:

UPDATE: That c# link is gone, here's a cached copy of the core of it:

string s = TextBoxTranslateEnglishToHebrew.Text;
string key = "YOUR GOOGLE AJAX API KEY";
GoogleLangaugeDetector detector =
   new GoogleLangaugeDetector(s, VERSION.ONE_POINT_ZERO, key);

GoogleTranslator gTranslator = new GoogleTranslator(s, VERSION.ONE_POINT_ZERO,
   detector.LanguageDetected.Equals("iw") ? LANGUAGE.HEBREW : LANGUAGE.ENGLISH,
   detector.LanguageDetected.Equals("iw") ? LANGUAGE.ENGLISH : LANGUAGE.HEBREW,
   key);

TextBoxTranslation.Text = gTranslator.Translation;


基本上,你需要创建一个URI,并将其发送给谷歌,看起来像:


Basically, you need to create a URI and send it to Google that looks like:

<一个href=\"http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello%20worled&langpair=en%7ciw&key=your_google_api_key_goes_here\">http://ajax.googleapis.com/ajax/services/language/translate?v=1.0&q=hello%20worled&langpair=en%7ciw&key=your_google_api_key_goes_here

这告诉你想翻译的Hello World从英语到希伯来语的API,谷歌的JSON响应看起来到这样的:

This tells the API that you want to translate "hello world" from English to Hebrew, to which Google's JSON response would look like:

{"responseData": {"translatedText":"שלום העולם"}, "responseDetails": null, "responseStatus": 200}

我选择做一个基类,再presents一个典型的谷歌JSON响应:

I chose to make a base class that represents a typical Google JSON response:

[Serializable]
public class JSONResponse
{
   public string responseDetails = null;
   public string responseStatus = null;
}

然后,从这个类继承的对象翻译:

Then, a Translation object that inherits from this class:

[Serializable]
public class Translation: JSONResponse
{
   public TranslationResponseData responseData = 
    new TranslationResponseData();
}

本翻译类有一个TranslationResponseData对象,看起来像这样:

This Translation class has a TranslationResponseData object that looks like this:

[Serializable]
public class TranslationResponseData
{
   public string translatedText;
}

最后,我们可以使GoogleTranslator类:

Finally, we can make the GoogleTranslator class:

using System;
using System.Collections.Generic;
using System.Text;

using System.Web;
using System.Net;
using System.IO;
using System.Runtime.Serialization.Json;

namespace GoogleTranslationAPI
{

   public class GoogleTranslator
   {
      private string _q = "";
      private string _v = "";
      private string _key = "";
      private string _langPair = "";
      private string _requestUrl = "";
      private string _translation = "";

      public GoogleTranslator(string queryTerm, VERSION version, LANGUAGE languageFrom,
         LANGUAGE languageTo, string key)
      {
         _q = HttpUtility.UrlPathEncode(queryTerm);
         _v = HttpUtility.UrlEncode(EnumStringUtil.GetStringValue(version));
         _langPair =
            HttpUtility.UrlEncode(EnumStringUtil.GetStringValue(languageFrom) +
            "|" + EnumStringUtil.GetStringValue(languageTo));
         _key = HttpUtility.UrlEncode(key);

         string encodedRequestUrlFragment =
            string.Format("?v={0}&q={1}&langpair={2}&key={3}",
            _v, _q, _langPair, _key);

         _requestUrl = EnumStringUtil.GetStringValue(BASEURL.TRANSLATE) + encodedRequestUrlFragment;

         GetTranslation();
      }

      public string Translation
      {
         get { return _translation; }
         private set { _translation = value; }
      }

      private void GetTranslation()
      {
         try
         {
            WebRequest request = WebRequest.Create(_requestUrl);
            WebResponse response = request.GetResponse();

            StreamReader reader = new StreamReader(response.GetResponseStream());
            string json = reader.ReadLine();
            using (MemoryStream ms = new MemoryStream(Encoding.Unicode.GetBytes(json)))
            {
               DataContractJsonSerializer ser =
                  new DataContractJsonSerializer(typeof(Translation));
               Translation translation = ser.ReadObject(ms) as Translation;

               _translation = translation.responseData.translatedText;
            }
         }
         catch (Exception) { }
      }
   }
}

这篇关于如何检测一个字符串的语言?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆