Implementing the Google Transliterate API using REST and C#, facing Unicode and parsing issues

Question

I have been trying to use the Google Transliterate API through its RESTful interface, since that is easy to do from a server-side language (C# here).

So, I came across this URL format: http://www.google.com/transliterate/indic?tlqt=1&langpair=en|hi&text=bharat%2Cindia&tl_app=3 which returns the JSON in the format:

[
{
"ew" : "bharat",
"hws" : [
"भारत","भरत","भरात","भारात","बहरत",
]
},
{
"ew" : "india",
"hws" : [
"इंडिया","इन्डिया","इण्डिया","ईन्डिया","इनडिया",
]
},
] 
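The request URL quoted above can be assembled in code rather than typed by hand. The following is a minimal sketch, assuming the endpoint and query parameters behave as described in the question (this is not verified here); `TransliterateUrl` and `Build` are hypothetical names introduced only for illustration.

```csharp
using System;

public static class TransliterateUrl
{
    // Endpoint copied from the URL in the question; whether it still responds is not checked here.
    private const string Endpoint = "http://www.google.com/transliterate/indic";

    // Hypothetical helper: joins the words with commas and percent-encodes each query value.
    public static string Build(string langPair, params string[] words)
    {
        string text = string.Join(",", words);
        return Endpoint
            + "?tlqt=1"
            + "&langpair=" + Uri.EscapeDataString(langPair)
            + "&text=" + Uri.EscapeDataString(text)
            + "&tl_app=3";
    }
}
```

Calling `TransliterateUrl.Build("en|hi", "bharat", "india")` encodes the comma as `%2C`, as in the quoted URL. It also encodes the `|` in `langpair` as `%7C`, which differs from the raw `|` used in the question.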

I tried HttpWebRequest and HttpWebResponse to get the JSON, but the values it returned show up in the web browser as Unicode escape sequences, such as:

[ { "ew" : "bharat", "hws" : [ "\u092D\u093E\u0930\u0924","\u092D\u0930\u0924","\u092D\u0930\u093E\u0924","\u092D\u093E\u0930\u093E\u0924","\u092C\u0939\u0930\u0924", ] }, { "ew" : "india", "hws" : [ "\u0907\u0902\u0921\u093F\u092F\u093E","\u0907\u0928\u094D\u0921\u093F\u092F\u093E","\u0907\u0923\u094D\u0921\u093F\u092F\u093E","\u0908\u0928\u094D\u0921\u093F\u092F\u093E","\u0907\u0928\u0921\u093F\u092F\u093E", ] }, ]

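So, I applied the approach from this article (http://stackoverflow.com/questions/1615559/converting-unicode-strings-to-escaped-ascii-string), passed the JSON string through it, and it returned: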

[ { "ew" : "bharat", "hws" : [ "भारत","भरत","भरात","भारात","बहरत", ] }, { "ew" : "india", "hws" : [ "इंडिया","इन्डिया","इण्डिया","ईन्डिया","इनडिया", ] }, ]

FIRST QUESTION: Am I doing it right so far? The browser does NOT show the last "]", although that "]" exists in the HTML source (I am not sure why that happens). Also, when I try to parse it using the following (I might be wrong to use this technique):

var jss = new JavaScriptSerializer();
var dict = jss.Deserialize<Dictionary<string, dynamic>>(the_JSON_string);

It gives me an error saying:

Invalid array passed in, extra trailing ','.
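The message points at the commas that precede each closing `]` in the response, which strict JSON does not allow. One possible workaround, shown here as a minimal sketch rather than a definitive fix, is to strip those commas before handing the string to a strict parser. `JsonCleanup` and `StripTrailingCommas` are hypothetical names, and the pattern assumes no string value in the payload itself contains a comma followed by a closing bracket.

```csharp
using System.Text.RegularExpressions;

public static class JsonCleanup
{
    // Drops a comma that is followed only by whitespace and a closing ] or }.
    // Assumption: no quoted string in the payload contains such a sequence,
    // because the pattern does not distinguish text inside quotes.
    public static string StripTrailingCommas(string json)
    {
        return Regex.Replace(json, @",(\s*[\]}])", "$1");
    }
}
```

The cleaned string can then be passed to `JavaScriptSerializer.Deserialize`. Since the top-level value is an array, a target type such as `List<Dictionary<string, object>>` is probably a better fit than `Dictionary<string, dynamic>`.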

SECOND QUESTION: If I am doing it right so far, can I get some help parsing the Hindi words? What approach should I take, preferably using System.Web.Script.Serialization? Eventually I want to grab the Hindi text for further processing.

Please help, thanks.

Recommended answer

I would recommend Json.Net for parsing JSON strings. The code below (with your sample string) works, and you don't need to do anything to unescape those characters; JSON parsers handle that for you.

string json = @"[ { ""ew"" : ""bharat"", ""hws"" : [ ""\u092D\u093E\u0930\u0924"",""\u092D\u0930\u0924"",""\u092D\u0930\u093E\u0924"",""\u092D\u093E\u0930\u093E\u0924"",""\u092C\u0939\u0930\u0924"", ] }, { ""ew"" : ""india"", ""hws"" : [ ""\u0907\u0902\u0921\u093F\u092F\u093E"",""\u0907\u0928\u094D\u0921\u093F\u092F\u093E"",""\u0907\u0923\u094D\u0921\u093F\u092F\u093E"",""\u0908\u0928\u094D\u0921\u093F\u092F\u093E"",""\u0907\u0928\u0921\u093F\u092F\u093E"", ] }, ]";

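// Requires the Json.NET package and `using Newtonsoft.Json;`.
dynamic obj = JsonConvert.DeserializeObject(json);
// Index into the first entry ("bharat") and take its first Hindi candidate.
MessageBox.Show(obj[0].hws[0].ToString());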
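If a typed result is preferred over `dynamic`, the same idea can be sketched with a small class per array element. The example below swaps in `System.Text.Json` (built into current .NET) instead of Json.NET, purely so that it runs without an extra package; it is an alternative, not the method the answer uses. `Entry`, `Source`, `Candidates`, and `TransliterationParser` are illustrative names; only `ew` and `hws` come from the response.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Mirrors one element of the response array.
public sealed class Entry
{
    [JsonPropertyName("ew")]
    public string Source { get; set; } = "";

    [JsonPropertyName("hws")]
    public List<string> Candidates { get; set; } = new List<string>();
}

public static class TransliterationParser
{
    public static List<Entry> Parse(string json)
    {
        // AllowTrailingCommas lets the parser accept the ", ]" sequences in the response.
        var options = new JsonSerializerOptions { AllowTrailingCommas = true };
        // The parser decodes \uXXXX escapes itself, so no manual unescaping is needed.
        return JsonSerializer.Deserialize<List<Entry>>(json, options) ?? new List<Entry>();
    }
}
```

After `var entries = TransliterationParser.Parse(json);`, each `entries[i].Candidates` holds the Hindi suggestions for `entries[i].Source` as ordinary .NET strings, ready for further processing.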
