C#: transforming a Facebook response into a properly encoded string

Views: 128 · Published: 2017/8/17 2:19:56 · Tags: facebook, facebook-graph-api, encoding, utf-8


Problem description

I am using a regular StreamReader to read the response from the Facebook Graph API:
https://graph.facebook.com/XXXX?access_token=&fields=id,name,about,address,last_name

I read the response stream, but it returns:
{"id":"XXXXX","name":"K\u0131r\u0131nt\u0131 Reklam"...}

My code is below. I tried explicitly using the UTF-8 and "iso-8859-9" (Turkish) encodings and setting Accept-Charset headers, without success. I read Joel's famous article about encodings. It looks like each of the characters '\', 'u', '0', '1', '3', '1' is arriving from Facebook as a literal character; I thought this would have been 2 bytes for value 131 in UTF-8. I am confused. I expect this string to be "Kırıntı Reklam".

I could simply find and replace those strings, but that would be far from elegant or maintainable. How should I properly process or convert the Facebook Graph API response for strings with accents?

    // request is an already-configured WebRequest; responseFromServer is a string declared earlier.
    using (WebResponse response = request.GetResponse())
    {
        using (Stream dataStream = response.GetResponseStream())
        {
            if (dataStream != null)
            {
                // StreamReader decodes the bytes as UTF-8 by default.
                using (StreamReader reader = new StreamReader(dataStream))
                {
                    responseFromServer = reader.ReadToEnd();
                }
            }
        }
    }

Thank you in advance.
Solution
tl;dr: use a JSON library (I like Json.NET) and don't worry about it.

The JSON shown is valid JSON, where \uABCD in a JSON string represents a UTF-16 code unit¹ written as four hexadecimal digits. So \u0131 is U+0131 (the dotless ı), not the decimal value 131. The internal JSON character escaping format is useful to avoid having to deal with Unicode stream encoding issues: it allows JSON to be represented entirely in ASCII/7-bit-clean characters (which is a subset of UTF-8).
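
As a rough illustration of the difference between the escape and the raw character, the sketch below compares the two. The class and method names (`EscapeVersusBytes`, `Utf8Hex`) are made up for this example.

```csharp
using System;
using System.Text;

public static class EscapeVersusBytes
{
    // Hypothetical helper: hex dump of the UTF-8 bytes of a string.
    public static string Utf8Hex(string s) =>
        BitConverter.ToString(Encoding.UTF8.GetBytes(s));

    public static void Main()
    {
        // The actual character U+0131; the digits in the escape are hexadecimal.
        string dotlessI = "\u0131";
        Console.WriteLine((int)dotlessI[0]);   // 305 (0x0131), not 131
        Console.WriteLine(Utf8Hex(dotlessI));  // C4-B1: two bytes in UTF-8

        // The JSON escape for the same character: six plain ASCII characters.
        string escape = @"\u0131";
        Console.WriteLine(escape.Length);      // 6
        Console.WriteLine(Utf8Hex(escape));    // 5C-75-30-31-33-31
    }
}
```

The six ASCII characters are what travels over the wire, which is why they show up literally when the response is read as text.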

Parsing the JSON with a conforming JSON library turns it into an appropriate object graph in which the string values are properly decoded. The library is responsible for understanding JSON and reading it correctly, which includes handling any such \u escape sequences.
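
A minimal sketch of that idea follows. It uses System.Text.Json (which ships with .NET Core 3.0 and later) as a stand-in so that it runs without an extra package; with Json.NET, `JObject.Parse(json)["name"]` plays the same role. The sample body is modeled on the response in the question, and `ParseGraphResponse` and `ReadName` are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class ParseGraphResponse
{
    // Hypothetical helper: returns the decoded "name" property of a JSON object.
    public static string ReadName(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // GetString() resolves \uXXXX escapes into real characters.
            return doc.RootElement.GetProperty("name").GetString();
        }
    }

    public static void Main()
    {
        // The doubled backslashes put a literal \u0131 into the JSON text.
        string responseFromServer =
            "{\"id\":\"XXXXX\",\"name\":\"K\\u0131r\\u0131nt\\u0131 Reklam\"}";

        string name = ReadName(responseFromServer);

        // Compare against the expected string, written with C# escapes.
        Console.WriteLine(name == "K\u0131r\u0131nt\u0131 Reklam"); // True
    }
}
```

No manual find-and-replace is needed: the decoding happens as part of parsing.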

The stream itself (the JSON text) should be decoded with the encoding that the server declares, that is indicated by a BOM, or that has been pre-negotiated; in practice, that is just UTF-8 here. This determines how the JSON text is encoded and has no bearing on the escape sequences found inside JSON strings.
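
To make that separation concrete, here is a small sketch that stands in for the response stream with a `MemoryStream`; `ReadJsonText` and `ReadAll` are hypothetical names. Because the escaped JSON text is pure ASCII, the reader encoding makes no difference to the result, which is consistent with switching encodings not helping in the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReadJsonText
{
    // Hypothetical helper: decodes a byte buffer the way StreamReader would
    // decode a response stream with the given encoding.
    public static string ReadAll(byte[] body, Encoding encoding)
    {
        using (var stream = new MemoryStream(body))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Sample JSON text containing escapes; every character is ASCII.
        string wire = "{\"name\":\"K\\u0131r\\u0131nt\\u0131 Reklam\"}";
        byte[] body = Encoding.ASCII.GetBytes(wire);

        string asUtf8 = ReadAll(body, Encoding.UTF8);
        string asAscii = ReadAll(body, Encoding.ASCII);

        Console.WriteLine(asUtf8 == asAscii);           // True: same text either way
        Console.WriteLine(asUtf8.Contains("\\u0131"));  // True: escape is still literal
        Console.WriteLine(asUtf8.Contains("\u0131"));   // False: not decoded yet
    }
}
```

The escapes only disappear once the text is handed to a JSON parser.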



¹ Per RFC 4627, The application/json Media Type for JavaScript Object Notation (JSON):

  Any character may be escaped.  If the character is in the Basic
     Multilingual Plane (U+0000 through U+FFFF), then it may be
     represented as a six-character sequence: a reverse solidus, followed
     by the lowercase letter u, followed by four hexadecimal digits that
     encode the character's code point.  The hexadecimal letters A though
     F can be upper or lowercase.  So, for example, a string containing
     only a single reverse solidus character may be represented as
     "\u005C".
  
  Alternatively, there are two-character sequence escape
     representations of some popular characters.  So, for example, a
     string containing only a single reverse solidus character may be
     represented more compactly as "\\".
  
  To escape an extended character that is not in the Basic Multilingual
     Plane, the character is represented as a twelve-character sequence,
     encoding the UTF-16 surrogate pair.  So, for example, a string
     containing only the G clef character (U+1D11E) may be represented as
     "\uD834\uDD1E"

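The surrogate-pair case from the RFC excerpt can be checked the same way. This is only a sketch, again using System.Text.Json as the parser; `SurrogatePairEscape` and `DecodeJsonString` are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class SurrogatePairEscape
{
    // Hypothetical helper: parses a JSON document that is a single string value.
    public static string DecodeJsonString(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetString();
        }
    }

    public static void Main()
    {
        // JSON text "\uD834\uDD1E": the twelve-character escape for U+1D11E.
        string json = "\"\\uD834\\uDD1E\"";
        string s = DecodeJsonString(json);

        Console.WriteLine(s.Length);                                // 2 UTF-16 code units
        Console.WriteLine(char.ConvertToUtf32(s, 0).ToString("X")); // 1D11E
    }
}
```

The decoded .NET string holds two UTF-16 code units that together form the single code point U+1D11E.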

                        