How Do I Convert a Hex Entity to Its Equivalent UTF-8 Entity


Problem description

I would much appreciate it if someone could give me a solution for converting
a "HEX" entity to its equivalent "UTF-8".

For example :
I need to convert "&#x2019;" into "’"; here &#x2019; is the hex entity of "’", which is the single quote character.



Waiting for your valuable response.

Solution

If this is a Unicode code point, you can use the fact that its integer value is the same as the UTF-32 representation of the given character (byte order only matters once that value is written out as bytes). So, you can use this function: https://msdn.microsoft.com/en-us/library/system.char.convertfromutf32%28v=vs.110%29.aspx.
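To get from the entity text to that integer, the hex digits have to be parsed first. The following is a minimal sketch, assuming the input is a single reference of the form &#xHHHH;. The class HexEntity and its methods are hypothetical names introduced here for illustration, not part of .NET.

```csharp
using System;
using System.Globalization;

public static class HexEntity
{
    // Hypothetical helper: parses one hexadecimal character reference
    // such as "&#x2019;" and returns the code point it names.
    public static int ParseCodePoint(string entity)
    {
        if (entity == null ||
            !entity.StartsWith("&#x", StringComparison.OrdinalIgnoreCase) ||
            !entity.EndsWith(";", StringComparison.Ordinal))
        {
            throw new FormatException("Expected a reference of the form &#xHHHH;");
        }

        // Strip the leading "&#x" (3 chars) and the trailing ";" (1 char).
        string hexDigits = entity.Substring(3, entity.Length - 4);
        return int.Parse(hexDigits, NumberStyles.AllowHexSpecifier, CultureInfo.InvariantCulture);
    }

    // Converts the reference to the .NET string holding that character.
    public static string Decode(string entity)
    {
        return char.ConvertFromUtf32(ParseCodePoint(entity));
    }
}
```

With this sketch, HexEntity.Decode("&#x2019;") yields the one-character string "’" (U+2019).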

Interestingly, char.ConvertFromUtf32 returns not a character but a string. This is related to the peculiarities of the in-memory representation of strings in .NET: internally, UTF-16 is used, and characters beyond the BMP are represented as surrogate pairs. Formally, .NET treats a surrogate pair as two characters (which can produce an invalid string if you try to manipulate such a string "manually", so never do it), even though, from the Unicode standpoint, this 4-byte fragment of the string is really one character. You need to be careful with such cases: never operate on such a string byte by byte.
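The surrogate-pair point is easy to see in code. Here is a small sketch that counts code points rather than UTF-16 code units; SurrogateDemo and CodePointCount are hypothetical names used only for this illustration, while char.IsSurrogatePair is the standard .NET method.

```csharp
using System;

public static class SurrogateDemo
{
    // Counts Unicode code points, treating each surrogate pair as one character.
    // string.Length, by contrast, counts UTF-16 code units.
    public static int CodePointCount(string s)
    {
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            // If positions i and i + 1 form a valid pair, skip the low half
            // so the pair is counted once.
            if (char.IsSurrogatePair(s, i))
            {
                i++;
            }
            count++;
        }
        return count;
    }
}
```

For example, char.ConvertFromUtf32(0x1F600) gives a string whose Length is 2, while SurrogateDemo.CodePointCount reports 1 for it. Splitting that string between its two code units would leave an invalid lone surrogate.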

Once you have the string (let's call it someString), you can use it directly or represent it as UTF-8, which is always an array of bytes:

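int codePoint = /* ... */; // the Unicode code point to convert
string someString = char.ConvertFromUtf32(codePoint); // one or two UTF-16 code units
byte[] utf8 = System.Text.Encoding.UTF8.GetBytes(someString); // the UTF-8 encoded bytes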
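When the input is a whole piece of text containing entities rather than a single code point, one possible approach is to let the framework decode the references first. This is a sketch under that assumption, not the method described in the answer above; EntityToUtf8 and its members are hypothetical names, while System.Net.WebUtility.HtmlDecode, Encoding.UTF8.GetBytes and BitConverter.ToString are standard .NET calls.

```csharp
using System;
using System.Net;
using System.Text;

public static class EntityToUtf8
{
    // Decodes the character references in the text (for example "&#x2019;"),
    // then encodes the resulting string as UTF-8.
    public static byte[] ToUtf8Bytes(string textWithEntities)
    {
        string decoded = WebUtility.HtmlDecode(textWithEntities);
        return Encoding.UTF8.GetBytes(decoded);
    }

    // Formats bytes as hyphen-separated hex pairs for inspection.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes);
    }
}
```

For the input "&#x2019;" this gives the three bytes E2-80-99. Those are the bytes that display as "â€™" if they are later misread as Windows-1252 instead of UTF-8.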

Please see:
https://msdn.microsoft.com/en-us/library/system.text.encoding.utf8(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/ds4kkd55(v=vs.110).aspx
https://msdn.microsoft.com/en-us/library/system.text.encoding%28v=vs.110%29.aspx

—SA

