Is it possible to display (convert?) the Unicode hex \u0092 as a Unicode HTML entity in .NET?
Question
I have a string that contains the following code/value:
"You won\u0092t find a ...."
It looks like that string contains the Right Apostrophe special character.
I'm not sure how to display this in the web browser. It keeps displaying the tofu (square box) character instead. I'm under the impression that the Unicode (hex) value 00092 can be converted to the Unicode HTML entity for ’.
Is my understanding correct?
Update 1:
It was suggested by @sam-axe that I HtmlEncode the unicode. That didn't work. Here it is...
Note the ampersand got correctly encoded....
Answer
It looks like there's an encoding mix-up. In .NET, strings are normally encoded as UTF-16, and a right apostrophe should be represented as \u2019. But in your example, the right apostrophe is represented as \x92, which suggests the original encoding was Windows code page 1252. If you include your string in a Unicode document, the character \x92 won't be interpreted properly.
You can fix the problem by re-encoding your string as UTF-16. To do so, treat the string as an array of bytes, and then convert the bytes back to Unicode using the 1252 code page:
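using System.Linq;
using System.Text;

string title = "You won\u0092t find a cheaper apartment * Sauna & Spa";
// Every char in this string is below 256, so each one casts cleanly to a single byte.
byte[] bytes = title.Select(c => (byte)c).ToArray();
title = Encoding.GetEncoding(1252).GetString(bytes);
// Result: "You won’t find a cheaper apartment * Sauna & Spa"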
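Since the question asks for an HTML entity, the same idea can be taken one step further. The sketch below is not part of the original answer. It assumes .NET Core 3.0 or later, where code page 1252 is only available after registering `CodePagesEncodingProvider` (on .NET Framework that registration can be dropped). The class name `ApostropheFix` and both method names are hypothetical, chosen for illustration. `FixCp1252` re-decodes only characters in the C1 range U+0080 to U+009F, so text that is already valid Unicode passes through unchanged, and `ToNumericEntities` writes every non-ASCII character as a decimal numeric character reference.

```csharp
using System.Text;

public static class ApostropheFix
{
    // Code page 1252 is not built into .NET Core / .NET 5+, so register the provider first.
    private static readonly Encoding Cp1252 = CreateCp1252();

    private static Encoding CreateCp1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    // Re-decodes characters in the C1 range (U+0080..U+009F) as Windows-1252 bytes;
    // all other characters are copied as they are.
    public static string FixCp1252(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c >= '\u0080' && c <= '\u009F')
                sb.Append(Cp1252.GetString(new[] { (byte)c }));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Writes every non-ASCII code point as a decimal numeric character reference.
    // ASCII characters, including '&' and '<', are left untouched.
    public static string ToNumericEntities(string input)
    {
        var sb = new StringBuilder(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c < 128)
            {
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            // Combine a surrogate pair into a single code point.
            if (char.IsHighSurrogate(c) && i + 1 < input.Length && char.IsLowSurrogate(input[i + 1]))
            {
                codePoint = char.ConvertToUtf32(c, input[i + 1]);
                i++;
            }
            sb.Append("&#").Append(codePoint).Append(';');
        }
        return sb.ToString();
    }
}
```

With these helpers, `ApostropheFix.ToNumericEntities(ApostropheFix.FixCp1252(title))` turns the `\u0092` in the title into `&#8217;`. The ampersand and other HTML-special ASCII characters are not escaped by this sketch, so they would still need HTML encoding (for example with `WebUtility.HtmlEncode`) before the numeric-entity step.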