How do I convert Unicode escape sequences to Unicode characters in a .NET string?

Views: 180 | Published: 2015/11/24 11:39:41 | Tags: c# .net windows unicode


Question

Say you've loaded a text file into a string and you'd like to convert all Unicode escape sequences inside the string into the actual Unicode characters.

For example:

"The following is the top half of an integral character in unicode '\u2320', and this is the lower half '\U2321'."

I found an answer that works for me, and it follows.

Recommended answer

This is the answer that I came up with. It's simple and works well with strings of up to at least several thousand characters.

Example 1:

// Requires: using System.Globalization; using System.Text.RegularExpressions;
Regex rx = new Regex( @"\\[uU]([0-9A-F]{4})" );
result = rx.Replace( result, match => ((char) Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber)).ToString() );

Example 2:

Regex rx = new Regex( @"\\[uU]([0-9A-F]{4})" );
result = rx.Replace( result, delegate (Match match) { return ((char) Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber)).ToString(); } );

The first example makes the replacement using a lambda expression (C# 3.0), and the second uses an anonymous delegate, which should work with C# 2.0.
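For readers who want something they can paste into a project, here is a minimal sketch that wraps the lambda version in a helper. The class name `UnicodeUnescaper` and the method name `Unescape` are made up for illustration and are not part of the original answer; the pattern and the conversion are the same as in Example 1.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper class (name chosen for illustration only).
public static class UnicodeUnescaper
{
    // Same pattern as the answer: a backslash, 'u' or 'U', then four
    // uppercase hexadecimal digits.
    private static readonly Regex EscapePattern =
        new Regex(@"\\[uU]([0-9A-F]{4})");

    // Replaces every matching escape in 'text' with the character it names.
    public static string Unescape(string text)
    {
        return EscapePattern.Replace(
            text,
            match => ((char) Int32.Parse(match.Value.Substring(2),
                                         NumberStyles.HexNumber)).ToString());
    }
}
```

Calling `UnicodeUnescaper.Unescape(result)` on the example sentence from the question should replace both `\u2320` and `\U2321` with the two integral halves. Because the pattern lists only `A-F`, escapes written with lowercase hex digits are left untouched.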

To break down what's going on here, first we create a regular expression:

new Regex( @"\\[uU]([0-9A-F]{4})" );

Then we call Replace() with the string 'result' and an anonymous method (a lambda expression in the first example, a delegate in the second; the delegate could also be a regular method) that converts each match of the regular expression found in the string.
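The remark that the delegate could also be a regular method can be sketched as follows. The names `NamedEvaluatorExample` and `DecodeEscape` are hypothetical; the only requirement is that the method takes a `Match` and returns a `string`, which is the shape of the `MatchEvaluator` delegate.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical class name, used only to make the sketch self-contained.
public static class NamedEvaluatorExample
{
    // A regular (named) method with the MatchEvaluator signature.
    private static string DecodeEscape(Match match)
    {
        int value = Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber);
        return ((char) value).ToString();
    }

    public static string Unescape(string text)
    {
        Regex rx = new Regex(@"\\[uU]([0-9A-F]{4})");
        // Pass the named method instead of a lambda or anonymous delegate.
        return rx.Replace(text, new MatchEvaluator(DecodeEscape));
    }
}
```

A named method is mostly a readability choice: it behaves the same as the two examples, but it can be reused and tested separately.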

The Unicode escape is processed like this:

((char) Int32.Parse(match.Value.Substring(2), NumberStyles.HexNumber)).ToString()

Get the string representing the number part of the escape (skip the first two characters, the backslash and the 'u').

      match.Value.Substring(2)

Parse that string using Int32.Parse(), which takes the string and the number format that Parse() should expect, which in this case is a hexadecimal number.

      NumberStyles.HexNumber

Then we cast the resulting number to a Unicode character.

      (char)

And finally we call ToString() on the Unicode character, which gives us its string representation; this is the value passed back to Replace().

      .ToString()
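The four steps above can be written out one statement at a time, which may make the one-liner easier to follow. This is only an illustrative sketch: `StepByStepExample` and `DecodeSingleEscape` are invented names, and the method expects a single escape such as `\u2320` rather than a whole document.

```csharp
using System;
using System.Globalization;

// Hypothetical class name, used only for illustration.
public static class StepByStepExample
{
    // Converts one six-character escape (backslash, 'u' or 'U', four hex digits).
    public static string DecodeSingleEscape(string escape)
    {
        // Step 1: skip the backslash and the 'u', keeping the hex digits.
        string hexDigits = escape.Substring(2);

        // Step 2: parse the digits as a hexadecimal number.
        int value = Int32.Parse(hexDigits, NumberStyles.HexNumber);

        // Step 3: cast the number to a character (a UTF-16 code unit).
        char character = (char) value;

        // Step 4: turn the character into a string for Replace().
        return character.ToString();
    }
}
```

For the input `\u2320`, step 1 yields "2320", step 2 yields 8992 (hex 2320), and steps 3 and 4 yield the top half of the integral sign.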

Note: instead of grabbing the text to be converted with a Substring call, you could use the match parameter's GroupCollection and the subexpression in the regular expression to capture just the number ('2320'), but that's more complicated and less readable.
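For comparison, here is a rough sketch of the GroupCollection variant mentioned in the note. The class name `GroupCaptureExample` is hypothetical. One deliberate difference from the answer's pattern, which is an assumption on my part and not in the original: the character class also accepts lowercase `a-f`, so escapes such as `\u00e9` are converted too.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical class name, used only for illustration.
public static class GroupCaptureExample
{
    // Group 1 captures only the four hex digits.
    // Assumption: lowercase a-f is accepted here, unlike the original pattern.
    private static readonly Regex EscapePattern =
        new Regex(@"\\[uU]([0-9A-Fa-f]{4})");

    public static string Unescape(string text)
    {
        return EscapePattern.Replace(text, match =>
        {
            // Groups[0] is the whole match; Groups[1] is the first
            // parenthesized subexpression, i.e. just the digits.
            string hexDigits = match.Groups[1].Value;
            return ((char) Int32.Parse(hexDigits, NumberStyles.HexNumber)).ToString();
        });
    }
}
```

Whether this reads better than the Substring(2) version is a matter of taste; the capture group does avoid hard-coding the length of the escape prefix.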
