Reverse a string with accent chars?
Question
So I saw Jon Skeet's video, and there was a code sample:
There should have been a problem with the é after reversing, but I guess it only fails on .NET 2 (IMHO). Anyway, it did work for me, and I did see the correctly reversed string.
char[] a="Les Misérables".ToCharArray();
Array.Reverse(a);
string n= new string(a);
Console.WriteLine (n); //selbarésiM seL
But I took it further:
In Hebrew there is the "Alef" char: א
and I can add punctuation like: אֳ
(which I believe consists of 2 chars, yet is displayed as one.)
But now look what happens:
char[] a="Les Misאֳrables".ToCharArray();
Array.Reverse(a);
string n= new string(a);
Console.WriteLine (n); //selbarֳאsiM seL
There was a split...
I can understand why it is happening:
Console.WriteLine ("אֳ".Length); //2
So I was wondering whether there's a workaround for this kind of issue in C# (or should I build my own mechanism?)
Recommended answer
The problem is that Array.Reverse isn't aware that certain sequences of char values may combine to form a single character, or "grapheme", and thus must stay together rather than be reversed individually. You have to use something that understands Unicode combining character sequences, like TextElementEnumerator (http://msdn.microsoft.com/en-us/library/system.globalization.textelementenumerator.aspx):
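// using System.Collections.Generic;
// using System.Globalization;
TextElementEnumerator enumerator =
    StringInfo.GetTextElementEnumerator("Les Misאֳrables");
List<string> elements = new List<string>();
// Collect each text element (a base char plus its combining marks) as one string.
while (enumerator.MoveNext())
    elements.Add(enumerator.GetTextElement());
elements.Reverse();
// string.Concat(IEnumerable<string>) requires .NET 4 or later.
string reversed = string.Concat(elements); // selbarאֳsiM seL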
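If this is needed in more than one place, the same idea can be wrapped in a helper. The following is only a sketch, not part of the original answer: the class name GraphemeReverser and the method name ReverseTextElements are made up for illustration, and it uses StringInfo.ParseCombiningCharacters (which returns the start index of each text element) instead of the enumerator.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class GraphemeReverser
{
    // Hypothetical helper (name is an assumption, not from the answer):
    // reverses a string one text element at a time, so that a base char
    // and its combining marks keep their original internal order.
    public static string ReverseTextElements(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Start index of every text element in the string.
        int[] starts = StringInfo.ParseCombiningCharacters(input);
        var result = new StringBuilder(input.Length);

        // Walk the elements from last to first, copying each one intact.
        for (int i = starts.Length - 1; i >= 0; i--)
        {
            int end = (i + 1 < starts.Length) ? starts[i + 1] : input.Length;
            result.Append(input, starts[i], end - starts[i]);
        }
        return result.ToString();
    }
}
```

Assuming the אֳ in the question is U+05D0 followed by the combining point U+05B3, GraphemeReverser.ReverseTextElements("Les Mis\u05D0\u05B3rables") should give "selbar\u05D0\u05B3siM seL", with the mark still attached to the Alef. The same applies to an é written as "e\u0301" (e plus a combining acute accent). The é in the question's first sample was presumably the single precomposed char U+00E9, which would explain why the plain Array.Reverse appeared to work there.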