Reverse a string with accent chars?
Question
So I saw Jon Skeet's video, and there was a code sample:
There should have been a problem with the é after reversing, but I guess it only fails on .NET 2 (IMHO). Anyway, it did work for me and I did see the correctly reversed string.
char[] a="Les Misérables".ToCharArray();
Array.Reverse(a);
string n= new string(a);
Console.WriteLine (n); //selbarésiM seL
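A likely reason the sample behaves correctly here, independent of the .NET version: the literal probably stores é as the single precomposed char U+00E9. The sketch below assumes the failure case from the video used the decomposed form instead (e followed by the combining acute accent U+0301); the class and method names are made up for illustration.

```csharp
using System;

static class AccentDemo
{
    // Reverses UTF-16 code units, the same way the sample above does.
    public static string ReverseChars(string s)
    {
        char[] a = s.ToCharArray();
        Array.Reverse(a);
        return new string(a);
    }

    static void Main()
    {
        string composed = "Les Mis\u00E9rables";    // é as one char
        string decomposed = "Les Mise\u0301rables"; // e + combining acute accent

        Console.WriteLine(composed.Length);   // 14
        Console.WriteLine(decomposed.Length); // 15

        // Composed: the é is a single char, so it survives the reversal.
        Console.WriteLine(ReverseChars(composed) == "selbar\u00E9siM seL");    // True

        // Decomposed: the accent now follows the 'r' instead of the 'e'.
        Console.WriteLine(ReverseChars(decomposed) == "selbar\u0301esiM seL"); // True
    }
}
```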
But I took it a step further:
In Hebrew there is the "Alef" char : א
and I can add punctuation like : אֳ
( which I believe consists of 2 chars - yet displayed as one.)
But now look what happens :
char[] a="Les Misאֳrables".ToCharArray();
Array.Reverse(a);
string n= new string(a);
Console.WriteLine (n); //selbarֳאsiM seL
The character gets split...
I can understand why it is happening :
Console.WriteLine ("אֳ".Length); //2
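To see what the two chars are, one can print each code unit with its Unicode category. This is only a sketch: it assumes the mark in the question is U+05B3 (HEBREW POINT HATAF QAMATS), and the class and helper names are invented for the example. Any other Hebrew point would be classified the same way.

```csharp
using System;
using System.Globalization;

static class CombiningDemo
{
    // Number of text elements (user-visible units) as .NET sees them.
    public static int TextElementCount(string s)
    {
        return new StringInfo(s).LengthInTextElements;
    }

    static void Main()
    {
        // Assumed content of the string in the question: Alef + Hataf Qamats.
        string s = "\u05D0\u05B3";

        foreach (char c in s)
        {
            // Prints the code point and its category:
            // U+05D0 OtherLetter, then U+05B3 NonSpacingMark.
            Console.WriteLine("U+{0:X4} {1}", (int)c, CharUnicodeInfo.GetUnicodeCategory(c));
        }

        Console.WriteLine(s.Length);            // 2 chars
        Console.WriteLine(TextElementCount(s)); // 1 text element
    }
}
```

The second char is a non-spacing mark, which is why it is drawn on top of the preceding letter and why it must stay attached to that letter when reversing.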
So I was wondering if there's a workaround for this kind of issue in C# ( or should I build my own mechanism....)
Recommended answer
The problem is that Array.Reverse isn't aware that certain sequences of char values may combine to form a single character, or "grapheme", and thus shouldn't be reversed. You have to use something that understands Unicode combining character sequences, like TextElementEnumerator:
// using System.Collections.Generic;
// using System.Globalization;
TextElementEnumerator enumerator =
StringInfo.GetTextElementEnumerator("Les Misאֳrables");
List<string> elements = new List<string>();
while (enumerator.MoveNext())
elements.Add(enumerator.GetTextElement());
elements.Reverse();
string reversed = string.Concat(elements); // selbarאֳsiM seL
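The same idea can be wrapped in a reusable helper. This is a minimal sketch rather than a definitive implementation: the class and method names are hypothetical, and the Hebrew input is written with escapes on the assumption that the mark in the question is U+05B3.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class TextElementReverser
{
    // Reverses a string one text element at a time, so a base character
    // and the combining marks that follow it stay together.
    public static string ReverseTextElements(string input)
    {
        var elements = new List<string>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(input);
        while (enumerator.MoveNext())
            elements.Add(enumerator.GetTextElement());
        elements.Reverse();
        return string.Concat(elements);
    }

    static void Main()
    {
        string hebrew = "Les Mis\u05D0\u05B3rables";
        Console.WriteLine(ReverseTextElements(hebrew) == "selbar\u05D0\u05B3siM seL"); // True

        // Also handles the decomposed é (e + U+0301).
        string accent = "Les Mise\u0301rables";
        Console.WriteLine(ReverseTextElements(accent) == "selbare\u0301siM seL");      // True
    }
}
```

What counts as a text element depends on the runtime: older .NET Framework versions group surrogate pairs and combining character sequences, while .NET 5 and later follow Unicode extended grapheme clusters. Both treat a base letter plus a combining mark as one element.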