Six-digit Unicode escaped value comparison
Question
I have a six-digit Unicode character, for example U+100000, which I wish to compare with another char in my C# code.

My reading of the MSDN documentation is that this character cannot be represented by a char, and must instead be represented by a string:

"A Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal."

I feel that I'm missing something obvious, but how can you get the following comparison to work correctly?

    public bool IsCharLessThan(char myChar, string upperBound)
    {
        return myChar < upperBound; // will not compile as a char is not comparable to a string
    }

    Assert.IsTrue(AnExample('\u0066', "\u100000"));
    Assert.IsFalse(AnExample("\u100000", "\u100000")); // again won't compile as this is a string and not a char

Edit

OK, I think I need two methods, one to accept chars and another to accept "big chars", i.e. strings. So:

    public bool IsCharLessThan(char myChar, string upperBound)
    {
        return true; // every char is less than a BigChar
    }

    public bool IsCharLessThan(string myBigChar, string upperBound)
    {
        return string.Compare(myBigChar, upperBound) < 0;
    }

    Assert.IsTrue(AnExample('\u0066', "\u100000"));
    Assert.IsFalse(AnExample("\u100022", "\u100000"));

Solution

To construct a string with the Unicode code point U+10FFFF using a string literal, you need to work out the surrogate pair involved. In this case, you need:

    string bigCharacter = "\uDBFF\uDFFF";

Or you can use char.ConvertFromUtf32:

    string bigCharacter = char.ConvertFromUtf32(0x10FFFF);

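To make "working out the surrogate pair" concrete, here is a minimal sketch of the arithmetic. The class and method names (SurrogateDemo, ToSurrogatePair) are hypothetical and only for illustration; in real code char.ConvertFromUtf32 already does this for you.

```csharp
using System;

public static class SurrogateDemo
{
    // Splits a supplementary code point (U+10000..U+10FFFF) into its
    // UTF-16 surrogate pair. Hypothetical helper, for illustration only.
    public static string ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        int offset = codePoint - 0x10000;              // leaves a 20-bit value
        char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new string(new[] { high, low });
    }
}
```

For U+10FFFF this gives the pair DBFF, DFFF shown above, and for the U+100000 from the question it gives DBC0, DC00. C# string literals also accept the eight-digit form "\U0010FFFF", which the compiler turns into the same surrogate pair; the four-digit \u escape reads only four hex digits, which is why "\u100000" does not mean U+100000.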
It's not clear what you want your method to achieve, but if you need it to work with characters not in the BMP, you'll need to make it accept int instead of char, or a string.

As per the documentation for string, if you want to iterate over characters in a string as full Unicode values, use TextElementEnumerator or StringInfo.

Note that you do need to do this explicitly. If you just use ordinal values, it will check UTF-16 code units, not the UTF-32 code points. For example:

    string text = "\uF000";
    string upperBound = "\uDBFF\uDFFF";
    Console.WriteLine(string.Compare(text, upperBound, StringComparison.Ordinal));

This prints out a value greater than zero, suggesting that text is greater than upperBound here. Instead, you should use char.ConvertToUtf32:

    string text = "\uF000";
    string upperBound = "\uDBFF\uDFFF";
    int textUtf32 = char.ConvertToUtf32(text, 0);
    int upperBoundUtf32 = char.ConvertToUtf32(upperBound, 0);
    Console.WriteLine(textUtf32 < upperBoundUtf32); // True

So that's probably what you need to do in your method. You might want to use StringInfo.LengthInTextElements to check that the strings really are single UTF-32 code points first.
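
Putting the pieces together, the method from the question might look something like the sketch below. The names CodePointComparer, IsSingleCodePoint and IsCodePointLessThan are hypothetical. Note one deliberate difference from the suggestion above: instead of StringInfo.LengthInTextElements, this sketch checks the UTF-16 length and surrogates directly, on the assumption that you want exactly one code point (a single text element can also be a base character plus combining marks).

```csharp
using System;

public static class CodePointComparer
{
    // True if s holds exactly one Unicode code point: either a single
    // non-surrogate char or a single valid surrogate pair.
    public static bool IsSingleCodePoint(string s)
    {
        if (s == null) return false;
        if (s.Length == 1) return !char.IsSurrogate(s[0]);
        return s.Length == 2 && char.IsSurrogatePair(s[0], s[1]);
    }

    // Compares by code point value rather than by UTF-16 code unit.
    public static bool IsCodePointLessThan(string value, string upperBound)
    {
        if (!IsSingleCodePoint(value))
            throw new ArgumentException("Expected a single code point.", nameof(value));
        if (!IsSingleCodePoint(upperBound))
            throw new ArgumentException("Expected a single code point.", nameof(upperBound));

        return char.ConvertToUtf32(value, 0) < char.ConvertToUtf32(upperBound, 0);
    }

    // Convenience overload for a BMP character passed as a char.
    public static bool IsCodePointLessThan(char value, string upperBound)
    {
        return IsCodePointLessThan(value.ToString(), upperBound);
    }
}
```

With this, the assertions from the question can be written with real surrogate pairs: IsCodePointLessThan('\u0066', "\uDBC0\uDC00") is true, and IsCodePointLessThan("\uDBC0\uDC22", "\uDBC0\uDC00") is false, where "\uDBC0\uDC00" is U+100000 and "\uDBC0\uDC22" is U+100022.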