Six-digit Unicode escaped value comparison


Problem description


I have a six-digit Unicode character, for example U+100000, which I want to compare with another char in my C# code.

My reading of the MSDN documentation is that this character cannot be represented by a char, and must instead be represented by a string.

a Unicode character in the range U+10000 to U+10FFFF is not permitted in a character literal and is represented using a Unicode surrogate pair in a string literal

I feel that I'm missing something obvious, but how can you get the following comparison to work correctly:

public bool IsCharLessThan(char myChar, string upperBound)
{
    return myChar < upperBound; // will not compile as a char is not comparable to a string
}

Assert.IsTrue(IsCharLessThan('\u0066', "\u100000"));
Assert.IsFalse(IsCharLessThan("\u100000", "\u100000")); // again won't compile as this is a string and not a char

Edit

OK, I think I need two methods, one to accept chars and another to accept 'big chars', i.e. strings. So:

public bool IsCharLessThan(char myChar, string upperBound)
{
    return true; // every char is less than a BigChar
}

public bool IsCharLessThan(string myBigChar, string upperBound)
{
    return string.Compare(myBigChar, upperBound) < 0;
}

Assert.IsTrue(IsCharLessThan('\u0066', "\u100000"));
Assert.IsFalse(IsCharLessThan("\u100022", "\u100000"));

Solution

To construct a string with the Unicode code point U+10FFFF using a string literal, you need to work out the surrogate pair involved.

In this case, you need:

string bigCharacter = "\uDBFF\uDFFF";

Or you can use char.ConvertFromUtf32:

string bigCharacter = char.ConvertFromUtf32(0x10FFFF);
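
As a rough sketch of how the surrogate pair is worked out: subtract 0x10000 from the code point, add the top ten bits of the result to 0xD800 for the high surrogate, and add the bottom ten bits to 0xDC00 for the low surrogate. The class and method names below are hypothetical and exist only for illustration; in real code char.ConvertFromUtf32 does the same job.

```csharp
using System;

public static class SurrogateDemo
{
    // Hypothetical helper: encodes a supplementary code point
    // (U+10000..U+10FFFF) as a two-char UTF-16 string by hand.
    public static string ToSurrogatePair(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
            throw new ArgumentOutOfRangeException(nameof(codePoint));

        int offset = codePoint - 0x10000;              // 20-bit value
        char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new string(new[] { high, low });
    }
}
```

For U+10FFFF this gives "\uDBFF\uDFFF", and for the U+100000 from the question it gives "\uDBC0\uDC00". Note that the \u escape takes exactly four hex digits, so "\u100000" is U+1000 followed by the text "00". C# string literals also accept the eight-digit form "\U0010FFFF", which the compiler turns into the same surrogate pair.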

It's not clear what you want your method to achieve, but if you need it to work with characters outside the BMP, you'll need to make it accept an int (a full code point) instead of a char, or accept a string.

As per the documentation for string, if you want to iterate over characters in a string as full Unicode values, use TextElementEnumerator or StringInfo.
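
A minimal sketch of that iteration, assuming the input contains no lone surrogates (char.ConvertToUtf32 throws on those). The class and method names are hypothetical. Keep in mind that a text element is a grapheme cluster: a surrogate pair stays together, but so does a base character followed by combining marks, so this sketch only reports the first code point of each element.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper: returns the first code point of every
    // text element in the string, keeping surrogate pairs intact.
    public static List<int> FirstCodePoints(string s)
    {
        var result = new List<int>();
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(s);
        while (e.MoveNext())
        {
            string element = e.GetTextElement();
            result.Add(char.ConvertToUtf32(element, 0));
        }
        return result;
    }
}
```

For example, TextElementDemo.FirstCodePoints("a\uDBFF\uDFFFb") yields three values (0x61, 0x10FFFF, 0x62), whereas indexing the string char by char would yield four UTF-16 code units.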

Note that you do need to do this explicitly. If you just use ordinal values, it will check UTF-16 code units, not the UTF-32 code points. For example:

string text = "\uF000";
string upperBound = "\uDBFF\uDFFF";
Console.WriteLine(string.Compare(text, upperBound, StringComparison.Ordinal));

This prints out a value greater than zero, suggesting that text is greater than upperBound here. Instead, you should use char.ConvertToUtf32:

string text = "\uF000";
string upperBound = "\uDBFF\uDFFF";
int textUtf32 = char.ConvertToUtf32(text, 0);
int upperBoundUtf32 = char.ConvertToUtf32(upperBound, 0);
Console.WriteLine(textUtf32 < upperBoundUtf32); // True

So that's probably what you need to do in your method. You might want to use StringInfo.LengthInTextElements to check that the strings really are single UTF-32 code points first.
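
Putting those pieces together, the method from the question could look something like the sketch below. The class name, method name, and overload set are assumptions made for illustration, not something the original code defines. The LengthInTextElements check follows the suggestion above; it rejects multi-character strings, though a base character plus a combining mark still counts as one text element and would be compared by its first code point.

```csharp
using System;
using System.Globalization;

public static class CodePointComparer
{
    // Compares two code points that are already plain integers.
    public static bool IsCodePointLessThan(int codePoint, int upperBound)
        => codePoint < upperBound;

    // Compares a single UTF-16 code unit against a one-character string.
    public static bool IsCodePointLessThan(char value, string upperBound)
        => value < ToCodePoint(upperBound, nameof(upperBound));

    // Compares two one-character strings by code point, not by code unit.
    public static bool IsCodePointLessThan(string value, string upperBound)
        => ToCodePoint(value, nameof(value)) < ToCodePoint(upperBound, nameof(upperBound));

    private static int ToCodePoint(string s, string paramName)
    {
        if (string.IsNullOrEmpty(s) || new StringInfo(s).LengthInTextElements != 1)
            throw new ArgumentException("Expected exactly one text element.", paramName);
        return char.ConvertToUtf32(s, 0);
    }
}
```

With this sketch, IsCodePointLessThan('\u0066', "\uDBC0\uDC00") is true, and IsCodePointLessThan("\uDBC0\uDC00", "\uDBC0\uDC00") is false, which matches the two assertions the question was aiming for.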
