Why does Char.IsDigit return true for chars which can't be parsed to int?
I often use Char.IsDigit to check whether a char is a digit, which is especially handy in LINQ queries as a pre-check before int.Parse, as here: "123".All(Char.IsDigit).
But there are chars which are digits but which can't be parsed to int, like ۵.
// Requires: using System; using System.Globalization; using System.Linq;
// true
bool isDigit = Char.IsDigit('۵');
var cultures = CultureInfo.GetCultures(CultureTypes.SpecificCultures);
int num;
// false
bool isIntForAnyCulture = cultures
    .Any(c => int.TryParse('۵'.ToString(), NumberStyles.Any, c, out num));
Why is that? Is my int.Parse pre-check via Char.IsDigit thus incorrect?
There are 310 chars which are digits:
// Range's second argument is a count, so + 1 is needed to include U+FFFF.
List<char> digitList = Enumerable.Range(0, UInt16.MaxValue + 1)
    .Select(i => Convert.ToChar(i))
    .Where(c => Char.IsDigit(c))
    .ToList();
Here's the implementation of Char.IsDigit in .NET 4 (ILSpy):
public static bool IsDigit(char c)
{
    if (char.IsLatin1(c))
    {
        return c >= '0' && c <= '9';
    }
    return CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.DecimalDigitNumber;
}
So why are there chars that belong to the DecimalDigitNumber category ("Decimal digit character, that is, a character in the range 0 through 9...") which can't be parsed to an int in any culture?
It's because it is checking for all digits in the Unicode "Number, Decimal Digit" category, as listed here:
http://www.fileformat.info/info/unicode/category/Nd/list.htm
It doesn't mean that the character is a valid numeric character in the current locale. In fact, int.Parse() can only parse the ASCII digits 0 through 9, regardless of the locale setting.
For example, this doesn't work:
int test = int.Parse("٣", CultureInfo.GetCultureInfo("ar"));
Even though ٣ is a valid Arabic digit character, and "ar" is the Arabic locale identifier.
The Microsoft article "How to: Parse Unicode Digits" states that:
The only Unicode digits that the .NET Framework parses as decimals are the ASCII digits 0 through 9, specified by the code values U+0030 through U+0039. The .NET Framework parses all other Unicode digits as characters.
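Given that statement, a pre-check that matches what int.Parse accepts has to test for the ASCII range explicitly rather than rely on Char.IsDigit. A minimal sketch follows; the class and method names are illustrative and not part of the framework:

```csharp
using System;
using System.Linq;

public static class AsciiDigitCheck
{
    // Hypothetical helper: true only for non-empty strings made up
    // entirely of the ASCII digits U+0030 through U+0039.
    public static bool IsAsciiDigits(string s)
    {
        return !string.IsNullOrEmpty(s) && s.All(c => c >= '0' && c <= '9');
    }
}
```

With this check, "123" passes while "\u06F5" (۵) is rejected, even though Char.IsDigit returns true for it. A long run of digits can still overflow an int, so int.TryParse remains the safer final step.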
However, note that you can use char.GetNumericValue() to convert a Unicode numeric character to its numeric equivalent as a double.
The return value is a double rather than an int because of characters like this:
Console.WriteLine(char.GetNumericValue('¼')); // Prints 0.25
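To make the distinction concrete, here is a small sketch comparing char.IsDigit, char.IsNumber and char.GetNumericValue on a few characters. The characters are written as \u escapes to avoid source-encoding problems; the class name is made up for the example:

```csharp
using System;

public static class NumericValueDemo
{
    public static void Main()
    {
        char persianFive = '\u06F5'; // ۵, category DecimalDigitNumber
        char oneQuarter = '\u00BC';  // ¼, category OtherNumber

        Console.WriteLine(char.IsDigit(persianFive));                // True
        Console.WriteLine(char.GetNumericValue(persianFive) == 5.0); // True

        // A fraction is a number but not a decimal digit.
        Console.WriteLine(char.IsDigit(oneQuarter));                 // False
        Console.WriteLine(char.IsNumber(oneQuarter));                // True
        Console.WriteLine(char.GetNumericValue(oneQuarter) == 0.25); // True

        // Non-numeric characters yield -1.
        Console.WriteLine(char.GetNumericValue('a') == -1.0);        // True
    }
}
```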
You could use something like this to convert all decimal digit characters in a string into their ASCII equivalents:
// Requires: using System.Text;
public string ConvertNumericChars(string input)
{
    StringBuilder output = new StringBuilder();
    foreach (char ch in input)
    {
        if (char.IsDigit(ch))
        {
            double value = char.GetNumericValue(ch);
            // Only map whole values in 0..9 onto the ASCII digits.
            if ((value >= 0) && (value <= 9) && (value == (int)value))
            {
                output.Append((char)('0' + (int)value));
                continue;
            }
        }
        // Everything else is copied through unchanged.
        output.Append(ch);
    }
    return output.ToString();
}
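The same idea can be folded into a single parsing helper. The sketch below is one possible variant and not the method from the answer: it uses CharUnicodeInfo.GetDecimalDigitValue instead of char.GetNumericValue, rejects any non-digit character, and leaves overflow handling to int.TryParse. The names UnicodeDigitParser and TryParseUnicodeDigits are hypothetical, and it only handles characters in the Basic Multilingual Plane:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class UnicodeDigitParser
{
    // Hypothetical helper: maps every decimal digit char to its ASCII
    // equivalent, then defers to int.TryParse for the actual parsing.
    public static bool TryParseUnicodeDigits(string input, out int result)
    {
        result = 0;
        if (string.IsNullOrEmpty(input))
        {
            return false;
        }

        var ascii = new StringBuilder(input.Length);
        foreach (char ch in input)
        {
            // Returns 0..9 for decimal digits, -1 for anything else.
            int digit = CharUnicodeInfo.GetDecimalDigitValue(ch);
            if (digit < 0)
            {
                return false;
            }
            ascii.Append((char)('0' + digit));
        }

        // NumberStyles.None allows digits only: no sign, no whitespace.
        return int.TryParse(ascii.ToString(), NumberStyles.None,
                            CultureInfo.InvariantCulture, out result);
    }
}
```

For example, TryParseUnicodeDigits("\u0663\u06F5", out n) should succeed with n equal to 35, whereas int.Parse on the same string throws a FormatException.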