In Unicode, why are there two representations for the Arabic digits?
Question
I was reading the specification of Unicode @ Wikipedia (Arabic Unicode) and I see that each of the Arabic digits has 2 Unicode code points. For example 1 is defined as U+0661 and as U+06F1.
Which one should I use?
Answer
According to the code charts, U+0660 .. U+0669 are ARABIC-INDIC DIGIT values 0 through 9, while U+06F0 .. U+06F9 are EXTENDED ARABIC-INDIC DIGIT values 0 through 9.
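The character names and digit values can also be checked programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe_range` is made up for illustration and is not part of any library.

```python
import unicodedata


def describe_range(start):
    """Return (label, character name, digit value) for ten consecutive code points."""
    rows = []
    for cp in range(start, start + 10):
        ch = chr(cp)
        rows.append((f"U+{cp:04X}", unicodedata.name(ch), unicodedata.digit(ch)))
    return rows


# U+0660..U+0669 and U+06F0..U+06F9, the two ranges named in the code charts
arabic_indic = describe_range(0x0660)
extended_arabic_indic = describe_range(0x06F0)

for row in arabic_indic + extended_arabic_indic:
    print(*row)
```

Both ranges report digit values 0 through 9; only the names (and, for some digits, the glyphs) differ.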
In the Unicode 3.0 book (5.2 is the current version, but these things don't change much once set), the U+066n series of glyphs are marked 'Arabic-Indic digits' and the U+06Fn series of glyphs are marked 'Eastern Arabic-Indic digits (Persian and Urdu)'. It also notes:
- U+06F4 - 'different glyphs in Persian and Urdu'
- U+06F5 - 'Persian and Urdu share glyph different from Arabic'
- U+06F6 - 'Persian glyph different from Arabic'
- U+06F7 - 'Urdu glyph different from Arabic'
For comparison:
- U+066n: ٠١٢٣٤٥٦٧٨٩
- U+06Fn: ۰۱۲۳۴۵۶۷۸۹
Or, digit by digit:
Digit  U+066n  U+06Fn
0      ٠       ۰
1      ١       ۱
2      ٢       ۲
3      ٣       ۳
4      ٤       ۴
5      ٥       ۵
6      ٦       ۶
7      ٧       ۷
8      ٨       ۸
9      ٩       ۹
(Whether you can see any of those, and how clearly they are differentiated may depend on your browser and the fonts installed on your machine as much as anything else. I can see the difference on 4 and 6 clearly; 5 looks much the same in both.)
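Because the two sets line up position by position, converting text from one set to the other is plain offset arithmetic. Here is a rough sketch using `str.translate`; the function and table names are invented for this example.

```python
# Map each code point in one block to the same digit in the other block.
TO_EXTENDED = {0x0660 + i: 0x06F0 + i for i in range(10)}
TO_ARABIC_INDIC = {0x06F0 + i: 0x0660 + i for i in range(10)}


def to_extended(text):
    """Replace ARABIC-INDIC digits with EXTENDED ARABIC-INDIC digits."""
    return text.translate(TO_EXTENDED)


def to_arabic_indic(text):
    """Replace EXTENDED ARABIC-INDIC digits with ARABIC-INDIC digits."""
    return text.translate(TO_ARABIC_INDIC)


# U+0664 U+0665 (Arabic 4, 5) becomes U+06F4 U+06F5 (Persian/Urdu 4, 5)
print(to_extended("\u0664\u0665"))
```

All other characters pass through unchanged, so this can be applied to whole strings. Whether such a conversion is appropriate depends on the language of the text, which the code cannot know on its own.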
Based on this information, if you are working with Arabic from the Middle East, use the U+066n series of digits; if you are working with Persian or Urdu, use the U+06Fn series of digits. As a Unicode application, you should accept either set of codes as valid digits (but you might look askance at a sequence that mixed the two sets of digits - or you might just leave well alone).
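As one possible way to act on that advice, the sketch below accepts either digit set and reports when a string mixes the two. It relies on the fact that Python's built-in `int()` parses any Unicode decimal digits; the helper names `digit_sets_used` and `is_mixed` are assumptions made for this example.

```python
ARABIC_INDIC = range(0x0660, 0x066A)           # U+0660..U+0669
EXTENDED_ARABIC_INDIC = range(0x06F0, 0x06FA)  # U+06F0..U+06F9


def digit_sets_used(text):
    """Return the names of the Arabic digit sets that occur in text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        if cp in ARABIC_INDIC:
            found.add("arabic-indic")
        elif cp in EXTENDED_ARABIC_INDIC:
            found.add("extended-arabic-indic")
    return found


def is_mixed(text):
    """True if text contains digits from both sets."""
    return len(digit_sets_used(text)) > 1


# int() accepts either set, and would accept a mixture as well
print(int("\u0661\u0662\u0663"))   # ARABIC-INDIC 1, 2, 3
print(int("\u06F1\u06F2\u06F3"))   # EXTENDED ARABIC-INDIC 1, 2, 3
print(is_mixed("\u0661\u06F2"))    # one digit from each set
```

Whether a mixed sequence should be rejected, warned about, or silently accepted is a policy decision for the application; the check above only makes the mixture visible.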