Arabic: 'source' Unicode to final display Unicode


Question

A simple problem: this is the final display string I am looking for:

لعبةديدة

Below is each of the separate characters before being 'glued' together (I've put a space between each of them to stop the joining):

ل ع ب ة د ي د ة

Note how they are NOT the same characters: there is some magical transform that melds them together and converts them to new Unicode characters.

On top of that, the characters in the first string actually appear right to left (in memory, they are stored left to right).

So my simple question is this: where do I get a platform-independent C/C++ function that will take my source 16-bit Unicode string and transform it into the Unicode string that produces the first string quoted above, doing both the RTL conversion and the joining?

That's all I want: one function that does that.

Update:

OK, yes, I know that the 'characters' are the same in the two examples above; they are the same 'letters'. But (viewing in Chrome or the latest IE) anyone can CLEARLY see that the glyphs are different. I'm fairly confident that this transform can be done at the Unicode level, because my font file and the Unicode standard both seem to specify different glyphs for the separate and the various joined versions of the characters/letters (unicode.org/charts/PDF/UFB50.pdf, unicode.org/charts/PDF/UFE70.pdf).

So, can I just put my Unicode into a function and get the transformed Unicode out?

Recommended answer

The joining and the RTL conversion don't happen at the level of Unicode characters.

In other words: neither the order of the characters nor the actual Unicode code points are changed during this process.
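To make the point about memory order concrete, here is a minimal Python sketch that lists the code points of a string exactly as they are stored. The sample is assumed to be the first four letters of the string in the question, and `SOURCE` and `describe` are names invented for this illustration.

```python
import unicodedata

# Assumed sample: the first four letters of the question's string, written
# as escapes so the listing does not depend on how this page renders Arabic.
SOURCE = "\u0644\u0639\u0628\u0629"


def describe(text):
    """Return (index, code point, character name) tuples in memory order."""
    return [
        (index, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for index, ch in enumerate(text)
    ]


for row in describe(SOURCE):
    print(*row)
```

Index 0 is the first letter written and pronounced (LAM), even though a renderer draws it at the right-hand end of the word. Every entry is a plain letter from the basic Arabic block (U+0600 to U+06FF); no joined form is stored.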

In fact, both the joining and the handling of RTL/LTR transitions are done by the text rendering engine.
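The charts linked in the question do list separate code points for isolated, initial, medial and final forms. In the Unicode Character Database these are compatibility characters that decompose back to the ordinary letter. The sketch below checks this for the letter BEH using Python's standard `unicodedata` module; `BEH` and `PRESENTATION_FORMS` are names chosen for this example only.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH, the code point normally stored in text

# The four positional variants of BEH in Arabic Presentation Forms-B.
PRESENTATION_FORMS = {
    "isolated": "\uFE8F",
    "final": "\uFE90",
    "initial": "\uFE91",
    "medial": "\uFE92",
}

for label, ch in PRESENTATION_FORMS.items():
    # decomposition() returns the compatibility mapping as a string,
    # for example "<isolated> 0628".
    mapping = unicodedata.decomposition(ch)
    folds_back = unicodedata.normalize("NFKC", ch) == BEH
    print(f"{label:9} U+{ord(ch):04X} {mapping!r} NFKC gives BEH: {folds_back}")
```

All four variants normalize to the same base letter, which is consistent with the answer: the stored text keeps U+0628, and choosing which shape to draw is left to the renderer and the font.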

This quote from the Wikipedia article on the Arabic alphabet explains it quite nicely:

Finally, the Unicode encoding of Arabic is in logical order, that is, the characters are entered, and stored in computer memory, in the order that they are written and pronounced without worrying about the direction in which they will be displayed on paper or on the screen. Again, it is left to the rendering engine to present the characters in the correct direction, using Unicode's bi-directional text features. In this regard, if the Arabic words on this page are written left to right, it is an indication that the Unicode rendering engine used to display them is out-of-date.
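The bi-directional text features mentioned in the quote start from a per-character property, the Bidi_Class. The following Python sketch only groups a mixed string into runs that share that property. It is not the Unicode Bidirectional Algorithm itself, and `bidi_runs` and `sample` are hypothetical names used for illustration.

```python
import unicodedata
from itertools import groupby


def bidi_runs(text):
    """Group consecutive characters that share the same Bidi_Class value."""
    return [
        (bidi_class, "".join(chars))
        for bidi_class, chars in groupby(text, key=unicodedata.bidirectional)
    ]


# Latin letters, a space, four Arabic letters, a space, two digits,
# all stored in logical (typing) order.
sample = "abc \u0644\u0639\u0628\u0629 12"

for bidi_class, run in bidi_runs(sample):
    print(bidi_class, [f"U+{ord(ch):04X}" for ch in run])
```

A rendering engine combines classes such as `L` (left to right), `AL` (Arabic letter), `WS` (whitespace) and `EN` (European number) to decide the visual order of each run, while the stored string stays as it was typed.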
