Vim: UTF-8 ې character breaks displayed string

Question

I have a file with the hex content db90 3031 46, which should be displayed in Vim as "ې" followed by "01F", but I noticed that it is never displayed correctly. The same thing happens elsewhere, such as in the terminal and the browser: I always get ې01F. Why is that? Paste it into Google and try it yourself; you will never be able to put "ې" with 0 as the next character.
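As a quick check of the bytes in the question, here is a minimal Python sketch (the variable names are illustrative, not from the original post). It decodes the five bytes and lists the code points in logical (storage) order, which is independent of the order a terminal or browser draws them in.

```python
import unicodedata

# The five bytes from the question: db 90 30 31 46
raw = bytes.fromhex("db90303146")
text = raw.decode("utf-8")

# Code points in logical (storage) order; the display order may differ
# once a renderer applies the Unicode bidirectional algorithm.
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The stored order is the Arabic letter first, then 0, 1 and F, so the file itself is fine; only the rendering reorders the characters.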

Recommended answer

That's an Arabic character, which has right-to-left directionality, so you probably need to switch back to left-to-right mode, for example with U+200E (LEFT-TO-RIGHT MARK).

Unicode bidirectional handling is rather complex. The behaviour you are seeing is probably caused by the fact that the Latin digits are classed as EN = European number (a weak type), while letters such as F are classed as L = left to right (a strong type).
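The classes mentioned above can be inspected with Python's standard unicodedata module. This is a small sketch for illustration only; the sample list and the dictionary name are assumptions, not part of the original answer.

```python
import unicodedata

# Characters from the question, plus U+200E (LEFT-TO-RIGHT MARK).
samples = ["\u06d0", "0", "1", "F", "\u200e"]

# Map each code point to its Unicode bidirectional class.
bidi_classes = {f"U+{ord(ch):04X}": unicodedata.bidirectional(ch) for ch in samples}
print(bidi_classes)
```

The Arabic letter reports AL (Arabic letter, a strong right-to-left type), the digits report EN, and both F and U+200E report L.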

Weak types are treated differently in the Unicode specification, as in this quote, which covers your particular case (my emphasis):

Problematic cases may occur when a right-to-left paragraph begins with left-to-right characters, or there are nested segments of different-direction text, or there are weak characters on directional boundaries. In these cases, embeddings or directional marks may be required to get the right display.

So your code point followed by a digit renders as "ې7" (I typed the 7 after the Arabic character, even though it shows up before it), while following it with a letter gives "ېX".

For what it's worth, the correctly ordered text "ې‎7" was generated here by inserting an HTML character reference for the U+200E code point (such as &lrm;) between the two characters.
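The same idea can be applied to a plain string rather than HTML. The sketch below is one possible approach, not the answerer's method: insert_lrm_before_digits is a hypothetical helper that adds U+200E wherever a European digit directly follows a right-to-left character. Whether Vim or a given terminal then draws the text in the intended order depends on its bidi support.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK

def insert_lrm_before_digits(text: str) -> str:
    """Insert U+200E where a European digit directly follows an RTL character.

    Hypothetical helper for illustration; it only handles this one boundary case.
    """
    out = []
    prev = ""
    for ch in text:
        follows_rtl = prev != "" and unicodedata.bidirectional(prev) in ("R", "AL")
        if follows_rtl and unicodedata.bidirectional(ch) == "EN":
            out.append(LRM)
        out.append(ch)
        prev = ch
    return "".join(out)

fixed = insert_lrm_before_digits("\u06d001F")
print([f"U+{ord(c):04X}" for c in fixed])
```

Because the mark is a strong left-to-right character, the digits after it no longer attach to the preceding Arabic letter when the bidirectional algorithm resolves the weak types.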

If you head over to this UTF-8 codec site and enter %u06D0%u200e7 into the decoding section, you'll see that it comes out in your desired order (removing the %u200e shows it in the order you describe in your question).
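If the codec site is not at hand, the two escaped strings can be decoded locally. This is a rough sketch: decode_percent_u is a hypothetical helper that handles only the non-standard %uXXXX form with exactly four hex digits, which is an assumption about the input format used above.

```python
import re

def decode_percent_u(escaped: str) -> str:
    """Decode %uXXXX escapes (four hex digits) into characters.

    Hypothetical helper for illustration; other percent-escapes are left untouched.
    """
    return re.sub(
        r"%u([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        escaped,
    )

with_mark = decode_percent_u("%u06D0%u200e7")
without_mark = decode_percent_u("%u06D07")

# Show the UTF-8 bytes of each variant.
print(with_mark.encode("utf-8").hex(" "))     # db 90 e2 80 8e 37
print(without_mark.encode("utf-8").hex(" "))  # db 90 37
```

The only difference between the two strings is the three-byte sequence e2 80 8e, the UTF-8 encoding of U+200E.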
