What limitations does the Windows console have regarding Unicode?


Question


It is possible to write Unicode characters to the Windows console using the WriteConsoleW function. On my Windows 7 machine, it looks like the console does not support characters outside the Basic Multilingual Plane. Also, combining characters are displayed after the base character, not actually combined.


Are these limitations also present in later versions of Windows? Are there other limitations on Unicode in the Windows console?

Recommended answer


I wrote a partial answer in my answer to a different question; here is a good place for a full disclosure. My background: I maintain what is in all probability the most extensive console font which fully supports Windows (it is a very deep rewrite of Unifont with elements of DejaVu added).


I start with the limitations already mentioned in other answers:

  • Every cell contains 16 bits of character data. In other words: only UCS-2 code points are shown. (In particular, for a character outside the BMP, its "decomposition into UCS-2" is shown instead, using surrogate characters.)
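The 16-bit cell limit can be illustrated without a Windows console at all. The following is a minimal sketch, not the console's own code: it only shows which 16-bit units a string turns into when encoded as UTF-16, which is what ends up in the cells. The helper names `utf16_cells` and `is_surrogate` are made up for this illustration.

```python
def utf16_cells(text):
    """Return the 16-bit code units that `text` occupies in UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if a 16-bit unit is one half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF


bmp = utf16_cells("\u00e9")         # U+00E9, inside the BMP: one unit
astral = utf16_cells("\U0001F600")  # U+1F600, outside the BMP: two units

print([hex(u) for u in bmp])     # ['0xe9']
print([hex(u) for u in astral])  # ['0xd83d', '0xde00']
print(all(is_surrogate(u) for u in astral))  # True
```

A character outside the BMP therefore takes two cells, and each cell holds a surrogate that has no glyph of its own.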

  • Only simple text rendering is supported. Even if one uses TTF fonts, no advanced "features" of the font are considered by the console. Neither advanced typography (ligatures etc.), nor even composing glyphs for combining characters, nor right-to-left scripts¹⁾ (in an LtR environment) would work as expected.


    ¹⁾ It is the application which should rearrange the characters for a correct bidi-rendering.
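Since the console neither composes glyphs nor reorders text, an application can at least prepare its output before writing it. The sketch below is one possible approach under stated assumptions, not something the answer prescribes: it normalizes to NFC so that precomposed characters are used where Unicode defines them, reports combining marks that remain, and flags right-to-left characters that the application would have to reorder itself. The function names are hypothetical; only the standard `unicodedata` module is used.

```python
import unicodedata


def precompose(text):
    """Normalize to NFC so base + combining mark becomes one code point where possible."""
    return unicodedata.normalize("NFC", text)


def leftover_combining(text):
    """Combining marks that survive NFC; the console would draw these after the base."""
    return [c for c in precompose(text) if unicodedata.combining(c) != 0]


def has_rtl(text):
    """True if any character has a right-to-left bidi class (R or AL)."""
    return any(unicodedata.bidirectional(c) in ("R", "AL") for c in text)


decomposed = "e\u0301"  # e + COMBINING ACUTE ACCENT, has a precomposed form
print(len(decomposed), len(precompose(decomposed)))  # 2 1

no_precomposed = "q\u0301"  # Unicode has no precomposed q with acute
print(len(leftover_combining(no_precomposed)))  # 1

print(has_rtl("abc"), has_rtl("\u05d0"))  # False True
```

NFC only helps where a precomposed code point exists and the font has a glyph for it; sequences such as the second example still display as two separate cells.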

Other limitations are due to font filtering by the console. A font must be quite special to be accepted by the console (that is, to be shown in the font selection dialogue, and for this selection "to work"¹⁾).


    ¹⁾ I do not recall whether a font may be shown, but won’t be selectable (I have vague memory of this happening, but cannot trust this memory).


  • The font must be marked as monospaced. Due to expectations of applications,²⁾ such fonts must have all the glyphs of the same width.

    ²⁾ The latter condition is relevant only if one wants to use the font outside of the console. In principle, the console does not check the widths of the glyphs. However, every glyph is shown as if it had the "default width". In many (all?) situations only the part of the glyph inside the "default bounding box" is going to be shown. I could not find any trick to circumvent this limitation.

  • On non-East-Asian releases of Windows, the font cannot claim that it supports any of the 4 East Asian codepages.³⁾

    ³⁾ Note that this is only a limitation on what the font header claims: it is just 4 bits present in the header. The font may have glyphs for these languages present, and they would show fine, as long as the font does not claim the support. The codepages in question (in the OS/2⫽Charsets section of the header) are 932, 936, 949, 950 (JIS, Simplified Chinese, Korean Wansung, Traditional Chinese).
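The check on those 4 bits can be sketched as follows. This is a hedged illustration, not the console's actual filter: it assumes the bits live in the `ulCodePageRange1` field of the font's OS/2 table and sit at positions 17 to 20, as I recall them from the OpenType specification, so the positions should be verified against the spec. The function and constant names are hypothetical, and reading the field out of a font file is left out.

```python
# Assumed bit positions in OS/2.ulCodePageRange1 (verify against the OpenType spec).
EAST_ASIAN_CODEPAGE_BITS = {
    17: 932,  # JIS / Japan
    18: 936,  # Simplified Chinese
    19: 949,  # Korean Wansung
    20: 950,  # Traditional Chinese
}


def claimed_east_asian_codepages(code_page_range1):
    """Return the East Asian codepages that the header value claims to support."""
    return [
        codepage
        for bit, codepage in sorted(EAST_ASIAN_CODEPAGE_BITS.items())
        if code_page_range1 & (1 << bit)
    ]


latin_only = 1 << 0                # bit 0: Latin 1 (codepage 1252)
with_jis = latin_only | (1 << 17)  # same font, but also claiming codepage 932

print(claimed_east_asian_codepages(latin_only))  # []
print(claimed_east_asian_codepages(with_jis))    # [932]
```

By the rule described above, a font for which this list is non-empty would be filtered out on a non-East-Asian release of Windows, regardless of which glyphs it actually contains.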

  • Although the Windows console does not support the Underline attribute (except for DBCS codepages), the "Underline position" field of the font header is taken into account when the size of the on-screen character bbox is calculated. This may lead to an unexpected aspect ratio of the font, and/or to interruptions between glyphs which are expected to "join together".

  • The console is very picky about the replacement glyph for "unsupported characters". I could not find how to make such a glyph coexist with glyphs for U+0000 and/or U+0001. (If the console finds one of the latter two glyphs in a font, it ignores the replacement glyph.)


(This is a very obscure bug; it requires a very technical discussion.) Another problem with the replacement glyph is the character U+30FB ・ (WHY?!). If this character is present in the font, the glyph for this character is used as a replacement glyph — but only for missing characters in PUA!

That is basically it! I have not found any other limitations.
