Is this a bug (Windows API)?

Question

I asked a question about string normalization earlier and it was answered, but I still cannot correctly normalize Korean characters that require 3 keystrokes.
With the input "ㅁㅜㄷ" (from keystrokes "ane"), the result is "무ㄷ" instead of "묻".
With the input "ㅌㅐㅇ" (from keystrokes "xod"), the result is "태ㅇ" instead of "탱".

This is Mr. Dean's answer. It worked on the example I gave at first, but it doesn't work with the ones I cited above.

If you are using .NET, the following will work:

var s = "ㅌㅐㅇ";
s = s.Normalize(NormalizationForm.FormKC);

In native Win32, the corresponding call is NormalizeString:

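const wchar_t *input = L"ㅌㅐㅇ";  // wide literal: NormalizeString takes UTF-16 text
wchar_t output[100];
// -1: input is null-terminated; 100: capacity of output in characters
NormalizeString(NormalizationKC, input, -1, output, 100);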

NormalizeString is only available in Windows Vista+. You need the "Microsoft Internationalized Domain Name (IDN) Mitigation APIs" installed if you want to use it on XP (why it's in the IDN download, I don't understand...)

Note that neither of these methods actually requires use of the IME - they work regardless of whether you've got the Korean IME installed or not.

This is the code I'm using in Delphi (on XP):

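      const
        NORMALIZATIONKC = 5;
      var
        buf: array [0..20] of WideChar; // WideChar: NormalizeString writes UTF-16 text
        temporary: PWideChar;
      ...
      temporary := 'ㅌㅐㅇ';
      // -1: input is null-terminated; last argument is the buffer capacity in characters
      NormalizeString(NORMALIZATIONKC, temporary, -1, buf, Length(buf));
      ShowMessage(buf);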

Is this a bug? Is there something incorrect in my code? Does the code run correctly on your computer? In what language? Which Windows version are you using?

Recommended answer

The jamo you're using (ㅌㅐㅇ) are in the block called Hangul Compatibility Jamo, which is present due to legacy code pages. If you were to take your target character and decompose it (using NFKD), you would get jamo from the block Hangul Jamo (ᄐ ᅢ ᆼ, sans the spaces, which are just there to prevent the browser from normalizing it), and these can be re-composed just fine.

Unicode 5.2 states:

When Hangul compatibility jamo are transformed with a compatibility normalization form, NFKD or NFKC, the characters are converted to the corresponding conjoining jamo characters.

(...)

Table 12-11 illustrates how two Hangul compatibility jamo can be separated in display, even after transforming them with NFKD or NFKC.

This suggests that NFKC should combine them correctly by treating them as regular Jamo, but Windows doesn't appear to be doing that. However, using NFKD does appear to convert them to the normal Jamo, and you can then run NFKC on it to get the right character.

Since those characters appear to come from an external program (the IME), I would suggest you either do a manual pass to convert those compatibility Jamo, or start by doing NFKD, then NFKC. Alternatively, you may be able to reconfigure the IME to output "normal" Jamo instead of compatibility Jamo.
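
For the "manual pass" option, here is a rough sketch in plain C of what such a conversion could look like. It does not call NormalizeString at all: it maps each compatibility jamo to its leading, vowel, or trailing index and builds the precomposed syllable arithmetically from U+AC00. The function name compose_compat_jamo and the helper tables are my own, not part of any Windows API. The sketch also assumes that a consonant directly after a consonant+vowel pair is a final consonant unless another vowel follows it. Double finals or compound vowels typed as separate keystrokes are not merged.

```c
#include <stddef.h>
#include <wchar.h>

/* Hangul Compatibility Jamo: consonants U+3131..U+314E, vowels U+314F..U+3163. */
#define COMPAT_CONS_FIRST  0x3131
#define COMPAT_CONS_LAST   0x314E
#define COMPAT_VOWEL_FIRST 0x314F
#define COMPAT_VOWEL_LAST  0x3163

/* Leading-consonant index (0..18) per compatibility consonant, -1 if it cannot lead. */
static const signed char LEAD_INDEX[30] = {
     0,  1, -1,  2, -1, -1,  3,  4,  5, -1,
    -1, -1, -1, -1, -1, -1,  6,  7,  8, -1,
     9, 10, 11, 12, 13, 14, 15, 16, 17, 18
};

/* Trailing-consonant index (1..27) per compatibility consonant, 0 if it cannot trail. */
static const signed char TRAIL_INDEX[30] = {
     1,  2,  3,  4,  5,  6,  7,  0,  8,  9,
    10, 11, 12, 13, 14, 15, 16, 17,  0, 18,
    19, 20, 21, 22,  0, 23, 24, 25, 26, 27
};

static int lead_index(wchar_t c)
{
    if (c < COMPAT_CONS_FIRST || c > COMPAT_CONS_LAST) return -1;
    return LEAD_INDEX[c - COMPAT_CONS_FIRST];
}

static int trail_index(wchar_t c)
{
    if (c < COMPAT_CONS_FIRST || c > COMPAT_CONS_LAST) return 0;
    return TRAIL_INDEX[c - COMPAT_CONS_FIRST];
}

static int vowel_index(wchar_t c)
{
    if (c < COMPAT_VOWEL_FIRST || c > COMPAT_VOWEL_LAST) return -1;
    return (int)(c - COMPAT_VOWEL_FIRST); /* same order as the 21 medial vowels */
}

/* Hypothetical helper: copies `in` to `out`, replacing consonant+vowel(+consonant)
 * runs of compatibility jamo with precomposed syllables. Writes at most
 * out_cap - 1 characters plus a terminator and returns the number written. */
size_t compose_compat_jamo(const wchar_t *in, wchar_t *out, size_t out_cap)
{
    size_t n = 0;
    if (out_cap == 0) return 0;

    while (*in != L'\0' && n + 1 < out_cap) {
        int l = lead_index(in[0]);
        int v = (l >= 0) ? vowel_index(in[1]) : -1;

        if (l >= 0 && v >= 0) {
            int t = trail_index(in[2]);
            /* A consonant followed by a vowel starts the next syllable instead. */
            if (t > 0 && vowel_index(in[3]) >= 0) t = 0;
            out[n++] = (wchar_t)(0xAC00 + (l * 21 + v) * 28 + t);
            in += (t > 0) ? 3 : 2;
        } else {
            out[n++] = *in++; /* anything else is copied through unchanged */
        }
    }
    out[n] = L'\0';
    return n;
}
```

Calling compose_compat_jamo on the wide string ㅌㅐㅇ with a buffer of, say, 16 wchar_t should leave the single syllable 탱 in the buffer, and ㅁㅜㄷ should become 묻. The same tables port directly to Delphi if you prefer to keep everything in one language.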
