Why does wprintf separate Unicode ligature into two different graphemes?

Views: 160  Posted: 2016/8/23 11:44:19  Tags: c unicode wchar ligature

Question

Code:

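#include <stdio.h>
#include <wchar.h>
#define USE_W /* remove this line to take the printf branch instead */
int main(void)
{
#ifdef USE_W
    /* Wide string: U+00E6 (the ae ligature) followed by plain ASCII. */
    const wchar_t *ae_utf16 = L"\x00E6 & ASCII text ae\n";
    /* Pass the text as an argument rather than as the format string. */
    wprintf(L"%ls", ae_utf16);
#else
    /* Narrow string: the same character as the UTF-8 bytes C3 A6. */
    const char *ae_utf8 = "\xC3\xA6 & ASCII text ae\n";
    printf("%s", ae_utf8);
#endif
    return 0;
}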

Output:

ae & ASCII text ae

While printf produces correct UTF-8 output:

æ & ASCII text ae

You can test this here.

Recommended answer

printf just sends raw bytes to your terminal; it does not know anything about encodings. If your terminal happens to be configured to interpret that as UTF-8, it will show the right characters.

wprintf, on the other hand, does know about encodings. It behaves as though it uses the function wcrtomb, which encodes a wide character (wchar_t) into a multibyte sequence, depending on the current locale. If the default locale happens to be "C", which is quite minimalistic, the character æ gets converted to the "more or less equivalent" byte sequence ae.

If you set the locale explicitly to something using UTF-8, like "en_US.UTF-8", the output is as expected. Of course, the set of supported locales differs per system, so it's no good to hardcode this.
