Displaying wide chars with printf


Question

I'm trying to understand how printf works with wide characters (wchar_t).

I wrote the following code sample:

#include <stdio.h>
#include <stdlib.h>

int     main(void)
{
    wchar_t     *s;

    s = (wchar_t *)malloc(sizeof(wchar_t) * 2);
    s[0] = 42;
    s[1] = 0;
    printf("%ls\n", s);
    free(s);
    return (0);
}

Output:

*

Everything is fine here: my character (*) is displayed correctly.

I wanted to display another kind of character. On my system, wchar_t seems to be encoded on 4 bytes, so I tried to display the following character: É

#include <stdio.h>
#include <stdlib.h>

int     main(void)
{
    wchar_t     *s;

    s = (wchar_t *)malloc(sizeof(wchar_t) * 2);
    s[0] = 0xC389;
    s[1] = 0;
    printf("%ls\n", s);
    free(s);
    return (0);
}

But this time there is no output. I tried many values from the "encoding" section (cf. previous link) for s[0] (0xC389, 201, 0xC9), but the É character is never displayed. I also tried %S instead of %ls.

If I call printf like this: printf("<%ls>\n", s), the only character printed is '<'; the output is truncated.

Why do I have this problem? What should I do?

Recommended answer

Why do I have this problem?

Make sure you check errno and the return value of printf!

#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

int main(void)
{
    wchar_t *s;
    s = (wchar_t *) malloc(sizeof(wchar_t) * 2);
    s[0] = 0xC389;
    s[1] = 0;

    if (printf("%ls\n", s) < 0) {
        perror("printf");
    }

    free(s);
    return (0);
}

See the output:

$ gcc test.c && ./a.out
printf: Invalid or incomplete multibyte or wide character

How to fix

First of all, the default locale of a C program is C (also known as POSIX), which is ASCII-only. You need to add a call to setlocale, specifically setlocale(LC_ALL, "").

If your LC_ALL, LC_CTYPE and LANG environment variables do not select a UTF-8 locale (for example because they are unset), you'll have to select a locale explicitly. setlocale(LC_ALL, "C.UTF-8") works on many systems: the C locale is standard, and a UTF-8 variant of it is commonly provided, although it is not guaranteed to exist.

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <wchar.h>

int main(void)
{
    wchar_t *s;
    s = (wchar_t *) malloc(sizeof(wchar_t) * 2);
    s[0] = 0xC389;
    s[1] = 0;

    setlocale(LC_ALL, "");

    if (printf("%ls\n", s) < 0) {
        perror("printf");
    }

    free(s);
    return (0);
}

See the output:

$ gcc test.c && ./a.out
쎉

The wrong character is printed because wchar_t represents a wide character (such as a UTF-32 code unit), not a multibyte character (such as UTF-8). 0xC389 is the UTF-8 byte sequence for É (C3 89), but stored in a wchar_t it is interpreted as the code point U+C389, which is the Hangul syllable 쎉. Note that wchar_t is always 32 bits wide in the GNU C Library, but the C standard doesn't require it to be. If you initialize the character with its Unicode code point (U+00C9, i.e. 0x000000C9 as a 32-bit value), it prints correctly:

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>
#include <wchar.h>

int main(void)
{
    wchar_t *s;
    s = (wchar_t *) malloc(sizeof(wchar_t) * 2);
    s[0] = 0xC9;
    s[1] = 0;

    setlocale(LC_ALL, "");

    if (printf("%ls\n", s) < 0) {
        perror("printf");
    }

    free(s);
    return (0);
}

Output:

$ gcc test.c && ./a.out
É

Note that you can also set the LC (locale) environment variables from the command line:

$ export LC_ALL=C.UTF-8
$ ./a.out
É
