Understanding and writing wchar_t in C


Question

I'm currently rewriting (part of) the printf() function for a school project. Overall, we are required to reproduce the function's behaviour with several flags, conversions, length modifiers ...

The only thing I have left to do, and the part I'm stuck on, is the %C / %S conversions (or %lc / %ls).

So far, I've gathered that wchar_t is a type that can store a character in more than one byte, in order to accept more characters or symbols and therefore be compatible with pretty much every language, regardless of its alphabet and special characters.

However, I wasn't able to find any concrete information on what a wchar_t looks like to the machine, its actual length (which apparently varies with several factors, including the compiler and the OS), or how to actually write one.

Thanks in advance.

Note that we are limited in the functions we are allowed to use. The only allowed functions are write(), malloc(), free(), and exit(). We must be able to code any other required function ourselves.

To sum up, what I'm asking for is some information on how to interpret and write any wchar_t character "manually", with as little code as possible, so that I can try to understand the whole process and code it myself.

Recommended answer

A wchar_t is similar to a char in the sense that it is a number, but when displaying a char or wchar_t we don't want to see the number; we want to see the drawn character corresponding to that number. The mapping from numbers to characters is defined by neither char nor wchar_t; it depends on the system. So in the end there is no difference in how char and wchar_t are used, except for their sizes.

Given the above, the most trivial implementation of printf("%ls") is one where you know which encodings the system uses for char and wchar_t. For example, on my system, char has 8 bits and is encoded as UTF-8, while wchar_t has 32 bits and is encoded as UTF-32. So the printf implementation just converts from UTF-32 to UTF-8 and outputs the result.
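To make the UTF-32 to UTF-8 step concrete, here is a minimal sketch that uses nothing but write(). It assumes, as on the system described above, that each wchar_t holds one Unicode code point (UTF-32) and that the output should be UTF-8; on a platform where wchar_t is 16 bits wide this would not be sufficient. The names utf8_encode, put_wchar and put_wstr are hypothetical helpers chosen for illustration, not part of any library.

```c
#include <stddef.h>
#include <unistd.h>
#include <wchar.h>

/* Hypothetical helper: encode one code point as UTF-8 into buf
   (which must hold at least 4 bytes). Returns the number of bytes
   produced, or 0 if cp is not a valid Unicode scalar value. */
size_t utf8_encode(unsigned long cp, unsigned char *buf)
{
    if (cp < 0x80) {                      /* 0xxxxxxx */
        buf[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        buf[0] = (unsigned char)(0xC0 | (cp >> 6));
        buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        buf[0] = (unsigned char)(0xE0 | (cp >> 12));
        buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx + three 10xxxxxx */
        buf[0] = (unsigned char)(0xF0 | (cp >> 18));
        buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}

/* Hypothetical %lc helper: returns bytes written, or -1 on error.
   Assumption: wchar_t holds a UTF-32 code point. */
int put_wchar(int fd, wchar_t wc)
{
    unsigned char buf[4];
    size_t n = utf8_encode((unsigned long)wc, buf);

    if (n == 0)
        return -1;
    return (int)write(fd, buf, n);
}

/* Hypothetical %ls helper: returns total bytes written, or -1 on error. */
int put_wstr(int fd, const wchar_t *ws)
{
    int total = 0;

    for (; *ws != L'\0'; ws++) {
        int n = put_wchar(fd, *ws);
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

The returned byte count matters for a printf() reimplementation, because widths and the final return value are counted in bytes, not in wide characters. Partial writes are not retried in this sketch.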

A more general implementation must support different, configurable encodings and may need to inspect what the current encoding is. In that case, functions such as wcsnrtombs() or iconv() must be used.
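As a rough illustration of the more general route, the sketch below lets the C library pick the output encoding from the current locale. It uses wcrtomb(), the standard per-character relative of wcsnrtombs(), rather than wcsnrtombs() itself, so that no intermediate buffer has to be sized. These library calls fall outside the restrictions of the school project in the question, so this only applies where they are allowed. The name put_wstr_locale is a hypothetical helper, and the sketch ignores the closing shift sequence that stateful encodings would need.

```c
#include <limits.h>
#include <string.h>
#include <unistd.h>
#include <wchar.h>

/* Hypothetical helper: write a wide string to fd using the multibyte
   encoding of the current locale (LC_CTYPE). Returns the number of
   bytes written, or -1 on a conversion or write error. */
long put_wstr_locale(int fd, const wchar_t *ws)
{
    mbstate_t state;
    char buf[MB_LEN_MAX];   /* large enough for any single character */
    long total = 0;

    memset(&state, 0, sizeof state);      /* initial conversion state */
    for (; *ws != L'\0'; ws++) {
        size_t n = wcrtomb(buf, *ws, &state);

        if (n == (size_t)-1)
            return -1;      /* character not representable in this locale */
        if (write(fd, buf, n) != (ssize_t)n)
            return -1;
        total += (long)n;
    }
    return total;
}
```

A program starts in the "C" locale, so a caller would normally run setlocale(LC_CTYPE, "") first (from <locale.h>) to pick up the user's encoding; without that, characters outside the basic character set may fail to convert.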
