Should I use wchar_t when using UTF-8?

Question

UTF-8 encodes a character in 1 to 4 bytes. A single char on my system is 1 byte. Should I use wchar_t as a precaution so that I can fit any arbitrary UTF-8 encoded character?

Recommended answer

No, you should not! The Unicode 4.0 standard (ISO 10646:2003) notes that:

The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text.
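To see what the quoted passage means on a given toolchain, the width of wchar_t can be checked directly. The following is only a small sketch; the helper names wchar_bits and wchar_holds_any_code_point are made up for illustration and are not part of any library. Common results are 16 bits with Microsoft's compilers on Windows and 32 bits with glibc on Linux, which is exactly the portability problem the standard warns about.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t in bits on this compiler. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: 1 if wchar_t can represent the highest Unicode
   code point (U+10FFFF), 0 otherwise. WCHAR_MAX comes from <wchar.h>. */
int wchar_holds_any_code_point(void)
{
    return WCHAR_MAX >= 0x10FFFFL;
}
```

If wchar_holds_any_code_point() returns 0, a single wchar_t cannot hold every decoded character on that platform, so it offers no safety margin over a fixed-width type.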

Under most circumstances, the "character nature" of UTF-8 text will not be relevant to your program, so treating it as an array of char elements, just like any other string, will be sufficient. If you need to extract individual characters, though, store them in a type wide enough to hold every Unicode code point. The highest code point, U+10FFFF, needs 21 bits, so in practice that means a 32-bit type such as uint32_t.
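As an illustration of extracting individual characters, here is a minimal sketch of a decoder that reads one UTF-8 sequence from a byte buffer into a uint32_t. The function name utf8_decode and its return convention (bytes consumed, or 0 on error) are assumptions made for this example, not an existing API. It rejects truncated input, overlong encodings, surrogates, and values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with n
   bytes available. On success, stores the code point in *out and returns
   the number of bytes consumed (1 to 4). Returns 0 if the input is empty,
   truncated, or not well-formed UTF-8. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (n == 0) return 0;

    uint32_t cp;   /* code point being assembled */
    uint32_t min;  /* smallest value allowed for this sequence length */
    size_t len;    /* total length of the sequence in bytes */
    unsigned char b0 = s[0];

    if (b0 < 0x80)                { cp = b0;        len = 1; min = 0; }
    else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }
    else return 0;  /* stray continuation byte or invalid lead byte */

    if (n < len) return 0;  /* sequence is cut off */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (cp < min) return 0;                      /* overlong encoding */
    if (cp > 0x10FFFF) return 0;                 /* beyond Unicode range */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate, not a character */

    *out = cp;
    return len;
}
```

The text itself stays in an ordinary char array. Only the decoded value needs the wider type, and a caller can walk a string by advancing its pointer by the returned byte count after each call.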
