wcstombs: character encoding?


Question

The wcstombs documentation says it "converts the sequence of wide-character codes to multibyte string", but it never says what a "wide character" is.

Is the encoding implicit (for example, does it convert UTF-16 to UTF-8), or is the conversion defined by some environment variable?

Also, what is the typical use case for wcstombs?

Recommended answer

You use the standard setlocale() function with the LC_CTYPE (or LC_ALL) category to set the mapping the library uses between wchar_t characters and multibyte characters. The locale names accepted by setlocale() are implementation-defined, so you'll need to look them up in your compiler's documentation.
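As a rough illustration of the typical use case (handing wide text to an API that expects a char string), here is a minimal sketch. The helper name wide_to_multibyte is hypothetical and not part of the C library; it assumes the caller has already selected the desired LC_CTYPE mapping with setlocale().

```c
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper (not a library function): converts a wide string to a
   newly allocated multibyte string using the current LC_CTYPE mapping.
   Returns NULL if a character has no multibyte representation in the
   current locale or if allocation fails. The caller frees the result. */
char *wide_to_multibyte(const wchar_t *wide)
{
    /* Upper bound: each wide character needs at most MB_CUR_MAX bytes,
       plus one byte for the terminating null. */
    size_t capacity = wcslen(wide) * MB_CUR_MAX + 1;
    char *out = malloc(capacity);
    if (out == NULL)
        return NULL;

    size_t written = wcstombs(out, wide, capacity);

    /* (size_t)-1 signals an unconvertible character; written == capacity
       would mean the output was not null-terminated. */
    if (written == (size_t)-1 || written >= capacity) {
        free(out);
        return NULL;
    }
    return out;
}
```

Calling setlocale(LC_CTYPE, "") before the helper selects the user's environment locale, which on POSIX systems is derived from environment variables such as LANG and LC_ALL. Without any setlocale() call the program runs in the "C" locale, where only a minimal character set is guaranteed to convert.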

For example, with MSVC you might use

setlocale( LC_ALL, ".1252" );

to set the C runtime to use code page 1252 as the multibyte character set. Note that the MSVC docs explicitly state that the locale cannot be set to UTF-7 or UTF-8 for the multibyte character set:


The set of available languages, country/region codes, and code pages includes all those supported by the Win32 NLS API except code pages that require more than two bytes per character, such as UTF-7 and UTF-8. If you provide a code page like UTF-7 or UTF-8, setlocale will fail, returning NULL.
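Since setlocale() reports a rejected name by returning NULL, it is worth checking the result rather than assuming the call succeeded. The sketch below tries a list of candidate names in order. The helper name set_first_available_ctype is hypothetical, and which names are accepted depends entirely on the implementation.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper (not a library function): tries each candidate locale
   name in order for the LC_CTYPE category. Returns the string setlocale()
   reports for the first name it accepts, or NULL if every candidate is
   rejected (or the list is empty). */
const char *set_first_available_ctype(const char *const candidates[], size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        const char *applied = setlocale(LC_CTYPE, candidates[i]);
        if (applied != NULL)
            return applied;   /* this name was accepted */
    }
    return NULL;              /* nothing accepted; locale left unchanged */
}
```

A caller might pass something like { ".1252", "C" } on MSVC, so the program falls back to the "C" locale when the preferred code page is unavailable.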

The "wide-character" wchar_t type is intended to be able to represent any character set the system supports. The standard doesn't define the size of wchar_t (it could be as small as a char or as large as any of the wider integer types). On Windows it holds the system's "internal" Unicode encoding, which is UTF-16 (UCS-2 before WinXP). Honestly, though, I can't find a direct quote on that in the MSVC docs. Strictly speaking, the implementation should document this, but I couldn't find where it does.
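Because the width of wchar_t is left to the implementation, one quick way to see what a given compiler uses is to measure it. This is only a sketch, and the helper names wchar_bits and print_wchar_width are hypothetical.

```c
#include <limits.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: number of bits in one wchar_t on this implementation. */
size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Hypothetical helper: prints the width so builds on different platforms
   can be compared. */
void print_wchar_width(void)
{
    printf("wchar_t is %u bits wide\n", (unsigned)wchar_bits());
}
```

On Windows with MSVC this commonly reports 16 bits, while many Linux toolchains use a 32-bit wchar_t, which is one reason code that assumes a particular wide encoding is not portable.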
