iostreams - Print `wchar_t` or `charXX_t` value as a character
Question
If you feed a `wchar_t`, `char16_t`, or `char32_t` value to a narrow ostream, it will print the numeric value of the code point.
#include <iostream>
using std::cout;
int main()
{
    cout << 'x' << L'x' << u'x' << U'x' << '\n';
}
This prints `x120120120`. This is because there is an `operator<<` for the specific combination of `basic_ostream` with its `charT`, but there aren't analogous operators for the other character types, so they get silently converted to `int` and printed that way. Similarly, non-narrow string literals (`L"x"`, `u"x"`, `U"X"`) will be silently converted to `void*` and printed as the pointer value, and non-narrow string objects (`wstring`, `u16string`, `u32string`) won't even compile.
So, the question: What is the least awful way to print a `wchar_t`, `char16_t`, or `char32_t` value on a narrow ostream as the character, rather than as the numeric value of the code point? It should correctly convert every code point that is representable in the encoding of the ostream to that encoding, and should report an error when the code point is not representable. (For instance, given `u'…'` and a UTF-8 ostream, the three-byte sequence 0xE2 0x80 0xA6 should be written to the stream; but given `u'â'` and a KOI8-R ostream, an error should be reported.)
Similarly, how can one print a non-narrow C-string or string object on a narrow ostream, converting to the output encoding?
If this can't be done within ISO C++11, I'll take platform-specific answers.
(Inspired by this question.)
Recommended answer
As you noted, there is no `operator<<(std::ostream&, const wchar_t)` for a narrow ostream. If you want to keep the `<<` syntax, however, you can teach `ostream` how to handle `wchar_t`s, so that your routine is picked as a better overload than the one that first requires a conversion to an integer.
If you are feeling adventurous:
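// Needs <ostream> and <stdexcept>. Adding overloads to namespace std is
// formally undefined behavior, hence "adventurous".
namespace std {
    ostream& operator<<(ostream& os, wchar_t wc) {
        if (unsigned(wc) < 256)  // or another upper bound
            return os << (unsigned char)wc;
        // or handle the error in some other way
        throw range_error("wchar_t value not representable");
    }
}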
Otherwise, make a simple `struct` that transparently wraps a `wchar_t` and has a custom `friend operator<<`, and convert your wide characters to that type before outputting them.
To convert on the fly to the encoding of the current locale, you can use the functions from `<cwchar>`, for example:
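// Needs <cstdlib> (MB_CUR_MAX), <cwchar> (std::wcrtomb), <ostream>, <string>
std::ostream& operator<<(std::ostream& os, wchar_t wc) {
    std::mbstate_t state{};
    std::string mb(MB_CUR_MAX, '\0');
    std::size_t ret = std::wcrtomb(&mb[0], wc, &state);
    if (ret == static_cast<std::size_t>(-1)) {
        // not representable; or handle the error in some other way
        os.setstate(std::ios_base::failbit);
        return os;
    }
    mb.resize(ret);  // keep only the bytes actually produced
    return os << mb;
}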
Don't forget to set your locale to the system default:
std::locale::global(std::locale(""));
std::cout << L'ŭ';
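The same `std::wcrtomb` approach can be stretched to cover the wide-string half of the question. The sketch below is only one way to do it: `to_multibyte` is a hypothetical helper name, it converts to the encoding of the current C locale (not to a locale imbued on a particular stream), and it assumes a stateless encoding such as UTF-8.

```cpp
#include <climits>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a wide string to the multibyte encoding of
// the current C locale, one character at a time.
// Assumption: the encoding is stateless (e.g. UTF-8); a stateful encoding
// would also need a final reset sequence after the loop.
std::string to_multibyte(const std::wstring& ws) {
    std::mbstate_t state{};  // conversion state shared across characters
    std::string out;
    char buf[MB_LEN_MAX];    // large enough for any single character
    for (wchar_t wc : ws) {
        std::size_t n = std::wcrtomb(buf, wc, &state);
        if (n == static_cast<std::size_t>(-1))
            throw std::range_error("character not representable in this locale");
        out.append(buf, n);  // append only the bytes actually produced
    }
    return out;
}
```

With the locale set as above, `std::cout << to_multibyte(L"...")` writes the converted bytes, and an unrepresentable character surfaces as a `std::range_error`.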