iostreams - Print `wchar_t` or `charXX_t` value as a character

Question

If you feed a wchar_t, char16_t, or char32_t value to a narrow ostream, it will print the numeric value of the code point.

#include <iostream>
using std::cout;
int main()
{
    cout << 'x' << L'x' << u'x' << U'x' << '\n';
}

prints x120120120. This is because there is an operator<< for the specific combination of basic_ostream with its charT, but there aren't analogous operators for the other character types, so they get silently converted to int and printed that way. Similarly, non-narrow string literals (L"x", u"x", U"X") will be silently converted to void* and printed as the pointer value, and non-narrow string objects (wstring, u16string, u32string) won't even compile.

So, the question: What is the least awful way to print a wchar_t, char16_t, or char32_t value on a narrow ostream, as the character, rather than as the numeric value of the codepoint? It should correctly convert all codepoints that are representable in the encoding of the ostream, to that encoding, and should report an error when the codepoint is not representable. (For instance, given u'…' and a UTF-8 ostream, the three-byte sequence 0xE2 0x80 0xA6 should be written to the stream; but given u'â' and a KOI8-R ostream, an error should be reported.)

Similarly, how can one print a non-narrow C-string or string object on a narrow ostream, converting to the output encoding?

If this can't be done within ISO C++11, I'll take platform-specific answers.

(Inspired by this question.)

Recommended answer

As you noted, there is no operator<<(std::ostream&, wchar_t) for a narrow ostream. If you want to keep the << syntax, however, you can teach ostream how to handle wchar_t values, so that your routine is picked as a better overload than the one that first requires a conversion to an integer.

If you are feeling adventurous:

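#include <ostream>
#include <stdexcept>

// Caution: adding overloads to namespace std is formally undefined behavior.
namespace std {
  ostream& operator<< (ostream& os, wchar_t wc) {
    if(unsigned(wc) < 256) // or another upper bound
      return os << (unsigned char)wc;
    // or handle the error in some other way
    throw range_error("wchar_t value is not representable on this stream");
  }
}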

Otherwise, make a simple struct that transparently encompasses a wchar_t and has a custom friend operator<< and convert your wide characters to that before outputting them.

Edit: To convert on the fly to the encoding of the current locale, you can use the functions from <cwchar>, like:

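#include <cstdlib>   // MB_CUR_MAX
#include <cwchar>    // std::wcrtomb, std::mbstate_t
#include <ostream>
#include <string>

std::ostream& operator<< (std::ostream& os, wchar_t wc) {
    std::mbstate_t state{};
    std::string mb(MB_CUR_MAX, '\0');
    std::size_t ret = std::wcrtomb(&mb[0], wc, &state);
    if(ret == static_cast<std::size_t>(-1)) {
        os.setstate(std::ios_base::failbit); // or deal with the error some other way
        return os;
    }
    mb.resize(ret); // keep only the bytes wcrtomb actually wrote
    return os << mb;
}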

Don't forget to set your locale to the system default:

std::locale::global(std::locale(""));
std::cout << L'ŭ';
