How to print C++ wstring UTF-8 characters to Mac OS or Unix terminal?


Question

How can I print a std::wstring using std::wcout?

I tried the following, which was recommended here, but it works only for printing this ¡Hola! but not this 日本:

#include <iostream>
#include <clocale>
#include <locale>

int main(int argc, char* argv[])
{
  char* locale = setlocale(LC_ALL, ""); 
  std::cout << "locale: " << locale << std::endl; // "C" for me
  std::locale lollocale(locale);
  setlocale(LC_ALL, locale); 
  std::wcout.imbue(lollocale);
  std::wcout << L"¡Hola!" << std::endl; // ok
  std::wcout << L"日本" << std::endl;    // empty :(
  return 0;
}

Also the following (which was recommended here) does not print the Japanese characters at all:

#include <stdio.h>
#include <wchar.h>
#include <string>
#include <locale>
#include <iostream>

using namespace std;

int main()
{

        std::locale::global(std::locale(""));
        wstring japan = L"日本";
        wstring message = L"Welcome! Japan is ";

        message += japan;

        wprintf(message.c_str());
        wcout << message << endl;
}

All this is on Mac OS 10.6.8. using g++ 4.2.1, using Terminal 2.1.2.

The terminal can display the characters just fine in general, e.g., when I cat the source code. Also, this command works fine cout << "日本" << std::endl;, but I do need to print wstring.

My $LANG is set as follows:

$ echo $LANG 
en_US.UTF-8

Recommended answer

The way to print a wstring is to convert it to a UTF-8, char-based string. Seriously, wchar_t is pointless outside of Windows or one of the various other platform libraries that unfortunately adopted wchar_t before it became clear what a bad idea it is.

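// Needs a standard library that provides <codecvt>, e.g. clang with libc++;
// the libstdc++ shipped with g++ 4.2.1 does not have it.
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

int main() {
    // Converts between UTF-8 and UCS-4 (given sizeof(wchar_t) == 4).
    std::wstring_convert<std::codecvt_utf8<wchar_t>, wchar_t> convert;
    std::wstring s = L"日本";
    std::cout << convert.to_bytes(s);
}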


And just to explain what's going wrong in the code you show:

char* locale = setlocale(LC_ALL, ""); 
std::cout << "locale: " << locale << std::endl; // "C" for me

The locale string here is the locale name after applying changes. Since you say you get "C" it means you're using the "C" locale. Normally one would get a name like "en_US.UTF-8" but for whatever reason your environment isn't set up correctly for that. You show that $LANG is set correctly but perhaps one of the other locale environment variables is set differently.

In any case you're using the "C" locale, which is only required to support the basic character set. I believe on OS X the behavior you'll get is that any char will directly convert to the same wchar_t value, and only wchar_t values in the range supported by char will convert back. That's effectively the same as using an ISO 8859-1 based locale, so Japanese characters will not work.

If you really insist on getting this locale based stuff to work then you need to get an appropriate locale, one that uses UTF-8. You can either figure out what's wrong with your environment or you can use a non-portable, explicit locale name.

std::wcout.imbue(std::locale("en_US.UTF-8"));
std::wcout << L"¡Hola!\n";
std::wcout << L"日本\n";

Also, if you're using libstdc++ you should know that it doesn't support locales properly on OS X. You'll have to use libc++ in order for OS X's locale names (e.g., "en_US.UTF-8") to work.
