Convert wstring to string encoded in UTF-8

Problem description

I need to convert between wstring and string. I figured that using a codecvt facet should do the trick, but it doesn't seem to work for a UTF-8 locale.

My idea is that when I read a UTF-8 encoded file into chars, one non-ASCII character such as č is read into two chars (UTF-8 uses one to four bytes per code point, and these characters take two). I'd like to create such a UTF-8 string from the wstring representation, for a library I use in my code.
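To make the byte counts concrete, here is a minimal sketch of how a wide string maps to UTF-8 bytes. It is a hand-rolled encoder, not the codecvt facet from the question and not any library's API; the names `append_utf8` and `wstring_to_utf8` are hypothetical. It assumes `wchar_t` holds either UTF-32 code points (typical on Linux) or UTF-16 code units (typical on Windows).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: appends one Unicode code point as 1 to 4 UTF-8 bytes.
static void append_utf8(std::string& out, unsigned long cp) {
    assert(cp <= 0x10FFFFul);  // the caller validates the range
    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: e.g. U+010D (č)
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: rest of the BMP
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: supplementary planes
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: encodes a wide string as UTF-8.
// Assumption: wchar_t carries UTF-32 code points or UTF-16 code units.
std::string wstring_to_utf8(const std::wstring& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned long cp = static_cast<unsigned long>(in[i]);
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate (16-bit wchar_t): combine with the low surrogate.
            if (i + 1 >= in.size())
                throw std::invalid_argument("unpaired high surrogate");
            unsigned long lo = static_cast<unsigned long>(in[i + 1]);
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::invalid_argument("unpaired high surrogate");
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;
        } else if ((cp >= 0xDC00 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            throw std::invalid_argument("invalid code point");
        }
        append_utf8(out, cp);
    }
    return out;
}
```

Each of the six characters in "čřžýáí" lies between U+0080 and U+07FF, so each becomes two bytes and the whole string needs twelve bytes plus a terminator.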

Does anyone know how to do this?

Here is what I have tried:

  locale mylocale("cs_CZ.utf-8");
  mbstate_t mystate = mbstate_t();  // zero-initialize the conversion state

  wstring mywstring = L"čřžýáí";

  const codecvt<wchar_t,char,mbstate_t>& myfacet =
    use_facet<codecvt<wchar_t,char,mbstate_t> >(mylocale);

  codecvt<wchar_t,char,mbstate_t>::result myresult;  

  size_t length = mywstring.length();
  char* pstr= new char [length+1];

  const wchar_t* pwc;
  char* pc;

  // translate characters:
  myresult = myfacet.out (mystate,
      mywstring.c_str(), mywstring.c_str()+length+1, pwc,
      pstr, pstr+length+1, pc);

  if ( myresult == codecvt<wchar_t,char,mbstate_t>::ok )
   cout << "Translation successful: " << pstr << endl;
  else cout << "failed" << endl;
  return 0;

This prints "failed" for the cs_CZ.utf-8 locale and works correctly for the cs_CZ.iso8859-2 locale.

Recommended answer

Standard C++ has very limited built-in Unicode support. Use an external library such as ICU (UnicodeString class) or Qt (QString class); both support Unicode, including UTF-8.
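If staying with the standard codecvt facet from the question is preferred, one plausible explanation for the failure (an assumption, not confirmed by the source) is the output buffer: it holds `length+1` bytes, which is enough for a single-byte encoding such as ISO 8859-2 but not for UTF-8, where each of these characters needs two bytes. In that case `out()` would report `partial` rather than `ok`. The sketch below sizes the buffer from `max_length()` instead; the helper name `narrow_with_locale` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts `in` to the narrow encoding of `loc`
// using the locale's codecvt<wchar_t, char, mbstate_t> facet.
std::string narrow_with_locale(const std::wstring& in, const std::locale& loc) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& facet = std::use_facet<facet_type>(loc);

    // max_length() is the most bytes one wide character can produce.
    assert(facet.max_length() > 0);
    const std::size_t per_char = static_cast<std::size_t>(facet.max_length());

    // Worst-case buffer: every wide character expands to per_char bytes.
    std::vector<char> buf(in.size() * per_char + 1);

    std::mbstate_t state = std::mbstate_t();  // zero-initialized state
    const wchar_t* from_next = 0;
    char* to_next = 0;

    facet_type::result r = facet.out(state,
        in.data(), in.data() + in.size(), from_next,
        buf.data(), buf.data() + buf.size(), to_next);

    if (r != facet_type::ok)
        throw std::runtime_error("wide-to-narrow conversion failed");

    // Keep only the bytes that were actually written.
    return std::string(buf.data(), to_next);
}
```

With a UTF-8 locale installed, a call such as `narrow_with_locale(mywstring, std::locale("cs_CZ.utf-8"))` should then return the UTF-8 bytes; note that the `std::locale` constructor throws `std::runtime_error` if the named locale is not available on the system.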
