Compare std::wstring and std::string

Question

How can I compare a wstring, such as L"Hello", to a string? If I need to have the same type, how can I convert them into the same type?

Solution

Since you asked, here are my standard conversion functions between narrow and wide strings, implemented using the C++ std::string and std::wstring classes.

First off, make sure to start your program with a call to std::setlocale:

#include <clocale>

int main()
{
  std::setlocale(LC_CTYPE, "");  // before any string operations
}

Now for the functions. First off, getting a wide string from a narrow string:

#include <string>
#include <vector>
#include <iostream>  // std::cout, used for the error messages below
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <cerrno>

// Dummy overload
std::wstring get_wstring(const std::wstring & s)
{
  return s;
}

// Real worker
std::wstring get_wstring(const std::string & s)
{
  const char * cs = s.c_str();
  const size_t wn = std::mbsrtowcs(NULL, &cs, 0, NULL);

  if (wn == size_t(-1))
  {
    std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
    return L"";
  }

  std::vector<wchar_t> buf(wn + 1);
  const size_t wn_again = std::mbsrtowcs(buf.data(), &cs, wn + 1, NULL);

  if (wn_again == size_t(-1))
  {
    std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
    return L"";
  }

  assert(cs == NULL); // successful conversion

  return std::wstring(buf.data(), wn);
}

And going back, making a narrow string from a wide string. I call the narrow string "locale string", because it is in a platform-dependent encoding depending on the current locale:

// Dummy
std::string get_locale_string(const std::string & s)
{
  return s;
}

// Real worker
std::string get_locale_string(const std::wstring & s)
{
  const wchar_t * cs = s.c_str();
  const size_t wn = std::wcsrtombs(NULL, &cs, 0, NULL);

  if (wn == size_t(-1))
  {
    std::cout << "Error in wcsrtombs(): " << errno << std::endl;
    return "";
  }

  std::vector<char> buf(wn + 1);
  const size_t wn_again = std::wcsrtombs(buf.data(), &cs, wn + 1, NULL);

  if (wn_again == size_t(-1))
  {
    std::cout << "Error in wcsrtombs(): " << errno << std::endl;
    return "";
  }

  assert(cs == NULL); // successful conversion

  return std::string(buf.data(), wn);
}

Some notes:

  • If you don't have std::vector::data(), you can say &buf[0] instead.
  • I've found that the restartable ("r"-style) conversion functions mbsrtowcs and wcsrtombs don't work properly on Windows. There, you can use mbstowcs and wcstombs instead: mbstowcs(buf.data(), cs, wn + 1); and wcstombs(buf.data(), cs, wn + 1);. Note that these take the source pointer cs itself, not its address &cs.
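
The second note can be sketched as a separate function. This is only an illustration under stated assumptions: the name widen_with_mbstowcs is hypothetical and not part of the original answer, and the buffer is sized from the byte count of the input (the conversion never produces more wide characters than there are input bytes), so no length query is needed up front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not from the original answer): widen a narrow,
// locale-encoded string with the non-restartable std::mbstowcs.
// Returns an empty wide string if the input is invalid in the current locale.
std::wstring widen_with_mbstowcs(const std::string& s)
{
    // Every wide character consumes at least one input byte, so
    // s.size() wide characters plus a terminator is always enough room.
    std::vector<wchar_t> buf(s.size() + 1);

    const std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();  // invalid multibyte sequence

    assert(n < buf.size());  // the terminator always fits
    return std::wstring(buf.data(), n);
}
```

As with the functions above, the result depends on the locale selected by std::setlocale, and the conversion stops at the first embedded null byte because it reads from c_str().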


In response to your question, if you want to compare two strings, you can convert both of them to wide strings and then compare those. If you are reading a file from disk which has a known encoding, you should use iconv() to convert the file from your known encoding to wchar_t (the target encoding that GNU iconv implementations name "WCHAR_T") and then compare the result with the wide string.
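
A minimal sketch of the comparison itself, assuming the narrow string is encoded in the current locale. The function name equals_wide is hypothetical, and treating an undecodable narrow string as "not equal" is a design choice made for this example, not something the answer prescribes. It uses mbsrtowcs with an explicit conversion state; on Windows the note above suggests swapping in mbstowcs.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: true if `narrow`, decoded in the current C locale,
// yields exactly the same wide characters as `wide`.
bool equals_wide(const std::string& narrow, const std::wstring& wide)
{
    const char* src = narrow.c_str();
    std::mbstate_t state = std::mbstate_t();  // initial conversion state

    // The decoded text never has more wide characters than input bytes.
    std::vector<wchar_t> buf(narrow.size() + 1);

    const std::size_t n = std::mbsrtowcs(buf.data(), &src, buf.size(), &state);
    if (n == static_cast<std::size_t>(-1))
        return false;  // undecodable input is treated as "not equal"

    assert(src == nullptr);  // the whole input was consumed
    return std::wstring(buf.data(), n) == wide;
}
```

A call such as equals_wide(some_string, L"Hello") then compares element by element, which is the same comparison std::wstring::operator== performs.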

Beware, though, that complex Unicode text may have multiple different representations as code point sequences which you may want to consider equal. If that is a possibility, you need to use a higher-level Unicode processing library (such as ICU) and normalize your strings to some common, comparable form.
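
To make the caveat concrete, here is a small illustration (the function names are invented for this example). The letter "é" can be written as the single code point U+00E9 or as "e" followed by the combining acute accent U+0301. Both render the same, but a plain wide-string comparison sees two different sequences. Making them compare equal requires normalization, which the standard library does not provide.

```cpp
#include <cassert>
#include <string>

// U+00E9: "e with acute accent" as one precomposed code point.
std::wstring precomposed_e_acute()
{
    return std::wstring(1, static_cast<wchar_t>(0x00E9));
}

// U+0065 U+0301: a plain "e" followed by a combining acute accent.
std::wstring decomposed_e_acute()
{
    std::wstring s(1, L'e');
    s.push_back(static_cast<wchar_t>(0x0301));
    return s;
}

// Sanity checks: the two spellings differ as code point sequences.
void check_spellings_differ()
{
    assert(precomposed_e_acute().size() == 1);
    assert(decomposed_e_acute().size() == 2);
    assert(precomposed_e_acute() != decomposed_e_acute());
}
```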
