Compare std::wstring and std::string
How can I compare a wstring, such as L"Hello", to a string? If I need to have the same type, how can I convert them into the same type?
Since you asked, here are my standard conversion functions between narrow strings and wide strings, implemented using the C++ std::string and std::wstring classes.
First off, make sure to start your program with a call to std::setlocale:
#include <clocale>
int main()
{
    std::setlocale(LC_CTYPE, ""); // before any string operations
}
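Since std::setlocale returns a null pointer when the requested locale cannot be installed, it may be worth checking the result at startup. The following is only a minimal sketch; the helper name init_ctype_locale is hypothetical and not part of the answer above:

```cpp
#include <clocale>
#include <cstdio>

// Hypothetical helper: selects the user's preferred LC_CTYPE locale and
// reports whether that worked. On failure the previous locale stays active.
bool init_ctype_locale()
{
    if (std::setlocale(LC_CTYPE, "") == nullptr)
    {
        std::fputs("warning: could not set LC_CTYPE from the environment\n", stderr);
        return false;
    }
    return true;
}
```

If the call fails, the program keeps running in whatever locale was active before (the "C" locale by default), so conversions of non-ASCII text are likely to fail later.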
Now for the functions. First off, getting a wide string from a narrow string:
#include <string>
#include <vector>
#include <iostream>
#include <cassert>
#include <cstdlib>
#include <cwchar>
#include <cerrno>
// Dummy overload
std::wstring get_wstring(const std::wstring & s)
{
    return s;
}
// Real worker
std::wstring get_wstring(const std::string & s)
{
    const char * cs = s.c_str();
    // First pass: a null destination only counts the wide characters needed
    const size_t wn = std::mbsrtowcs(NULL, &cs, 0, NULL);
    if (wn == size_t(-1))
    {
        std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
        return L"";
    }
    // Second pass: convert into a buffer with room for the terminator
    std::vector<wchar_t> buf(wn + 1);
    const size_t wn_again = std::mbsrtowcs(buf.data(), &cs, wn + 1, NULL);
    if (wn_again == size_t(-1))
    {
        std::cout << "Error in mbsrtowcs(): " << errno << std::endl;
        return L"";
    }
    assert(cs == NULL); // successful conversion
    return std::wstring(buf.data(), wn);
}
And going back, making a narrow string from a wide string. I call the narrow string "locale string", because it is in a platform-dependent encoding depending on the current locale:
// Dummy overload
std::string get_locale_string(const std::string & s)
{
    return s;
}
// Real worker
std::string get_locale_string(const std::wstring & s)
{
    const wchar_t * cs = s.c_str();
    // First pass: a null destination only counts the bytes needed
    const size_t wn = std::wcsrtombs(NULL, &cs, 0, NULL);
    if (wn == size_t(-1))
    {
        std::cout << "Error in wcsrtombs(): " << errno << std::endl;
        return "";
    }
    // Second pass: convert into a buffer with room for the terminator
    std::vector<char> buf(wn + 1);
    const size_t wn_again = std::wcsrtombs(buf.data(), &cs, wn + 1, NULL);
    if (wn_again == size_t(-1))
    {
        std::cout << "Error in wcsrtombs(): " << errno << std::endl;
        return "";
    }
    assert(cs == NULL); // successful conversion
    return std::string(buf.data(), wn);
}
Some notes:
- If you don't have std::vector::data(), you can say &buf[0] instead.
- I've found that the r-style conversion functions mbsrtowcs and wcsrtombs don't work properly on Windows. There, you can use mbstowcs and wcstombs instead: mbstowcs(buf.data(), cs, wn + 1); and wcstombs(buf.data(), cs, wn + 1);
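As a rough illustration of the second note, here is one way the narrow-to-wide direction might look with std::mbstowcs. The function name get_wstring_mbstowcs is hypothetical, and instead of a separate counting pass this sketch sizes the buffer from the byte length of the input, which is an assumption of the sketch rather than the approach used above:

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical variant of get_wstring built on std::mbstowcs.
std::wstring get_wstring_mbstowcs(const std::string & s)
{
    // Every wide character consumes at least one input byte, so
    // s.size() + 1 elements leave room for the result and its terminator.
    std::vector<wchar_t> buf(s.size() + 1);
    const std::size_t n = std::mbstowcs(buf.data(), s.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
    {
        return std::wstring(); // invalid multibyte sequence in the current locale
    }
    return std::wstring(buf.data(), n); // n excludes the terminator
}
```

Over-allocating avoids relying on how a particular C library treats a null destination pointer, at the cost of some unused buffer space for multibyte input.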
In response to your question: to compare a narrow string with a wide string, convert both of them to wide strings and then compare those. If you are reading a file from disk which has a known encoding, you should use iconv() to convert its contents from that known encoding to wide characters (wchar_t) and then compare the result with the wide string.
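The comparison described above could be sketched as follows. The helper names to_wide and equal_mixed are hypothetical; the conversion uses the same two-pass mbsrtowcs approach as get_wstring, but reports failure through its return value instead of printing:

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: converts a narrow string in the current locale's
// encoding to a wide string. Returns false on an invalid byte sequence.
bool to_wide(const std::string & narrow, std::wstring & out)
{
    std::mbstate_t state = std::mbstate_t();
    const char * src = narrow.c_str();
    // First pass: count the wide characters needed (src is left unchanged).
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
    {
        return false;
    }
    // Second pass: convert into a buffer with room for the terminator.
    std::vector<wchar_t> buf(len + 1);
    state = std::mbstate_t();
    src = narrow.c_str();
    if (std::mbsrtowcs(buf.data(), &src, buf.size(), &state) == static_cast<std::size_t>(-1))
    {
        return false;
    }
    out.assign(buf.data(), len);
    return true;
}

// Hypothetical helper: true when the narrow string converts cleanly and
// yields exactly the same wide characters as `wide`.
bool equal_mixed(const std::string & narrow, const std::wstring & wide)
{
    std::wstring converted;
    return to_wide(narrow, converted) && converted == wide;
}
```

With these helpers, equal_mixed("Hello", L"Hello") holds even in the default "C" locale because the text is plain ASCII; for anything else, the result depends on the locale selected with std::setlocale at startup.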
Beware, though, that complex Unicode text may have multiple different representations as code point sequences which you may want to consider equal. If that is a possibility, you need to use a higher-level Unicode processing library (such as ICU) and normalize your strings to some common, comparable form.
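To make the caveat concrete, the small sketch below shows two wide strings that both render as "é" yet compare unequal, because one uses the precomposed code point U+00E9 and the other uses "e" followed by the combining acute accent U+0301. The names are hypothetical, and the sketch only demonstrates the problem; the normalization step itself would come from a library such as ICU and is not shown:

```cpp
#include <string>

// Two encodings of the same visible text "é" (names are illustrative only).
const std::wstring precomposed = L"\u00E9";   // single code point U+00E9
const std::wstring decomposed  = L"e\u0301";  // 'e' + combining acute U+0301

// Plain operator== compares code units one by one, with no normalization.
bool same_code_units(const std::wstring & a, const std::wstring & b)
{
    return a == b;
}
```

Here same_code_units(precomposed, decomposed) is false, which is why both strings would first need to be brought into a common normalization form before such a comparison means "same text".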