Lowercase of Unicode character
Question
I am working on a C++ project that needs to read data from Unicode text. My problem is that I cannot convert some Unicode characters to lowercase. I store the characters read from a Unicode file in wchar_t, and then I call _wcslwr to lowercase the wchar_t string. Many characters are still not lowercased, for example:
Đ Â Ă Ê Ô Ơ Ư Ấ Ắ Ế Ố Ớ Ứ Ầ Ằ Ề Ồ Ờ Ừ Ậ Ặ Ệ Ộ Ợ Ự
whose lowercase forms are:
đ â ă ê ô ơ ư ấ ắ ế ố ớ ứ ầ ằ ề ồ ờ ừ ậ ặ ệ ộ ợ ự
I have also tried tolower, but it still does not work.
Recommended answer
If you call plain tolower, you get the single-argument std::tolower(int) declared in <cctype> (not <clocale>). It only handles values representable as unsigned char, converted according to the current C locale, so it cannot lowercase wide characters such as Đ or Ấ.
The overload to use is the one declared in <locale>, which takes the locale as a second argument:
template< class charT >
charT tolower( charT ch, const locale& loc );
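As a minimal sketch of applying this overload to a whole string rather than one character at a time, the helper below lowercases a copy of a std::wstring. The name to_lower_copy is hypothetical and not part of the original answer; which characters actually change depends on the locale passed in.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper (not from the original answer): returns a lowercased
// copy of `text`, using the case-mapping rules of the given locale.
std::wstring to_lower_copy(std::wstring text, const std::locale& loc) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [&loc](wchar_t ch) { return std::tolower(ch, loc); });
    return text;
}
```

With std::locale::classic() only the ASCII letters are expected to change; a locale that covers Vietnamese is needed for the characters in the question.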
Here are two versions that work well:
#include <iostream>
#include <cwctype>
#include <clocale>
#include <algorithm>
#include <locale>

int main() {
    std::setlocale(LC_ALL, "");
    std::wstring data = L"Đ Â Ă Ê Ô Ơ Ư Ấ Ắ Ế Ố Ớ Ứ Ầ Ằ Ề Ồ Ờ Ừ Ậ Ặ Ệ Ộ Ợ Ự";
    std::wcout << data << std::endl;

    // C std::towlower
    for (auto c : data)
    {
        std::wcout << static_cast<wchar_t>(std::towlower(c));
    }
    std::wcout << std::endl;

    // C++ std::tolower(charT, std::locale)
    std::locale loc("");
    for (auto c : data)
    {
        // This is recommended
        std::wcout << std::tolower(c, loc);
    }
    std::wcout << std::endl;

    return 0;
}
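One caveat with the code above: std::locale("") picks up the user's environment locale, and the std::locale constructor throws std::runtime_error when the requested name is not available on the system. The sketch below shows one way to guard against that. The helper name locale_or_classic is hypothetical, and falling back to the classic "C" locale is an assumption about what the program should do in that case.

```cpp
#include <locale>
#include <stdexcept>

// Hypothetical helper (not from the original answer): tries to construct the
// named locale and falls back to the classic "C" locale if the platform
// cannot provide it.
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

For example, std::locale loc = locale_or_classic(""); could replace the direct construction. After a fallback to the classic locale, characters such as Đ are typically left unchanged, so it may be worth reporting the fallback to the user.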